The McGraw-Hill Companies
INTRODUCTION TO COMPUTING SYSTEMS: FROM BITS AND GATES TO C AND BEYOND SECOND EDITION
International Edition 2005
Exclusive rights by McGraw-Hill Education (Asia), for manufacture and export. This book cannot be re-exported from the country to which it is sold by McGraw-Hill. The International Edition is not available in North America.
Published by McGraw-Hill, a business unit of The McGraw-Hill Companies, Inc., 1221 Avenue of the Americas, New York, NY 10020. Copyright © 2004, 2001 by The McGraw-Hill Companies, Inc. All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written consent of The McGraw-Hill Companies, Inc., including, but not limited to, in any network or other electronic storage or transmission, or broadcast for distance learning.
Some ancillaries, including electronic and print components, may not be available to customers outside the United States.
10 09 08 07 06 05 04 03 02 01 20 09 08 07 06 05 04
CTF SEP
Cover images: ©Photodisc, AA048376 Green Abstract, AA003317 Circuit Board Detail
Library of Congress Control Number: 2003051002
When ordering this title, use ISBN 007-124501-4
Printed in Singapore
second edition
Introduction to Computing Systems
From Bits and Gates to C and Beyond
Yale N. Patt
The University of Texas at Austin
Sanjay J. Patel
University of Illinois at Urbana-Champaign
McGraw-Hill Higher Education
Boston Burr Ridge, IL Dubuque, IA Madison, WI New York San Francisco St. Louis Bangkok Bogota Caracas Kuala Lumpur Lisbon London Madrid Mexico City Milan Montreal New Delhi Santiago Seoul Singapore Sydney Taipei Toronto
To the memory of my parents,
Abraham Walter Patt A”H and Sarah Clara Patt A”H, who taught me to value “learning”
even before they taught me to ride a bicycle.
To Mira and her grandparents, Sharda Patel and Jeram Patel.
Contents
Preface xi
Preface to the First Edition xvii
1 Welcome Aboard 1
1.1 What We Will Try to Do 1
1.2 How We Will Get There 2
1.3 Two Recurring Themes 3
1.3.1 The Notion of Abstraction 3
1.3.2 Hardware versus Software 5
1.4 A Computer System 7
1.5 Two Very Important Ideas 9
1.6 Computers as Universal Computational Devices 9
1.7 How Do We Get the Electrons to Do the Work? 12
1.7.1 The Statement of the Problem 13
1.7.2 The Algorithm 13
1.7.3 The Program 14
1.7.4 The ISA 14
1.7.5 The Microarchitecture 15
1.7.6 The Logic Circuit 16
1.7.7 The Devices 16
1.7.8 Putting It Together 16
Exercises 17
2 Bits, Data Types, and Operations 21
2.1 Bits and Data Types 21
2.1.1 The Bit as the Unit of Information 21
2.1.2 Data Types 22
2.2 Integer Data Types 23
2.2.1 Unsigned Integers 23
2.2.2 Signed Integers 23
2.3 2’s Complement Integers 25
2.4 Binary-Decimal Conversion 27
2.4.1 Binary to Decimal Conversion 27
2.4.2 Decimal to Binary Conversion 28
2.5 Operations on Bits-Part I: Arithmetic 29
2.5.1 Addition and Subtraction 29
2.5.2 Sign-Extension 30
2.5.3 Overflow 31
2.6 Operations on Bits-Part II: Logical Operations 33
2.6.1 The AND Function 33
2.6.2 The OR Function 34
2.6.3 The NOT Function 35
2.6.4 The Exclusive-OR Function 35
2.7 Other Representations 36
2.7.1 The Bit Vector 36
2.7.2 Floating Point Data Type 37
2.7.3 ASCII Codes 40
2.7.4 Hexadecimal Notation 41
Exercises 43
3 Digital Logic Structures 51
3.1 The Transistor 51
3.2 Logic Gates 53
3.2.1 The NOT Gate (Inverter) 53
3.2.2 OR and NOR Gates 54
3.2.3 AND and NAND Gates 56
3.2.4 DeMorgan’s Law 58
3.2.5 Larger Gates 58
3.3 Combinational Logic Circuits 59
3.3.1 Decoder 59
3.3.2 Mux 60
3.3.3 Full Adder 61
3.3.4 The Programmable Logic Array (PLA) 63
3.3.5 Logical Completeness 64
3.4 Basic Storage Elements 64
3.4.1 The R-S Latch 64
3.4.2 The Gated D Latch 66
3.4.3 A Register 66
3.5 The Concept of Memory 67
3.5.1 Address Space 68
3.5.2 Addressability 68
3.5.3 A 2^2-by-3-Bit Memory 68
3.6 Sequential Logic Circuits 70
3.6.1 A Simple Example: The Combination Lock 71
3.6.2 The Concept of State 72
3.6.3 Finite State Machines 74
3.6.4 An Example: The Complete Implementation of a Finite State Machine 77
3.7 The Data Path of the LC-3 80
Exercises 82
4 The von Neumann Model 97
4.1 Basic Components 97
4.1.1 Memory 98
4.1.2 Processing Unit 99
4.1.3 Input and Output 100
4.1.4 Control Unit 100
4.2 The LC-3: An Example von Neumann Machine 101
4.3 Instruction Processing 103
4.3.1 The Instruction 103
4.3.2 The Instruction Cycle 104
4.4 Changing the Sequence of Execution 107
4.4.1 Control of the Instruction Cycle 108
4.5 Stopping the Computer 110
Exercises 111
5 The LC-3 115
5.1 The ISA: Overview 115
5.1.1 Memory Organization 116
5.1.2 Registers 116
5.1.3 The Instruction Set 117
5.1.4 Opcodes 117
5.1.5 Data Types 118
5.1.6 Addressing Modes 118
5.1.7 Condition Codes 120
5.2 Operate Instructions 120
5.3 Data Movement Instructions 123
5.3.1 PC-Relative Mode 124
5.3.2 Indirect Mode 125
5.3.3 Base+offset Mode 127
5.3.4 Immediate Mode 128
5.3.5 An Example 129
5.4 Control Instructions 130
5.4.1 Conditional Branches 131
5.4.2 An Example 132
5.4.3 Two Methods for Loop Control 135
5.4.4 Example: Adding a Column of Numbers Using a Sentinel 135
5.4.5 The JMP Instruction 136
5.4.6 The TRAP Instruction 137
5.5 Another Example: Counting Occurrences of a Character 138
5.6 The Data Path Revisited 141
5.6.1 Basic Components of the Data Path 141
5.6.2 The Instruction Cycle 144
Exercises 145
6 Programming 155
6.1 Problem Solving 155
6.1.1 Systematic Decomposition 155
6.1.2 The Three Constructs: Sequential, Conditional, Iterative 156
6.1.3 LC-3 Control Instructions to Implement the Three Constructs 157
6.1.4 The Character Count Example from Chapter 5, Revisited 158
6.2 Debugging 162
6.2.1 Debugging Operations 163
6.2.2 Examples: Use of the Interactive Debugger 164
Exercises 172
7 Assembly Language 177
7.1 Assembly Language Programming-Moving Up a Level 177
7.2 An Assembly Language Program 178
7.2.1 Instructions 179
7.2.2 Pseudo-ops (Assembler Directives) 182
7.2.3 Example: The Character Count Example of Section 5.5, Revisited 183
7.3 The Assembly Process 185
7.3.1 Introduction 185
7.3.2 A Two-Pass Process 185
7.3.3 The First Pass: Creating the Symbol Table 186
7.3.4 The Second Pass: Generating the Machine Language Program 187
7.4 Beyond the Assembly of a Single Assembly Language Program 188
7.4.1 The Executable Image 189
7.4.2 More than One Object File 189
Exercises 190
8 I/O 199
8.1 I/O Basics 199
8.1.1 Device Registers 199
8.1.2 Memory-Mapped I/O versus Special Input/Output Instructions 200
8.1.3 Asynchronous versus Synchronous 200
8.1.4 Interrupt-Driven versus Polling 202
8.2 Input from the Keyboard 202
8.2.1 Basic Input Registers (the KBDR and the KBSR) 202
8.2.2 The Basic Input Service Routine 202
8.2.3 Implementation of Memory-Mapped Input 203
8.3 Output to the Monitor 204
8.3.1 Basic Output Registers (the DDR and the DSR) 204
8.3.2 The Basic Output Service Routine 205
8.3.3 Implementation of Memory-Mapped Output 206
8.3.4 Example: Keyboard Echo 207
8.4 A More Sophisticated Input Routine 207
8.5 Interrupt-Driven I/O 209
8.5.1 What Is Interrupt-Driven I/O? 209
8.5.2 Why Have Interrupt-Driven I/O? 210
8.5.3 Generation of the Interrupt Signal 211
8.6 Implementation of Memory-Mapped I/O, Revisited 214
Exercises 215
9 TRAP Routines and Subroutines 219
9.1 LC-3 TRAP Routines 219
9.1.1 Introduction 219
9.1.2 The TRAP Mechanism 220
9.1.3 The TRAP Instruction 221
9.1.4 The Complete Mechanism 222
9.1.5 TRAP Routines for Handling I/O 225
9.1.6 TRAP Routine for Halting the Computer 225
9.1.7 Saving and Restoring Registers 229
9.2 Subroutines 230
9.2.1 The Call/Return Mechanism 230
9.2.2 The JSR(R) Instruction 232
9.2.3 The TRAP Routine for Character Input, Revisited 233
9.2.4 PUTS: Writing a Character String to the Monitor 235
9.2.5 Library Routines 235
Exercises 240
10 And, Finally ... The Stack 251
10.1 The Stack: Its Basic Structure 251
10.1.1 The Stack-An Abstract Data Type 251
10.1.2 Two Example Implementations 252
10.1.3 Implementation in Memory 253
10.1.4 The Complete Picture 257
10.2 Interrupt-Driven I/O (Part 2) 258
10.2.1 Initiate and Service the Interrupt 259
10.2.2 Return from the Interrupt 261
10.2.3 An Example 262
10.3 Arithmetic Using a Stack 264
10.3.1 The Stack as Temporary Storage 264
10.3.2 An Example 265
10.3.3 OpAdd, OpMult, and OpNeg 265
10.4 Data Type Conversion 272
10.4.1 Example: The Bogus Program: 2+3=e 272
10.4.2 ASCII to Binary 273
10.4.3 Binary to ASCII 276
10.5 Our Final Example: The Calculator 278
Exercises 283
11 Introduction to Programming in C 289
11.1 Our Objective 289
11.2 Bridging the Gap 290
11.3 Translating High-Level Language Programs 292
11.3.1 Interpretation 292
11.3.2 Compilation 293
11.3.3 Pros and Cons 293
11.4 The C Programming Language 293
11.4.1 The C Compiler 295
11.5 A Simple Example 297
11.5.1 The Function main 297
11.5.2 Formatting, Comments, and Style 299
11.5.3 The C Preprocessor 300
11.5.4 Input and Output 301
11.6 Summary 304
Exercises 305
12 Variables and Operators 307
12.1 Introduction 307
12.2 Variables 308
12.2.1 Three Basic Data Types: int, char, double 308
12.2.2 Choosing Identifiers 310
12.2.3 Scope: Local versus Global 311
12.2.4 More Examples 313
12.3 Operators 314
12.3.1 Expressions and Statements 315
12.3.2 The Assignment Operator 316
12.3.3 Arithmetic Operators 317
12.3.4 Order of Evaluation 318
12.3.5 Bitwise Operators 319
12.3.6 Relational Operators 320
12.3.7 Logical Operators 322
12.3.8 Increment/Decrement Operators 322
12.3.9 Expressions with Multiple Operators 324
12.4 Problem Solving Using Operators 324
12.5 Tying it All Together 326
12.5.1 Symbol Table 326
12.5.2 Allocating Space for Variables 328
12.5.3 A Comprehensive Example 331
12.6 Additional Topics 332
12.6.1 Variations of the Three Basic Types 332
12.6.2 Literals, Constants, and Symbolic Values 334
12.6.3 Storage Class 335
12.6.4 Additional C Operators 336
12.7 Summary 337
Exercises 338
13 Control Structures 343
13.1 Introduction 343
13.2 Conditional Constructs 344
13.2.1 The if Statement 344
13.2.2 The if-else Statement 347
13.3 Iteration Constructs 350
13.3.1 The while Statement 350
13.3.2 The for Statement 353
13.3.3 The do-while Statement 358
13.4 Problem Solving Using Control Structures 359
13.4.1 Problem 1: Approximating the Value of π 360
13.4.2 Problem 2: Finding Prime Numbers Less than 100 362
13.4.3 Problem 3: Analyzing an E-mail Address 366
13.5 Additional C Control Structures 368
13.5.1 The switch Statement 368
13.5.2 The break and continue Statements 370
13.5.3 An Example: Simple Calculator 370
13.6 Summary 372
Exercises 372
14 Functions 379
14.1 Introduction 379
14.2 Functions in C 380
14.2.1 A Function with a Parameter 380
14.2.2 Example: Area of a Ring 384
14.3 Implementing Functions in C 385
14.3.1 Run-Time Stack 385
14.3.2 Getting It All to Work 388
14.3.3 Tying It All Together 393
14.4 Problem Solving Using Functions 394
14.4.1 Problem 1: Case Conversion 395
14.4.2 Problem 2: Pythagorean Triples 397
14.5 Summary 398
Exercises 399
15 Testing and Debugging 407
15.1 Introduction 407
15.2 Types of Errors 408
15.2.1 Syntactic Errors 409
15.2.2 Semantic Errors 409
15.2.3 Algorithmic Errors 411
15.3 Testing 412
15.3.1 Black-Box Testing 412
15.3.2 White-Box Testing 413
15.4 Debugging 414
15.4.1 Ad Hoc Techniques 414
15.4.2 Source-Level Debuggers 415
15.5 Programming for Correctness 417
15.5.1 Nailing Down the Specification 417
15.5.2 Modular Design 418
15.5.3 Defensive Programming 418
15.6 Summary 419
Exercises 421
16 Pointers and Arrays 427
16.1 Introduction 427
16.2 Pointers 428
16.2.1 Declaring Pointer Variables 429
16.2.2 Pointer Operators 430
16.2.3 Passing a Reference Using Pointers 432
16.2.4 Null Pointers 433
16.2.5 Demystifying the Syntax 434
16.2.6 An Example Problem Involving Pointers 434
16.3 Arrays 436
16.3.1 Declaring and Using Arrays 436
16.3.2 Examples Using Arrays 438
16.3.3 Arrays as Parameters 440
16.3.4 Strings in C 441
16.3.5 The Relationship Between Arrays and Pointers in C 446
16.3.6 Problem Solving: Insertion Sort 446
16.3.7 Common Pitfalls with Arrays in C 449
16.4 Summary 451
Exercises 451
17 Recursion 457
17.1 Introduction 457
17.2 What Is Recursion? 458
17.3 Recursion versus Iteration 459
17.4 Towers of Hanoi 460
17.5 Fibonacci Numbers 464
17.6 Binary Search 468
17.7 Integer to ASCII 471
17.8 Summary 473
Exercises 473
18 I/O in C 481
18.1 Introduction 481
18.2 The C Standard Library 481
18.3 I/O, One Character at a Time 482
18.3.1 I/O Streams 482
18.3.2 putchar 483
18.3.3 getchar 483
18.3.4 Buffered I/O 483
18.4 Formatted I/O 485
18.4.1 printf 485
18.4.2 scanf 487
18.4.3 Variable Argument Lists 489
18.5 I/O from Files 491
18.6 Summary 493
Exercises 494
19 Data Structures 497
19.1 Introduction 497
19.2 Structures 498
19.2.1 typedef 500
19.2.2 Implementing Structures in C 501
19.3 Arrays of Structures 502
19.4 Dynamic Memory Allocation 504
19.4.1 Dynamically Sized Arrays 506
19.5 Linked Lists 508
19.5.1 An Example 510
19.6 Summary 516
Exercises 517
A The LC-3 ISA 521
A.1 Overview 521
A.2 Notation 523
A.3 The Instruction Set 523
A.4 Interrupt and Exception Processing 543
A.4.1 Interrupts 543
A.4.2 Exceptions 544
B From LC-3 to x86 547
B.1 LC-3 Features and Corresponding x86 Features 548
B.1.1 Instruction Set 548
B.1.2 Memory 553
B.1.3 Internal State 553
B.2 The Format and Specification of x86 Instructions 557
B.2.1 Prefix 558
B.2.2 Opcode 559
B.2.3 ModR/M Byte 559
B.2.4 SIB Byte 560
B.2.5 Displacement 560
B.2.6 Immediate 560
B.3 An Example 562
C The Microarchitecture of the LC-3 565
C.1 Overview 565
C.2 The State Machine 567
C.3 The Data Path 569
C.4 The Control Structure 569
C.5 Memory-Mapped I/O 575
C.6 Interrupt and Exception Control 576
C.6.1 Initiating an Interrupt 579
C.6.2 Returning from an Interrupt, RTI 581
C.6.3 The Illegal Opcode Exception 582
C.7 Control Store 583
D The C Programming Language 585
D.1 Overview 585
D.2 C Conventions 585
D.2.1 Source Files 585
D.2.2 Header Files 585
D.2.3 Comments 586
D.2.4 Literals 586
D.2.5 Formatting 588
D.2.6 Keywords 588
D.3 Types 589
D.3.1 Basic Data Types 589
D.3.2 Type Qualifiers 590
D.3.3 Storage Class 591
D.3.4 Derived Types 592
D.3.5 typedef 594
D.4 Declarations 595
D.4.1 Variable Declarations 595
D.4.2 Function Declarations 596
D.5 Operators 596
D.5.1 Assignment Operators 597
D.5.2 Arithmetic Operators 597
D.5.3 Bit-wise Operators 598
D.5.4 Logical Operators 598
D.5.5 Relational Operators 599
D.5.6 Increment/Decrement Operators 599
D.5.7 Conditional Expression 600
D.5.8 Pointer, Array, and Structure Operators 600
D.5.9 sizeof 601
D.5.10 Order of Evaluation 602
D.5.11 Type Conversions 602
D.6 Expressions and Statements 603
D.6.1 Expressions 603
D.6.2 Statements 604
D.7 Control 604
D.7.1 If 604
D.7.2 If-else 605
D.7.3 Switch 605
D.7.4 While 606
D.7.5 For 607
D.7.6 Do-while 607
D.7.7 Break 608
D.7.8 continue 608
D.7.9 return 609
D.8 The C Preprocessor 609
D.8.1 Macro substitution 609
D.8.2 File inclusion 610
D.9 Some Standard Library Functions 610
D.9.1 I/O Functions 611
D.9.2 String Functions 612
D.9.3 Math Functions 613
D.9.4 Utility Functions 613
E Useful Tables 615
E.1 Commonly Used Numerical Prefixes 615
E.2 Standard ASCII codes 616
E.3 Powers of 2 617
F Solutions to Selected Exercises 619
It is a pleasure to be writing a preface to the second edition of this book. Three years have passed since the first edition came out. We have received an enormous number of comments from students who have studied the material in the book and from instructors who have taught from it. Almost all have been very positive. It is gratifying to know that a lot of people agree with our approach, and that this agreement is based on real firsthand experience learning from it (in the case of students) or watching students learn from it (in the case of instructors). The excitement displayed in their e-mail continues to be a high for us.
However, as we said in the preface to the first edition, this book will always be a “work in progress.” Along with the accolades, we have received some good advice on how to make it better. We thank you for that. We have also each taught the course two more times since the first edition came out, and that, too, has improved our insights into what we think we did right and what needed improvement. The result has been a lot of changes in the second edition, while hopefully maintaining the essence of what we had before. How well we have succeeded we hope to soon learn from you.
Major Changes to the First Edition
The LC-3
One of the more obvious changes in the second edition is the replacement of the LC-2 with the LC-3. We insisted on keeping the basic concept of the LC-2: a rich ISA that can be described in a few pages, and hopefully mastered in a short time. We kept the 16-bit instruction and 4-bit opcode. One of our students pointed out that the subroutine return instruction (RET) was just a special case of LC-2’s JMPR instruction, so we eliminated RET as a separate opcode. The LC-3 specifies only 15 opcodes and leaves one for future use (perhaps the third edition!).
We received a lot of push-back on the PC-concatenate addressing mode, particularly for branches. The addressing mode had its roots in the old PDP-8 of the mid-1960s. A major problem with it comes up when an instruction on one page wants to reference a location on the next (or previous) page. This has been a major hassle, particularly for forward branches close to a page boundary. A lot of people have asked us to use the more modern PC+offset, and we agreed. We have replaced all uses of PC’offset with PC+SEXT(offset).
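The difference between the two addressing modes can be sketched in a few lines of C. This is only an illustration, not the specification: the 9-bit offset width (and hence the 512-word page) is an assumption made here, and the function names `pc_concat`, `sext9`, and `pc_relative` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field width: a 9-bit offset, so a "page" is 2^9 = 512 words. */
#define OFFSET_BITS 9u
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)   /* 0x01FF */

/* PC-concatenate (PC'offset): keep the upper PC bits, replace the low 9.
   The result can never leave the page that the PC is on. */
uint16_t pc_concat(uint16_t pc, uint16_t offset9)
{
    assert(offset9 <= OFFSET_MASK);
    return (uint16_t)((pc & ~OFFSET_MASK) | offset9);
}

/* Sign-extend a 9-bit field to a 16-bit signed value. */
int16_t sext9(uint16_t offset9)
{
    assert(offset9 <= OFFSET_MASK);
    /* If bit 8 is set the field is negative: subtract 2^9. */
    return (int16_t)((offset9 & 0x100u) ? (int)offset9 - 0x200
                                        : (int)offset9);
}

/* PC+SEXT(offset): the target is relative to the PC, so a page
   boundary between the instruction and its target does not matter. */
uint16_t pc_relative(uint16_t pc, uint16_t offset9)
{
    return (uint16_t)(pc + sext9(offset9));
}
```

With the PC at x31FE, `pc_concat(0x31FE, 4)` stays on the same page and yields x3004, while `pc_relative(0x31FE, 4)` crosses the boundary and yields x3202, which is the forward-branch case described above.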
We incorporated other changes in the LC-3. Stacks now grow toward 0, in keeping with current conventional practice. The offset in LDR/STR is now a signed value, so addresses can be computed plus or minus a base address. The opcode 1101 is not specified. The JSR/JMP opcodes have been reorganized slightly. Finally, we expanded the condition codes to a 16-bit processor status register (PSR) that includes a privilege mode and a priority level. As in the first edition, Appendix A specifies the LC-3 completely.
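Two of these changes, the signed LDR/STR offset and the stack that grows toward 0, can be illustrated with a small C sketch. The 6-bit offset width is an assumption made for this example, and `sext6`, `base_plus_offset`, and the `ToyStack` type are hypothetical names, not part of the LC-3 definition.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 6-bit field (assumed offset width) to 16 bits. */
int16_t sext6(uint16_t offset6)
{
    assert(offset6 <= 0x3Fu);
    return (int16_t)((offset6 & 0x20u) ? (int)offset6 - 0x40
                                       : (int)offset6);
}

/* Base+offset effective address: the signed offset lets the
   address fall above or below the base register's value. */
uint16_t base_plus_offset(uint16_t base, uint16_t offset6)
{
    return (uint16_t)(base + sext6(offset6));
}

/* A toy 16-word stack that grows toward address 0. */
typedef struct {
    uint16_t mem[16];
    uint16_t sp;        /* index of the top element; 16 means empty */
} ToyStack;

void toy_init(ToyStack *s) { s->sp = 16; }

/* Push: move the stack pointer toward 0, then store. */
void toy_push(ToyStack *s, uint16_t value)
{
    assert(s->sp > 0);              /* overflow check */
    s->mem[--s->sp] = value;
}

/* Pop: load, then move the stack pointer away from 0. */
uint16_t toy_pop(ToyStack *s)
{
    assert(s->sp < 16);             /* underflow check */
    return s->mem[s->sp++];
}
```

For instance, `base_plus_offset(0x4000, 0x3F)` treats the offset as -1 and gives x3FFF, and each `toy_push` lowers `sp` by one.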
Additional Material
Although no chapter in the book has remained untouched, some chapters have been changed more than others. We added discussions to Chapter 1 on the nature and importance of abstraction and the interplay of hardware and software because it became clear that these points needed to be made explicit. We added a full section to Chapter 3 on finite state control and its implementation as a sequential switching circuit because we believe the concept of state and finite state control are among the most important concepts a computer science or engineering student encounters. We feel it is also useful to the understanding of the von Neumann model of execution discussed in Chapter 4. We added a section to Chapter 4 giving a glimpse of the underlying microarchitecture of the LC-3, which is spelled out in all its detail in the overhauled Appendix C. We were told by more than one reader that Chapter 5 was too terse. We added little new material, but lots of figures and explanations that hopefully make the concepts clearer. We also added major new sections on interrupt-driven I/O to Chapters 8 and 10.
Just as in the first edition, Chapters 11 through 14 introduce the C programming language. Unlike the first edition, these chapters are more focused on the essential aspects of the language useful to a beginning programmer. Specialized features, for example the C switch construct, are relegated to the ends of the chapters (or to Appendix D), out of the main line of the text. All of these chapters include more examples than the first edition. The second edition also places a heavier emphasis on “how to program” via problem-solving examples that demonstrate how newly introduced C constructs can be used in C programming. In Chapter 14, students are exposed to a new LC-3 calling convention that more closely reflects the calling convention used by real systems. Chapter 15 contains a deeper treatment of testing and debugging. Based on our experiences teaching the introductory course, we have decided to swap the order of the chapter on recursion with the chapter on pointers and arrays. Moving recursion later (now Chapter 17) in the order of treatment allows students to gain more experience with basic programming concepts before they start programming recursive functions.
The Simulator
Brian Hartman has updated the simulator that runs on Windows to incorporate the changes to the LC-3. Ashley Wise has written an LC-3 simulator that runs on UNIX. Both have incorporated interrupt-driven I/O into the simulator’s functionality. We believe strongly that there is no substitute for hands-on practice testing one’s knowledge. With the addition of interrupt-driven I/O to the simulator, the student can now interrupt an executing program by typing a key on the keyboard and invoke an interrupt service routine.
Alternate Uses of the Book
We wrote the book as a textbook for a freshman introduction to computing. We strongly believe, as stated more completely in the preface to our first edition, that our motivated bottom-up approach is the best way for students to learn the fundamentals of computing. We have seen lots of evidence suggesting that, in general, students who understand the fundamentals of how the computer works are better able to grasp the material that they encounter later, including the high-level programming languages that they must work in, and that they can learn the rules of these programming languages with far less memorizing because everything makes sense. For us, the best use of the book is a one-semester freshman course for particularly motivated students, or a two-semester sequence where the pace is tempered. If you choose to go the route of a one-semester course heavy on high-level language programming, you probably want to leave out the material on sequential machines and interrupt-driven I/O. If you choose to go the one-semester route heavy on the first half of the book, you probably want to leave out much of Chapters 15, 17, 18, and 19.
We have also seen the book used effectively in each of the following environments:
Two Quarters, Freshman Course
In some sense this is the best use of the book. In the first quarter, Chapters 1 through 10 are covered; in the second quarter, Chapters 11 through 19. The pace is brisk, but the entire book can be covered in two academic quarters.
One-Semester Second Course
The book has been used successfully as a second course in computing, after the student has spent the first course with a high-level programming language. The rationale is that after exposure to high-level language programming in the first course, the second course should treat at an introductory level digital logic, basic computer organization, and assembly language programming. Most of the semester is spent on Chapters 1 through 10, with the last few weeks spent on a few topics from Chapters 11 through 19, showing how some of the magic from the students’ first course can actually be implemented. Functions, activation records, recursion, pointer variables, and some elementary data structures are typically the topics that get covered.
A Sophomore-Level Computer Organization Course
The book has been used to delve deeply into computer implementation in the sophomore year. The semester is spent in Chapters 1 through 10, sometimes culminating in a thorough study of Appendix C, which provides the complete microarchitecture of a microprogrammed LC-3. We note, however, that some very important ideas in computer architecture are not covered here, most notably cache memory, pipelining, and virtual memory. We agree that these topics are very important to the education of a computer scientist or computer engineer, but we feel these topics are better suited to a senior course in computer architecture and design. This book is not intended for that purpose.
Acknowledgments
Our book continues to benefit greatly from important contributions of many, many people. We particularly want to acknowledge Brian Hartman and Matt Starolis.
Brian Hartman continues to be a very important part of this work, both for the great positive energy he brings to the table and for his technical expertise. He is now out of school more than three years and remains committed to the concept. He took the course the first year it was offered at Michigan (Winter term, 1996), TAed it several times as an undergraduate student, and wrote the first LC-2 simulator for Windows while he was working on his master’s degree. He recently upgraded the Windows simulator to incorporate the new LC-3.
Matt Starolis took the freshman course at UT two years ago and TAed it as a junior last fall. He, too, has been very important to us getting out this second edition. He has been both critic of our writing and helpful designer of many of the figures. He also updated the tutorials for the simulators, which was necessary in order to incorporate the new characteristics of the LC-3. When something needed to be done, Matt volunteered to do it. His enthusiasm for the course and the book has been a pleasure.
With more than 100 adopters now, we regularly get enthusiastic e-mail with suggestions from professors from all over the world. Although we realize we have undoubtedly forgotten some, we would at least like to thank Professors Vijay Pai, Rice; Richard Johnson, Western New Mexico; Tore Larsen, Tromso; Greg Byrd, NC State; Walid Najjar, UC Riverside; Sean Joyce, Heidelberg College; James Boettler, South Carolina State; Steven Zeltmann, Arkansas; Mike McGregor, Alberta; David Lilja, Minnesota; Eric Thompson, Colorado, Denver; and Brad Hutchings, Brigham Young.
Between the two of us, we have taught the course four more times since the first edition came out, and that has produced a new enthusiastic group of believers, both TAs and students. Kathy Buckheit, Mustafa Erwa, Joseph Grzywacz, Chandresh Jain, Kevin Major, Onur Mutlu, Moinuddin Qureshi, Kapil Sachdeva, Russell Schreiber, Paroma Sen, Santhosh Srinath, Kameswar Subramaniam, David Thompson, Francis Tseng, Brian Ward, and Kevin Waley have all served as TAs and have demonstrated a commitment to helping students learn that can only be described as wonderful. Linda Bigelow, Matt Starolis, and Lester Guillory all took the course as freshmen, and two years later they were among the most enthusiastic TAs the course has known.
Ashley Wise developed the Linux version of the LC-3 simulator. Ajay Ladsaria ported the LCC compiler to generate LC-3 code. Gregory Muthler and Francesco Spadini enthusiastically provided critical feedback on drafts of the chapters in the second half. Brian Fahs provided solutions to the exercises.
Kathy Buckheit wrote introductory tutorials to help students use the LC-2 simulator because she felt it was necessary.
Several other faculty members at The University of Texas have used the book and shared their insights with us: Tony Ambler, Craig Chase, Mario Gonzalez, and Earl Swartzlander in ECE, and Doug Burger, Chris Edmundson, and Steve Keckler in CS. We thank them.
We continue to celebrate the commitment displayed by our editors, Betsy Jones and Michelle Flomenhoft.
As was the case with the first edition, our book has benefited from extensive reviews provided by faculty members from many universities. We thank Robert Crisp, Arkansas; Allen Tannenbaum, Georgia Tech; Nickolas Jovanovic, Arkansas-Little Rock; Dean Brock, North Carolina-Asheville; Amar Raheja, Cal State-Pomona; Dayton Clark, Brooklyn College; William Yurcik, Illinois State; Jose Delgado-Frias, Washington State; Peter Drexel, Plymouth State; Mahmoud Manzoul, Jackson State; Dan Connors, Colorado; Massoud Ghyam, Southern Cal; John Gray, UMass-Dartmouth; John Hamilton, Auburn; Alan Rosenthal, Toronto; and Ron Taylor, Wright State.
Finally, there are those who have contributed in many different and often unique ways. Without listing their individual contributions, we simply list them and say thank you: Amanda, Bryan, and Carissa Hwu, Mateo Valero, Rich Belgard, Janak Patel, Matthew Frank, Milena Milenkovic, Lila Rhoades, Bruce Shriver, Steve Lumetta, and Brian Evans. Sanjay would like to thank Ann Yeung for all her love and support.
A Final Word
It is worth repeating our final words from the preface to the first edition: We are mindful that the current version of this book will always be a work in progress, and we welcome your comments on any aspect of it. You can reach us by e-mail at patt@ece.utexas.edu and sjp@crhc.uiuc.edu. We hope you will.
Yale N. Patt Sanjay J. Patel May, 2003
Preface to the First Edition
This textbook has evolved from EECS 100, the first computing course for computer science, computer engineering, and electrical engineering majors at the University of Michigan, that Kevin Compton and the first author introduced for the first time in the fall term, 1995.
EECS 100 happened because Computer Science and Engineering faculty had been dissatisfied for many years with the lack of student comprehension of some very basic concepts. For example, students had a lot of trouble with pointer variables. Recursion seemed to be “magic,” beyond understanding.
We decided in 1993 that the conventional wisdom of starting with a high-level programming language, which was the way we (and most universities) were doing it, had its shortcomings. We decided that the reason students were not getting it was that they were forced to memorize technical details when they did not understand the basic underpinnings.
The result is the bottom-up approach taken in this book. We treat (in order) MOS transistors (very briefly, long enough for students to grasp their global switch-level behavior), logic gates, latches, logic structures (MUX, Decoder, Adder, gated latches), finally culminating in an implementation of memory. From there, we move on to the Von Neumann model of execution, then a simple computer (the LC-2), machine language programming of the LC-2, assembly language programming of the LC-2, the high level language C, recursion, pointers, arrays, and finally some elementary data structures.
We do not endorse today’s popular information hiding approach when it comes to learning. Information hiding is a useful productivity enhancement technique after one understands what is going on. But until one gets to that point, we insist that information hiding gets in the way of understanding. Thus, we continually build on what has gone before, so that nothing is magic, and everything can be tied to the foundation that has already been laid.
We should point out that we do not disagree with the notion of top-down design. On the contrary, we believe strongly that top-down design is correct design. But there is a clear difference between how one approaches a design problem (after one understands the underlying building blocks), and what it takes to get to the point where one does understand the building blocks. In short, we believe in top-down design, but bottom-up learning for understanding.
What Is in the Book
The book breaks down into two major segments, a) the underlying structure of a computer, as manifested in the LC-2; and b) programming in a high level language, in our case C.
The LC-2
We start with the underpinnings that are needed to understand the workings of a real computer. Chapter 2 introduces the bit and arithmetic and logical operations on bits. Then we begin to build the structure needed to understand the LC-2. Chapter 3 takes the student from a MOS transistor, step by step, to a real memory. Our real memory consists of 4 words of 3 bits each, rather than 64 megabytes. The picture fits on a single page (Figure 3.20), making it easy for a student to grasp. By the time the students get there, they have been exposed to all the elements that make memory work. Chapter 4 introduces the Von Neumann execution model, as a lead-in to Chapter 5, the LC-2.
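A software model of such a tiny memory shows what "4 words of 3 bits each" means in practice: two address bits select one of four locations, and each location holds three data bits. The sketch below is a behavioral illustration only, not the gate-level design of Figure 3.20, and the names `TinyMemory`, `mem_write`, and `mem_read` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_WORDS 4      /* 2^2 locations, so an address needs 2 bits */
#define WORD_MASK 0x7u   /* each word holds 3 bits */

typedef struct {
    uint8_t word[NUM_WORDS];
} TinyMemory;

/* Store the low 3 bits of data at the given address. */
void mem_write(TinyMemory *m, unsigned addr, unsigned data)
{
    assert(addr < NUM_WORDS);       /* only addresses 0..3 exist */
    m->word[addr] = (uint8_t)(data & WORD_MASK);
}

/* Return the 3-bit word stored at the given address. */
unsigned mem_read(const TinyMemory *m, unsigned addr)
{
    assert(addr < NUM_WORDS);
    return m->word[addr];
}
```

Writing a value wider than 3 bits keeps only the low 3 bits, which mirrors the fact that the hardware has only three storage cells per word.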
The LC-2 is a 16-bit architecture that includes physical I/0 via keyboard and monitor; TRAPs to the operating system for handling service calls; conditional branches on N, Z, and P condition codes; a subroutine call/return mechanism; a minimal set of operate instructions (ADD, AND, and NOT); and various address- ing modes for loads and stores (direct, indirect, Base+offset, and an immediate mode for loading effective addresses).
Chapter 6 is devoted to programming methodology (stepwise refinement) and debugging, and Chapter 7 is an introduction to assembly language programming. We have developed a simulator and an assembler for the LC-2. Actually, we have developed two simulators, one that runs on Windows platforms and one that runs on UNIX. The Windows simulator is available on the website and on the CD- ROM. Students who would rather use the UNIX version can download and install the software from the web at no charge.
Students use the simulator to test and debug programs written in LC-2 machine language and in LC-2 assembly language. The simulator allows online debugging (deposit, examine, single-step, set breakpoint, and so on). The sim- ulator can be used for simple LC-2 machine language and assembly language programming assignments, which are essential for students to master the concepts presented throughout the first 10 chapters.
Assembly language is taught, but not to train expert assembly language programmers. Indeed, if the purpose was to train assembly language programmers, the material would be presented in an upper-level course, not in an introductory course for freshmen. Rather, the material is presented in Chapter 7 because it is consistent with the paradigm of the book. In our bottom-up approach, by the time the student reaches Chapter 7, he/she can handle the process of transforming assembly language programs to sequences of 0s and 1s. We go through the process of assembly step-by-step for a very simple LC-2 Assembler. By hand assembling, the student (at a very small additional cost in time) reinforces the important fundamental concept of translation.
It is also the case that assembly language provides a user-friendly notation to describe machine instructions, something that is particularly useful for the second half of the book. Starting in Chapter 11, when we teach the semantics of C statements, it is far easier for the reader to deal with ADD R1, R2, R3 than with 0001001010000011.
Chapter 8 deals with physical input (from a keyboard) and output (to a monitor). Chapter 9 deals with TRAPs to the operating system, and subroutine calls and returns. Students study the operating system routines (written in LC-2 code) for carrying out physical I/O invoked by the TRAP instruction.
The first half of the book concludes with Chapter 10, a treatment of stacks and data conversion at the LC-2 level, and a comprehensive example that makes use of both. The example is the simulation of a calculator, which is implemented by a main program and 11 subroutines.
The Language C
From there, we move on to C. The C programming language occupies the second half of the book. By the time the student gets to C, he/she has an understanding of the layers below.
The C programming language fits very nicely with our bottom-up approach. Its low-level nature allows students to see clearly the connection between software and the underlying hardware. In this book we focus on basic concepts such as control structures, functions, and arrays. Once basic programming concepts are mastered, it is a short step for students to learn more advanced concepts such as objects and abstraction.
Each time a new construct in C is introduced, the student is shown the LC-2 code that a compiler would produce. We cover the basic constructs of C (variables, operators, control, and functions), pointers, recursion, arrays, structures, I/O, complex data structures, and dynamic allocation.
Chapter 11 is a gentle introduction to high-level programming languages. At this point, students have dealt heavily with assembly language and can understand the motivation behind what high-level programming languages provide. Chapter 11 also contains a simple C program, which we use to kick-start the process of learning C.
Chapter 12 deals with values, variables, constants, and operators. Chapter 13 introduces C control structures. We provide many complete program examples to give students a sample of how each of these concepts is used in practice. LC-2 code is used to demonstrate how each C construct affects the machine at the lower levels.
In Chapter 14, students are exposed to techniques for debugging high-level source code. Chapter 15 introduces functions in C. Students are not merely exposed to the syntax of functions. Rather, they learn how functions are actually executed using a run-time stack. A number of examples are provided.
Chapter 16 teaches recursion, using the student’s newly gained knowledge of functions, activation records, and the run-time stack. Chapter 17 teaches pointers and arrays, relying heavily on the student’s understanding of how memory is organized. Chapter 18 introduces the details of I/O functions in C, in particular, streams, variable length argument lists, and how C I/O is affected by the various format specifications. This chapter relies on the student’s earlier exposure to physical I/O in Chapter 8. Chapter 19 concludes the coverage of C with structures, dynamic memory allocation, and linked lists.
Along the way, we have tried to emphasize good programming style and coding methodology by means of examples. Novice programmers probably learn at least as much from the programming examples they read as from the rules they are forced to study. Insights that accompany these examples are highlighted by means of lightbulb icons that are included in the margins.
We have found that the concept of pointer variables (Chapter 17) is not at all a problem. By the time students encounter it, they have a good understanding of what memory is all about, since they have analyzed the logic design of a small memory (Chapter 3). They know the difference, for example, between a memory location’s address and the data stored there.
Recursion ceases to be magic since, by the time a student gets to that point (Chapter 16), he/she has already encountered all the underpinnings. Students understand how stacks work at the machine level (Chapter 10), and they understand the call/return mechanism from their LC-2 machine language programming experience, and the need for linkages between a called program and the return to the caller (Chapter 9). From this foundation, it is not a large step to explain functions by introducing run-time activation records (Chapter 15), with a lot of the mystery about argument passing, dynamic declarations, and so on, going away. Since a function can call a function, it is one additional small step (certainly no magic involved) for a function to call itself.
How to Use This Book
We have discovered over the past two years that there are many ways the material in this book can be presented in class effectively. We suggest six presentations below:
1. The Michigan model. First course, no formal prerequisites. Very intensive, this course covers the entire book. We have found that with talented, very highly motivated students, this works best.
2. Normal usage. First course, no prerequisites. This course is also intensive, although less so. It covers most of the book, leaving out Sections 10.3 and 10.4 of Chapter 10, Chapters 16 (recursion), 18 (the details of C I/O), and 19 (data structures).
3. Second course. Several schools have successfully used the book in their second course, after the students have been exposed to programming with an object-oriented programming language in a milder first course. In this second course, the entire book is covered, spending the first two-thirds of the semester on the first 10 chapters, and the last one-third of the semester on the second half of the book. The second half of the book can move more quickly, given that it follows both Chapters 1-10 and the introductory programming course, which the student has already taken. Since students have experience with programming, lengthier programming projects can be assigned. This model allows students who were introduced to programming via an object-oriented language to pick up C, which they will certainly need if they plan to go on to advanced software courses such as operating systems.
4. Two quarters. An excellent use of the book. No prerequisites; the entire book can be covered easily in two quarters, the first quarter for Chapters 1-10, the second quarter for Chapters 11-19.
5. Two semesters. Perhaps the optimal use of the book. A two-semester sequence for freshmen. No formal prerequisites. First semester, Chapters 1-10, with supplemental material from Appendix C, the Microarchitecture of the LC-2. Second semester, Chapters 11-19 with additional substantial programming projects so that the students can solidify the concepts they learn in lectures.
6. A sophomore course in computer hardware. Some universities have found the book useful for a sophomore level breadth-first survey of computer hardware. They wish to introduce students in one semester to number systems, digital logic, computer organization, machine language and assembly language programming, finishing up with the material on stacks, activation records, recursion, and linked lists. The idea is to tie the hardware knowledge the students have acquired in the first part of the course to some of the harder to understand concepts that they struggled with in their freshman programming course. We strongly believe the better paradigm is to study the material in this book before tackling an object-oriented language. Nonetheless, we have seen this approach used successfully, where the sophomore student gets to understand the concepts in this course, after struggling with them during the freshman year.
Some Observations
Understanding, Not Memorizing
Since the course builds from the bottom up, we have found that less memorization of seemingly arbitrary rules is required than in traditional programming courses. Students understand that the rules make sense since by the time a topic is taught, they have an awareness of how that topic is implemented at the levels below it. This approach is good preparation for later courses in design, where understanding of and insights gained from fundamental underpinnings are essential to making the required design tradeoffs.
The Student Debugs the Student’s Program
We hear complaints from industry all the time about CS graduates not being able to program. Part of the problem is the helpful teaching assistant, who contributes far too much of the intellectual component of the student’s program, so the student never has to really master the art. Our approach is to push the student to do the job without the teaching assistant (TA). Part of this comes from the bottom-up approach where memorizing is minimized and the student builds on what he/she already knows. Part of this is the simulator, which the student uses from day one. The student is taught debugging from the beginning and is required to use the debugging tools of the simulator to get his/her programs to work from the very beginning. The combination of the simulator and the order in which the subject material is taught results in students actually debugging their own programs instead of taking their programs to the TA for help . . . and the common result that the TAs end up writing the programs for the students.
Preparation for the Future: Cutting Through Protective Layers
In today’s real world, professionals who use computers in systems but remain ignorant of what is going on underneath are likely to discover the hard way that the effectiveness of their solutions is impacted adversely by things other than the actual programs they write. This is true for the sophisticated computer programmer as well as the sophisticated engineer.
Serious programmers will write more efficient code if they understand what is going on beyond the statements in their high-level language. Engineers, and not just computer engineers, are having to interact with their computer systems today more and more at the device or pin level. In systems where the computer is being used to sample data from some metering device such as a weather meter or feedback control system, the engineer needs to know more than just how to program in FORTRAN. This is true of mechanical, chemical, and aeronautical engineers today, not just electrical engineers. Consequently, the high-level programming language course, where the compiler protects the student from everything “ugly” underneath, does not serve most engineering students well, and certainly does not prepare them for the future.
Rippling Effects Through the Curriculum
The material of this text clearly has a rippling effect on what can be taught in subsequent courses. Subsequent programming courses can assume not only that the students know the syntax of C but also that they understand how it relates to the underlying architecture. Consequently, the focus can be on problem solving and more sophisticated data structures. On the hardware side, a similar effect is seen in courses in digital logic design and in computer organization. Students start the logic design course with an appreciation of what the logic circuits they master are good for. In the computer organization course, the starting point is much further along than when students are seeing the term Program Counter for the first time. Michigan faculty members in the follow-on courses have noticed substantial improvement in students’ comprehension, compared to what they saw before students took EECS 100.
Acknowledgments
This book has benefited greatly from important contributions of many, many people. At the risk of leaving out some, we would at least like to acknowledge the following.
First, Professor Kevin Compton. Kevin believed in the concept of the book since it was first introduced at a curriculum committee meeting that he chaired at Michigan in 1993. The book grew out of a course (EECS 100) that he and the first author developed together, and co-taught the first three semesters it was offered at Michigan in fall 1995, winter 1996, and fall 1996. Kevin’s insights into programming methodology (independent of the syntax of the particular language) provided a sound foundation for the beginning student. The course at Michigan and this book would be a lot less were it not for Kevin’s influence.
Several other students and faculty at Michigan were involved in the early years of EECS 100 and the early stages of the book. We are particularly grateful for the help of Professor David Kieras, Brian Hartman, David Armstrong, Matt Postiff, Dan Friendly, Rob Chappell, David Cybulski, Sangwook Kim, Don Winsor, and Ann Ford.
We also benefited enormously from TAs who were committed to helping students learn. The focus was always on how to explain the concept so the student gets it. We acknowledge, in particular, Fadi Aloul, David Armstrong, David Baker, Rob Chappell, David Cybulski, Amolika Gurujee, Brian Hartman, Sangwook Kim, Steve Maciejewski, Paul Racunas, David Telehowski, Francis Tseng, Aaron Wagner, and Paul Watkins.
We were delighted with the response from the publishing world to our manuscript. We ultimately decided on McGraw-Hill in large part because of the editor, Betsy Jones. Once she checked us out, she became a strong believer in what we are trying to accomplish. Throughout the process, her commitment and energy level have been greatly appreciated. We also appreciate what Michelle Flomenhoft has brought to the project. It has been a pleasure to work with her.
Our book has benefited from extensive reviews provided by faculty members at many universities. We gratefully acknowledge reviews provided by Carl D. Crane III, Florida, Nat Davis, Virginia Tech, Renee Elio, University of Alberta, Kelly Flangan, BYU, George Friedman, UIUC, Franco Fummi, Universita di Verona, Dale Grit, Colorado State, Thor Guisrud, Stavanger College, Brad Hutchings, BYU, Dave Kaeli, Northeastern, Rasool Kenarangui, UT at Arlington, Joel Kraft, Case Western Reserve, Wei-Ming Lin, UT at San Antonio, Roderick Loss, Montgomery College, Ron Meleshko, Grant MacEwan Community College, Andreas Moshovos, Northwestern, Tom Murphy, The Citadel, Murali Narayanan, Kansas State, Carla Purdy, Cincinnati, T. N. Rajashekhara, Camden County College, Nello Scarabottolo, Universita degli Studi di Milano, Robert Schaefer, Daniel Webster College, Tage Stabell-Kuloe, University of Tromsoe, Jean-Pierre Steger, Burgdorf School of Engineering, Bill Sverdlik, Eastern Michigan, John Trono, St. Michael’s College, Murali Varansi, University of South Florida, Montanez Wade, Tennessee State, and Carl Wick, US Naval Academy.
In addition to all these people, there were others who contributed in many different and sometimes unique ways. Space dictates that we simply list them and say thank you. Susan Kornfield, Ed DeFranco, Evan Gsell, Rich Belgard, Tom Conte, Dave Nagle, Bruce Shriver, Bill Sayle, Steve Lumetta, Dharma Agarwal, David Lilja, and Michelle Chapman.
Finally, if you will indulge the first author a bit: This book is about developing a strong foundation in the fundamentals with the fervent belief that once that is accomplished, students can go as far as their talent and energy can take them. This objective was instilled in me by the professor who taught me how to be a professor, Professor William K. Linvill. It has been more than 35 years since I was in his classroom, but I still treasure the example he set.
A Final Word
We hope you will enjoy the approach taken in this book. Nonetheless, we are mindful that the current version will always be a work in progress, and both of us welcome your comments on any aspect of it. You can reach us by email at patt@ece.utexas.edu and sjp@crhc.uiuc.edu. We hope you will.
Yale N. Patt Sanjay J. Patel March, 2000
Welcome Aboard
1.1 What We Will Try to Do
Welcome to From Bits and Gates to C and Beyond. Our intent is to introduce you, over the next 632 pages, to the world of computing. As we do so, we have one objective above all others: to show you very clearly that there is no magic to computing. The computer is a deterministic system: every time we hit it over the head in the same way and in the same place (provided, of course, it was in the same starting condition), we get the same response. The computer is not an electronic genius; on the contrary, if anything, it is an electronic idiot, doing exactly what we tell it to do. It has no mind of its own.
What appears to be a very complex organism is really just a huge, systematically interconnected collection of very simple parts. Our job throughout this book is to introduce you to those very simple parts, and, step-by-step, build the interconnected structure that you know by the name computer. Like a house, we will start at the bottom, construct the foundation first, and then go on to add layers and layers, as we get closer and closer to what most people know as a full-blown computer. Each time we add a layer, we will explain what we are doing, tying the new ideas to the underlying fabric. Our goal is that when we are done, you will be able to write programs in a computer language such as C, using the sophisticated features of that language, and understand what is going on underneath, inside the computer.
1.2 How We Will Get There
We will start (in Chapter 2) by noting that the computer is a piece of electronic equipment and, as such, consists of electronic parts interconnected by wires. Every wire in the computer, at every moment in time, is either at a high voltage or a low voltage. We do not differentiate exactly how high. For example, we do not distinguish voltages of 115 volts from voltages of 118 volts. We only care whether there is or is not a large voltage relative to 0 volts. That absence or presence of a large voltage relative to 0 volts is represented as 0 or 1.
We will encode all information as sequences of 0s and 1s. For example, one encoding of the letter a that is commonly used is the sequence 01100001. One encoding of the decimal number 35 is the sequence 00100011. We will see how to perform operations on such encoded information.
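The two encodings just mentioned can be checked with a few lines of C. This is only a sketch: the function name to_bits8 is made up for this illustration, and it assumes the machine uses the ASCII character code, which is where the pattern 01100001 for the letter a comes from.

```c
/*
 * Write the low 8 bits of value into out as the characters '0' and '1',
 * most significant bit first. out must have room for 9 chars
 * (8 digits plus the terminating '\0').
 * to_bits8 is a hypothetical helper name, not something from the text.
 */
void to_bits8(unsigned value, char out[9])
{
    for (int i = 0; i < 8; i++) {
        /* Bit 7 goes to out[0], bit 0 goes to out[7]. */
        out[i] = ((value >> (7 - i)) & 1u) ? '1' : '0';
    }
    out[8] = '\0';
}
```

Calling to_bits8('a', buf) on an ASCII system leaves the string 01100001 in buf, and to_bits8(35, buf) leaves 00100011, matching the two sequences quoted above.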
Once we are comfortable with information represented as codes made up of 0s and 1s and operations (addition, for example) being performed on these representations, we will begin the process of showing how a computer works. In Chapter 3, we will see how the transistors that make up today’s microprocessors work. We will further see how those transistors are combined into larger structures that perform operations, such as addition, and into structures that allow us to save information for later use. In Chapter 4, we will combine these larger structures into the Von Neumann machine, a basic model that describes how a computer works. In Chapter 5, we will begin to study a simple computer, the LC-3. LC-3 stands for Little Computer 3; we started with LC-1 but needed two more shots at it before we got it right! The LC-3 has all the important characteristics of the microprocessors that you may have already heard of, for example, the Intel 8088, which was used in the first IBM PCs back in 1981. Or the Motorola 68000, which was used in the Macintosh, vintage 1984. Or the Pentium IV, one of the high-performance microprocessors of choice in the PC of the year 2003. That is, the LC-3 has all the important characteristics of these “real” microprocessors, without being so complicated that it gets in the way of your understanding.
Once we understand how the LC-3 works, the next step is to program it, first in its own language (Chapter 6), then in a language called assembly language that is a little bit easier for humans to work with (Chapter 7). Chapter 8 deals with the problem of getting information into (input) and out of (output) the LC-3. Chapter 9 covers two sophisticated LC-3 mechanisms, TRAPs and subroutines.
We conclude our introduction to programming the LC-3 in Chapter 10 by first introducing two important concepts (stacks and data conversion), and then by showing a sophisticated example: an LC-3 program that carries out the work of a handheld calculator.
In the second half of the book (Chapters 11-19), we turn our attention to a high-level programming language, C. We include many aspects of C that are usually not dealt with in an introductory textbook. In almost all cases, we try to tie high-level C constructs to the underlying LC-3, so that you will understand what you demand of the computer when you use a particular construct in a C program.
Our treatment of C starts with basic topics such as variables and operators (Chapter 12), control structures (Chapter 13), and functions (Chapter 14). We then move on to the more advanced topics of debugging C programs (Chapter 15), recursion (Chapter 16), and pointers and arrays (Chapter 17).
We conclude our introduction to C by examining two very common high-level constructs, input/output in C (Chapter 18) and the linked list (Chapter 19).
1.3 Two Recurring Themes
Two themes permeate this book that we have previously taken for granted, assuming that everyone recognized their value and regularly emphasized them to students of engineering and computer science. Lately, it has become clear to us that from the git-go, we need to make these points explicit. So, we state them here up front. The two themes are (a) the notion of abstraction and (b) the importance of not separating in your mind the notions of hardware and software. Their value to your development as an effective engineer or computer scientist goes well beyond your understanding of how a computer works and how to program it.
The notion of abstraction is central to all that you will learn and expect to use in practicing your craft, whether it be in mathematics, physics, any aspect of engineering, or business. It is hard to think of any body of knowledge where the notion of abstraction is not central. The misguided hardware/software separation is directly related to your continuing study of computers and your work with them. We will discuss each in turn.
1.3.1 The Notion of Abstraction
The use of abstraction is all around us. When we get in a taxi and tell the driver, “Take me to the airport,” we are using abstraction. If we had to, we could probably direct the driver each step of the way: “Go down this street ten blocks, and make a left turn.” And, when he got there, “Now take this street five blocks and make a right turn.” And on and on. You know the details, but it is a lot quicker to just tell the driver to take you to the airport.
Even the statement “Go down this street ten blocks …” can be broken down further with instructions on using the accelerator, the steering wheel, watching out for other vehicles, pedestrians, etc.
Our ability to abstract is very much a productivity enhancer. It allows us to deal with a situation at a higher level, focusing on the essential aspects, while keeping the component ideas in the background. It allows us to be more efficient in our use of time and brain activity. It allows us to not get bogged down in the detail when everything about the detail is working just fine.
There is an underlying assumption to this, however: “when everything about the detail is just fine.” What if everything about the detail is not just fine? Then, to be successful, our ability to abstract must be combined with our ability to un-abstract. Some people use the word deconstruct: the ability to go from the abstraction back to its component parts.
Two stories come to mind.
The first involves a trip through Arizona the first author made a long time ago in the hottest part of the summer. At the time I was living in Palo Alto, California, where the temperature tends to be mild almost always. I knew enough to take the car to a mechanic before making the trip, and I told him to check the cooling system. That was the abstraction: cooling system. What I had not mastered was that the capability of a cooling system for Palo Alto, California is not the same as the capability of a cooling system for the summer deserts of Arizona. The result: two days in Deer Lodge, Arizona (population 3), waiting for a head gasket to be shipped in.
The second story (perhaps apocryphal) is supposed to have happened during the infancy of electric power generation. General Electric Co. was having trouble with one of its huge electric power generators and did not know what to do. On the front of the generator were lots of dials containing lots of information, and lots of screws that could be rotated clockwise or counterclockwise as the operator wished. Something on the other side of the wall of dials and screws was malfunctioning and no one knew what to do. So, as the story goes, they called in one of the early giants in the electric power industry. He looked at the dials and listened to the noises for a minute, then took a small pocket screwdriver out of his geek pack and rotated one screw 35 degrees counterclockwise. The problem immediately went away. He submitted a bill for $1,000 (a lot of money in those days) without any elaboration. The controller found the bill for two minutes’ work a little unsettling, and asked for further clarification. Back came the new bill:
Turning a screw 35 degrees counterclockwise: $0.75
Knowing which screw to turn and by how much: $999.25
In both stories the message is the same. It is more efficient to think of entities as abstractions. One does not want to get bogged down in details unnecessarily. And as long as nothing untoward happens, we are OK. If I had never tried to make the trip to Arizona, the abstraction “cooling system” would have been sufficient. If the electric power generator never malfunctioned, there would have been no need for the power engineering guru’s deeper understanding.
When one designs a logic circuit out of gates, it is much more efficient to not have to think about the internals of each gate. To do so would slow down the process of designing the logic circuit. One wants to think of the gate as a component. But if there is a problem with getting the logic circuit to work, it is often helpful to look at the internal structure of the gate and see if something about its functioning is causing the problem.
When one designs a sophisticated computer application program, whether it be a new spreadsheet program, word processing system, or computer game, one wants to think of each of the components one is using as an abstraction. If one spent time thinking about the details of a component when it is not necessary, the distraction could easily prevent the total job from ever getting finished. But when there is a problem putting the components together, it is often useful to examine carefully the details of each component in order to uncover the problem.
The ability to abstract is a most important skill. In our view, one should try to keep the level of abstraction as high as possible, consistent with getting everything to work effectively. Our approach in this book is to continually raise the level of abstraction. We describe logic gates in terms of transistors. Once we understand the abstraction of gates, we no longer think in terms of transistors. Then we build larger structures out of gates. Once we understand these larger abstractions, we no longer think in terms of gates.
The Bottom Line
Abstractions allow us to be much more efficient in dealing with all kinds of situations. It is also true that one can be effective without understanding what is below the abstraction as long as everything behaves nicely. So, one should not pooh-pooh the notion of abstraction. On the contrary, one should celebrate it since it allows us to be more efficient.
In fact, if we never have to combine a component with anything else into a larger system, and if nothing can go wrong with the component, then it is perfectly fine to understand this component only at the level of its abstraction.
But if we have to combine multiple components into a larger system, we should be careful not to allow their abstractions to be the deepest level of our understanding. If we don’t know the components below the level of their abstractions, then we are at the mercy of them working together without our intervention. If they don’t work together, and we are unable to go below the level of abstraction, we are stuck. And that is the state we should take care not to find ourselves in.
1.3.2 Hardware versus Software
Many computer scientists and engineers refer to themselves as hardware people or software people. By hardware, they generally mean the physical computer and all the specifications associated with it. By software, they generally mean the pr…
2.4.2 Decimal to Binary Conversion
Converting from decimal to 2’s complement is a little more complicated. The crux of the method is to note that a positive binary number is odd if the rightmost digit is 1 and even if the rightmost digit is 0.
Consider again our generic eight-bit representation:
$$a_7 \cdot 2^7 + a_6 \cdot 2^6 + a_5 \cdot 2^5 + a_4 \cdot 2^4 + a_3 \cdot 2^3 + a_2 \cdot 2^2 + a_1 \cdot 2^1 + a_0 \cdot 2^0$$
We can illustrate the conversion best by first working through an example. Suppose we wish to convert the value +105 to a 2’s complement binary code. We note that +105 is positive. We first find values for $a_i$, representing the magnitude 105. Since the value is positive, we will then obtain the 2’s complement result by simply appending $a_7$, which we know is 0.
Our first step is to find values for $a_i$ that satisfy the following:
$$105 = a_6 \cdot 2^6 + a_5 \cdot 2^5 + a_4 \cdot 2^4 + a_3 \cdot 2^3 + a_2 \cdot 2^2 + a_1 \cdot 2^1 + a_0 \cdot 2^0$$
Since 105 is odd, we know that $a_0$ is 1. We subtract 1 from both sides of the equation, yielding
$$104 = a_6 \cdot 2^6 + a_5 \cdot 2^5 + a_4 \cdot 2^4 + a_3 \cdot 2^3 + a_2 \cdot 2^2 + a_1 \cdot 2^1$$
We next divide both sides of the equation by 2, yielding
$$52 = a_6 \cdot 2^5 + a_5 \cdot 2^4 + a_4 \cdot 2^3 + a_3 \cdot 2^2 + a_2 \cdot 2^1 + a_1 \cdot 2^0$$
We note that 52 is even, so a1, the only coefficient not multiplied by a power of 2, must be equal to 0.
We now iterate the process, each time subtracting the rightmost digit from both sides of the equation, then dividing both sides by 2, and finally noting whether the new decimal number on the left side is odd or even. Starting where we left off, with
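$$52 = a_6 \cdot 2^5 + a_5 \cdot 2^4 + a_4 \cdot 2^3 + a_3 \cdot 2^2 + a_2 \cdot 2^1$$

the process produces, in turn:

$$26 = a_6 \cdot 2^4 + a_5 \cdot 2^3 + a_4 \cdot 2^2 + a_3 \cdot 2^1 + a_2 \cdot 2^0$$

Therefore, $a_2 = 0$.

$$13 = a_6 \cdot 2^3 + a_5 \cdot 2^2 + a_4 \cdot 2^1 + a_3 \cdot 2^0$$

Therefore, $a_3 = 1$.

$$6 = a_6 \cdot 2^2 + a_5 \cdot 2^1 + a_4 \cdot 2^0$$

Therefore, $a_4 = 0$.

$$3 = a_6 \cdot 2^1 + a_5 \cdot 2^0$$

Therefore, $a_5 = 1$.

$$1 = a_6 \cdot 2^0$$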
Therefore, $a_6 = 1$, and we are done. The binary representation is 01101001.
Let’s summarize the process. If we are given a decimal integer value N, we construct the 2’s complement representation as follows:
1. We first obtain the binary representation of the magnitude of N by forming the equation

   $$N = a_6 \cdot 2^6 + a_5 \cdot 2^5 + a_4 \cdot 2^4 + a_3 \cdot 2^3 + a_2 \cdot 2^2 + a_1 \cdot 2^1 + a_0 \cdot 2^0$$

   and repeating the following, until the left side of the equation is 0:
   a. If N is odd, the rightmost bit is 1. If N is even, the rightmost bit is 0.
   b. Subtract 1 or 0 (according to whether N is odd or even) from N, remove the least significant term from the right side, and divide both sides of the equation by 2.
   Each iteration produces the value of one coefficient $a_i$.
2. If the original decimal number is positive, append a leading 0 sign bit, and you are done.
3. If the original decimal number is negative, append a leading 0 and then form the negative of this 2’s complement representation, and then you are done.
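The three steps above can be sketched in Python. This is only an illustration, not code from the book: the function name `to_twos_complement` and the `bits` parameter are assumptions introduced here, and negation is done by complementing and adding 1.

```python
def to_twos_complement(n, bits=8):
    """Return the 2's complement code of integer n as a string of `bits` digits."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise ValueError(f"{n} does not fit in {bits} bits")
    magnitude = abs(n)
    digits = []
    # Step 1: note whether the value is odd (bit 1) or even (bit 0),
    # subtract that bit, and divide by 2, until nothing is left.
    while magnitude > 0:
        bit = magnitude % 2
        digits.append(str(bit))
        magnitude = (magnitude - bit) // 2
    # Step 2: the bits came out right to left; pad with leading 0s,
    # the leftmost of which is the sign bit of the positive code.
    code = "".join(reversed(digits)).rjust(bits, "0")
    if n >= 0:
        return code
    # Step 3: for a negative number, complement every bit and add 1.
    flipped = "".join("1" if c == "0" else "0" for c in code)
    return format((int(flipped, 2) + 1) % (1 << bits), f"0{bits}b")


print(to_twos_complement(105))   # the worked example: 01101001
print(to_twos_complement(-105))
```

The loop mirrors steps a and b directly, so each pass produces one coefficient, starting with the rightmost.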
2.5 Operations on Bits-Part I: Arithmetic
2.5.1 Addition and Subtraction
Arithmetic on 2’s complement numbers is very much like the arithmetic on decimal numbers that you have been doing for a long time.
Addition still proceeds from right to left, one digit at a time. At each point, we generate a sum digit and a carry. Instead of generating a carry after 9 (since 9 is the largest decimal digit), we generate a carry after 1 (since 1 is the largest binary digit).
chapter 2 Bits, Data Types, and Operations
Subtraction is simply addition, preceded by determining the negative of the number to be subtracted. That is, A - B is simply A + (-B).
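A small Python sketch of these two rules, assuming bit patterns are held as equal-length strings of 0s and 1s. The helper names `add_bits`, `negate`, and `subtract` are made up for this illustration.

```python
def add_bits(a, b):
    """Add two equal-length bit strings right to left; the final carry is dropped."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    carry = 0
    out = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        out.append(str(total % 2))  # sum digit for this column
        carry = total // 2          # carry into the next column
    return "".join(reversed(out))


def negate(a):
    """Form the 2's complement negative: complement every bit, then add 1."""
    flipped = "".join("1" if c == "0" else "0" for c in a)
    return add_bits(flipped, "0" * (len(a) - 1) + "1")


def subtract(a, b):
    """A - B computed as A + (-B)."""
    return add_bits(a, negate(b))


print(add_bits("01011", "00011"))  # 11 + 3
print(subtract("01110", "01001"))  # 14 - 9
```

Both operations reuse the same column-by-column adder, which is the point of treating subtraction as addition of the negative.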
2.5.2 Sign-Extension
It is often useful to represent a small number with fewer bits. For example, rather than represent the value 5 as 0000000000000101, there are times when it is useful to allocate only six bits to represent the value 5: 000101. There is little confusion, since we are all used to adding leading zeros without affecting the value of a number. A check for $456.78 and a check for $0000456.78 are checks having the same value.
What about negative representations? We obtained the negative representation from its positive counterpart by complementing the positive representation and adding 1. Thus, the representation for -5, given that 5 is represented as 000101, is 111011. If 5 is represented as 0000000000000101, then the representation for -5 is 1111111111111011. In the same way that leading 0s do not affect the value of a positive number, leading 1s do not affect the value of a negative number.
In order to add representations of different lengths, it is first necessary to represent them with the same number of bits. For example, suppose we wish to add the number 13 to -5, where 13 is represented as 0000000000001101 and -5 is represented as 111011. If we do not represent the two values with the same number of bits, we have
      0000000000001101
    +           111011
When we attempt to perform the addition, what shall we do with the missing bits in the representation for -5? If we take the absence of a bit to be a 0, then we are no longer adding -5 to 13. On the contrary, if we take the absence of bits to be 0s, we have changed the -5 to the number represented as 0000000000111011, that is +59. Not surprisingly, then, our result turns out to be the representation for 72.
However, if we understand that a six-bit -5 and a 16-bit -5 differ only in the number of meaningless leading 1s, then we first extend the value of -5 to 16 bits before we perform the addition. Thus, we have
      0000000000001101
    + 1111111111111011
      ----------------
      0000000000001000
and the result is +8, as we should expect.
The value of a positive number does not change if we extend the sign bit 0 as many bit positions to the left as desired. Similarly, the value of a negative number does not change by extending the sign bit 1 as many bit positions to the left as desired. Since in both cases, it is the sign bit that is extended, we refer to the operation as Sign-EXTension, often abbreviated SEXT. Sign-extension is performed in order to be able to operate on bit patterns of different lengths. It does not affect the values of the numbers being represented.
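As a rough illustration, sign-extension on bit strings can be sketched as follows. The names `sext` and `value` are assumptions for this example; `value` interprets a string as a 2's complement integer so the before and after values can be compared.

```python
def sext(bits, width):
    """Sign-extend a 2's complement bit string to `width` bits."""
    if width < len(bits):
        raise ValueError("target width is smaller than the pattern")
    # Copy the sign bit (the leftmost bit) into every new position.
    return bits[0] * (width - len(bits)) + bits


def value(bits):
    """Interpret a bit string as a 2's complement integer."""
    unsigned = int(bits, 2)
    return unsigned - (1 << len(bits)) if bits[0] == "1" else unsigned


six_bit = "111011"                 # -5 in six bits
sixteen_bit = sext(six_bit, 16)
print(sixteen_bit, value(six_bit), value(sixteen_bit))

# Padding with 0s instead changes the value, as the text warns.
print(value(six_bit.rjust(16, "0")))
```

The last line shows the mistake described above: zero-padding the six-bit -5 yields +59 rather than -5.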
2.5.3 Overflow
Up to now, we have always insisted that the sum of two integers be small enough to be represented by the available bits. What happens if such is not the case?
You are undoubtedly familiar with the odometer on the front dashboard of your automobile. It keeps track of how many miles your car has been driven, but only up to a point. In the old days, when the odometer registered 99992 and you drove it 100 miles, its new reading became 00092. A brand new car! The problem, as you know, is that the largest value the odometer could store was 99999, so the value 100092 showed up as 00092. The carryout of the ten-thousands digit was lost. (Of course, if you grew up in Boston, the carryout was not lost at all; it was in full display in the rusted chrome all over the car.)
We say the odometer overflowed. Representing 100092 as 00092 is unacceptable. As more and more cars lasted more than 100,000 miles, car makers felt the pressure to add a digit to the odometer. Today, practically all cars overflow at 1,000,000 miles, rather than 100,000 miles.
The odometer provides an example of unsigned arithmetic. The miles you add are always positive miles. The odometer reads 000129 and you drive 50 miles. The odometer now reads 000179. Overflow is a carry out of the leading digit.
In the case of signed arithmetic, or more particularly, 2’s complement arithmetic, overflow is a little more subtle.
Let’s return to our five-bit 2’s complement data type, which allowed us to represent integers from -16 to +15. Suppose we wish to add +9 and +11. Our arithmetic takes the following form:

      01001
    + 01011
      -----
      10100

Note that the sum is larger than +15, and therefore too large to represent with our 2’s complement scheme. The fact that the number is too large means that the number is larger than 01111, the largest positive number we can represent with a five-bit 2’s complement data type. Note that because our positive result was larger than +15, it generated a carry into the leading bit position. But this bit position is used to indicate the sign of a value. Thus detecting that the result is too large is an easy matter. Since we are adding two positive numbers, the result must be positive. Since the ALU has produced a negative result, something must be wrong. The thing that is wrong is that the sum of the two positive numbers is too large to be represented with the available bits. We say that the result has overflowed the capacity of the representation.
Suppose instead, we had started with negative numbers, for example, -12 and -6. In this case our arithmetic takes the following form:
      10100
    + 11010
      -----
      01110

Here, too, the result has overflowed the capacity of the machine, since -12 + -6 equals -18, which is “more negative” than -16, the negative number with the largest allowable magnitude. The ALU obliges by producing a positive result. Again, this is easy to detect since the sum of two negative numbers cannot be positive.
Note that the sum of a negative number and a positive number never presents a problem. Why is that? See Exercise 2.25.
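The sign-based check described in this section can be sketched in a few lines of Python. The function name `add_with_overflow` is an assumption for illustration; operands are equal-length 2's complement bit strings.

```python
def add_with_overflow(a, b):
    """Add two equal-length 2's complement bit strings.

    Returns (sum_bits, overflowed). Overflow is flagged when both
    operands have the same sign bit and the sum has the other one.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    width = len(a)
    total = (int(a, 2) + int(b, 2)) % (1 << width)  # carry out is discarded
    s = format(total, f"0{width}b")
    overflowed = a[0] == b[0] and s[0] != a[0]
    return s, overflowed


print(add_with_overflow("01001", "01011"))  # +9 + +11
print(add_with_overflow("10100", "11010"))  # -12 + -6
print(add_with_overflow("01001", "11010"))  # +9 + -6
```

The first two calls reproduce the two five-bit examples above and both report overflow; the mixed-sign call does not.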
2.6 Operations on Bits-Part II: Logical Operations
We have seen that it is possible to perform arithmetic (e.g., add, subtract) on values represented as binary patterns. Another class of operations that it is useful to perform on binary patterns is the set of logical operations.
Logical operations operate on logical variables. A logical variable can have one of two values, 0 or 1. The name logical is a historical one; it comes from the fact that the two values 0 and 1 can represent the two logical values false and true, but the use of logical operations has traveled far from this original meaning.
There are several basic logic functions, and most ALUs perform all of them.
2.6.1 The AND Function
AND is a binary logical function. This means it requires two pieces of input data. Said another way, AND requires two source operands. Each source is a logical variable, taking the value 0 or 1. The output of AND is 1 only if both sources have the value 1. Otherwise, the output is 0. We can think of the AND operation as the ALL operation; that is, the output is 1 only if ALL inputs are 1. Otherwise, the output is 0.
A convenient mechanism for representing the behavior of a logical operation is the truth table. A truth table consists of n + 1 columns and $2^n$ rows. The first n columns correspond to the n source operands. Since each source operand is a logical variable and can have one of two values, there are $2^n$ unique values that these source operands can have. Each such set of values (sometimes called an input combination) is represented as one row of the truth table. The final column in the truth table shows the output for each input combination.
In the case of a two-input AND function, the truth table has two columns for source operands, and four ($2^2$) rows for unique input combinations.
    A  B  AND
    0  0   0
    0  1   0
    1  0   0
    1  1   1
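To make the row and column counts concrete, here is a small Python sketch that builds a truth table for any function of n logical variables. The helper name `truth_table` is an assumption for this example.

```python
from itertools import product


def truth_table(fn, n):
    """Return the 2**n rows of a truth table: n input columns plus one output column."""
    return [inputs + (fn(*inputs),) for inputs in product((0, 1), repeat=n)]


def logical_and(a, b):
    """Output is 1 only if both sources are 1."""
    return a & b


for row in truth_table(logical_and, 2):
    print(*row)
```

With n = 2 the loop prints four rows of three columns each, matching the table above.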
We can apply the logical operation AND to two bit patterns of m bits each. This involves applying the operation individually to each pair of bits in the two source operands. For example, if a and b in Example 2.6 are 16-bit patterns, then c is the AND of a and b. This operation is often called a bit-wise AND.
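A bit-wise AND can be sketched by applying the one-bit rule to each pair of bits in turn. In this illustration the function name `bitwise_and` and the two 16-bit operands are assumptions chosen for the example.

```python
def bitwise_and(a, b):
    """Apply AND to each pair of bits in two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    return "".join("1" if x == "1" and y == "1" else "0" for x, y in zip(a, b))


a = "0011101001101001"   # illustrative 16-bit pattern
b = "0101100100100001"   # illustrative second operand
c = bitwise_and(a, b)
print(c)

# Python's built-in & operator on integers gives the same pattern.
print(format(int(a, 2) & int(b, 2), "016b"))
```

Each output bit depends only on the two input bits in the same position, which is what makes the operation bit-wise.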
2.6.2 The OR Function
OR is also a binary logical function. It requires two source operands, both of which are logical variables. The output of OR is 1 if any source has the value 1. Only if both sources are 0 is the output 0. We can think of the OR operation as the ANY operation; that is, the output is 1 if ANY of the two inputs are 1.
The truth table for a two-input OR function is
    A  B  OR
    0  0   0
    0  1   1
    1  0   1
    1  1   1
In the same way that we applied the logical operation AND to two m-bit patterns, we can apply the OR operation bit-wise to two m-bit patterns.
2.6.3 The NOT Function
NOT is a unary logical function. This means it operates on only one source operand. It is also known as the complement operation. The output is formed by complementing the input. We sometimes say the output is formed by inverting the input. A 1 input results in a 0 output. A 0 input results in a 1 output.
The truth table for the NOT function is

    A  NOT
    0   1
    1   0
In the same way that we applied the logical operation AND and OR to two m-bit patterns, we can apply the NOT operation bit-wise to one m-bit pattern. If a is as before, then c is the NOT of a.
    a: 0011101001101001
    c: 1100010110010110
2.6.4 The Exclusive-OR Function
Exclusive-OR, often abbreviated XOR, is a binary logical function. It, too, requires two source operands, both of which are logical variables. The output of XOR is 1 if the two sources are different. The output is 0 if the two sources are the same.
Example 2.9
The truth table for the XOR function is
    A  B  XOR
    0  0   0
    0  1   1
    1  0   1
    1  1   0
In the same way that we applied the logical operation AND to two m-bit patterns, we can apply the XOR operation bit-wise to two m-bit patterns.
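The bit-wise OR, NOT, and XOR operations can be tried out with Python's integer operators. This is a sketch under assumptions: the 16-bit width, the helper `show`, and the operand `b` are chosen here for illustration. Python integers are unbounded, so NOT has to be masked back to 16 bits.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1   # sixteen 1s, used to keep results within 16 bits


def show(v):
    """Format an integer as a 16-bit pattern."""
    return format(v & MASK, f"0{WIDTH}b")


a = 0b0011101001101001
b = 0b0101100100100001    # illustrative second operand

or_ab = a | b             # 1 where ANY source bit is 1
not_a = ~a & MASK         # every bit of a inverted
xor_ab = a ^ b            # 1 where the two source bits differ

print("a OR b :", show(or_ab))
print("NOT a  :", show(not_a))
print("a XOR b:", show(xor_ab))
```

Comparing the OR and XOR lines shows the one difference between them: positions where both sources are 1 give 1 under OR and 0 under XOR.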
[...]

    c: 0110001101001000

Note the distinction between the truth table for XOR shown here and the truth table for OR shown earlier. In the case of exclusive-OR, if both source operands are 1, the output is 0. [...]

[...] BUSYNESS bit vector. The result is the bit vector 01000010.
Recall that we encountered the concept of bit mask in Example 2.7. Recall that a bit mask enables one to interact with some bits of a binary pattern while ignoring the rest. In this case, the bit mask clears bit 7 and leaves unchanged (ignores) bits 6 through 0.
Suppose unit 5 finishes its task and becomes idle. We can update the BUSYNESS bit vector by performing the logical OR of it with the bit mask 00100000. The result is 01100010.
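The two bit-mask updates mentioned in this passage, clearing a bit with AND and setting a bit with OR, can be sketched as follows. The helper names `clear_bit` and `set_bit` are made up for this illustration, and the starting vector 11000010 is an assumption: it is simply a pattern consistent with the two results quoted in the text.

```python
WIDTH = 8


def clear_bit(vector, k):
    """AND with a mask that is 0 in position k and 1 everywhere else."""
    mask = ~(1 << k) & ((1 << WIDTH) - 1)
    return vector & mask


def set_bit(vector, k):
    """OR with a mask that is 1 in position k and 0 everywhere else."""
    mask = 1 << k
    return vector | mask


busyness = 0b11000010                 # assumed starting state (hypothetical)
busyness = clear_bit(busyness, 7)     # AND with mask 01111111
after_clear = format(busyness, "08b")
busyness = set_bit(busyness, 5)       # OR with mask 00100000
after_set = format(busyness, "08b")
print(after_clear, after_set)
```

In both helpers the mask touches exactly one position, so all other bits of the vector pass through unchanged.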
2.7.2 Floating Point Data Type
Most of the arithmetic we will do in this book uses integer values. For example, the LC-3 uses the 16-bit, 2’s complement data type, which provides, in addition to one bit to identify positive or negative, 15 bits to represent the magnitude of the value. With 16 bits used in this way, we can express values between -32,768 and +32,767, that is, between $-2^{15}$ and $+2^{15} - 1$. We say the precision of our value is 15 bits, and the range is $2^{15}$. As you learned in high school chemistry or physics, sometimes we need to express much larger numbers, but we do not require so many digits of precision. In fact, recall the value $6.023 \cdot 10^{23}$, which you may have been required to memorize back then. The range required to express this value is far greater than the $2^{15}$ available with 16-bit 2’s complement integers. On the other hand, the 15 bits of precision available with 16-bit 2’s complement integers is overkill. We need only enough bits to express four significant decimal digits (6023).
So we have a problem. We have more bits than we need for precision. But we don’t have enough bits to represent the range.
The floating point data type is the solution to the problem. Instead of using all the bits (except the sign bit) to represent the precision of a value, the floating point data type allocates some of the bits to the range of values (i.e., how big or small) that can be expressed. The rest of the bits (except for the sign bit) are used for precision.
Most ISAs today specify more than one floating point data type. One of them, usually called float, consists of 32 bits, allocated as follows:
1 bit for the sign (positive or negative)
8 bits for the range (the exponent field)
23 bits for precision (the fraction field)
Example 2.11
2.7 Other Representations
Figure 2.2 The floating point data type

    Field layout (32 bits): S (1 bit) | exponent (8 bits) | fraction (23 bits)

$$N = (-1)^S \times 1.\text{fraction} \times 2^{\text{exponent} - 127}, \quad 1 \le \text{exponent} \le 254$$
In most computers manufactured today, these bits represent numbers according to the formula in Figure 2.2. This formula is part of the IEEE Standard for Floating Point Arithmetic.
Recall that we said that the floating point data type was very much like the scientific notation you learned in high school, and we gave the example $6.023 \cdot 10^{23}$. This representation has three parts: the sign, which is positive, the significant digits 6.023, and the exponent 23. We call the significant digits the fraction. Note that the fraction is normalized, that is, exactly one nonzero decimal digit appears to the left of the decimal point.
The data type and formula of Figure 2.2 also consist of these three parts. Instead of a fraction (i.e., significant digits) of four decimal digits, we have 23 binary digits. Note that the fraction is normalized, that is, exactly one nonzero binary digit appears to the left of the binary point. Since the nonzero binary digit has to be a 1 (1 is the only nonzero binary digit) there is no need to represent that bit explicitly. Thus, the formula of Figure 2.2 shows 24 bits of precision, the 23 bits from the data type and the leading one bit to the left of the binary point that is unnecessary to represent explicitly.
Instead of an exponent of two decimal digits as in $6.023 \cdot 10^{23}$, we have in Figure 2.2 eight binary digits. Instead of a radix of 10, we have a radix of 2. With eight bits to represent the exponent, we can represent 256 exponents. Note that the formula only gives meaning to 254 of them. If the exponent field contains 00000000 (that is, 0) or 11111111 (that is, 255), the formula does not tell you how to interpret the bits. We will look at those two special cases momentarily.
For the remaining 254 values in the exponent field of the floating point data type, the explanation is as follows: The actual exponent being represented is the unsigned number in the data type minus 127. For example, if the actual exponent is +8, the exponent field contains 10000111, which is the unsigned number 135. Note that 135 – 127 = 8. If the actual exponent is -125, the exponent field contains 00000010, which is the unsigned number 2. Note that 2 – 127 = -125.
The third part is the sign bit: 0 for positive numbers, 1 for negative numbers. The formula contains the factor $(-1)^S$, which evaluates to +1 if S = 0, and -1 if S = 1.
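The formula of Figure 2.2 can be evaluated field by field. The sketch below is only an illustration: the names `decode_float` and `machine_value` are assumptions, the decoder deliberately handles only exponent fields 1 through 254, and the standard `struct` module is used as a cross-check by reinterpreting the same 32 bits as a float.

```python
import struct


def decode_float(pattern):
    """Evaluate (-1)^S x 1.fraction x 2^(exponent - 127) for a 32-bit pattern."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected 32 binary digits")
    sign = int(bits[0])
    exponent = int(bits[1:9], 2)   # unsigned value of the 8-bit exponent field
    fraction = int(bits[9:], 2)    # the 23 bits to the right of the binary point
    if not 1 <= exponent <= 254:
        raise ValueError("this sketch covers only exponent fields 1 through 254")
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


def machine_value(pattern):
    """Cross-check: reinterpret the same 32 bits as a float via struct."""
    raw = int(pattern.replace(" ", ""), 2).to_bytes(4, "big")
    return struct.unpack(">f", raw)[0]


example = "0 10000111 00000000000000000000000"  # exponent field 135, actual exponent +8
print(decode_float(example), machine_value(example))
```

The term `1 + fraction / 2**23` supplies the leading 1 that the data type leaves implicit.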
[...]

We noted that the interpretation of the 32 bits required that the exponent field contain neither 00000000 nor 11111111. The IEEE Standard for Floating Point Arithmetic also specifies how to interpret the 32 bits if the exponent field contains 00000000.
If the exponent field contains 00000000, the exponent is -126, and the significant digits are obtained by starting with a leading 0, followed by a binary point, followed by the 23 bits of the fraction field, as follows:

$$N = (-1)^S \times 0.\text{fraction} \times 2^{-126}$$

For example, the floating point data representation

    0 00000000 00001000000000000000000

can be evaluated as follows: the leading 0 means the number is positive. The next eight bits, a zero exponent field, mean the exponent is -126. The last 23 bits form the binary number 0.00001, which equals $2^{-5}$. Therefore, the number represented is $2^{-5} \cdot 2^{-126}$, which is $2^{-131}$. This allows very tiny numbers to be represented.

Example 2.13

[...]
A detailed understanding of IEEE Floating Point Arithmetic is well beyond what should be expected in this first course. Indeed, we have not even considered how to interpret the 32 bits if the exponent field contains 11111111. Our purpose in including this section in the textbook is to at least let you know that there is, in addition to 2’s complement integers, another very important data type available in almost all ISAs. This data type is called floating point; it allows very large and very tiny numbers to be expressed at the expense of reducing the number of binary digits of precision.
2.7.3 ASCII Codes
Another representation of information is the standard code that almost all computer equipment manufacturers have agreed to use for transferring character codes between the main computer processing unit and the input and output devices. That code is an eight-bit code referred to as ASCII. ASCII stands for American Standard Code for Information Interchange. ASCII greatly simplifies the interface between a keyboard manufactured by one company, a computer made by another company, and a monitor made by a third company.
Each key on the keyboard is identified by its unique ASCII code. So, for example, the digit 3 expanded to 8 bits with a leading 0 is 00110011, the digit 2 is 00110010, the lowercase e is 01100101, and the carriage return is 00001101. The entire set of eight-bit ASCII codes is listed in Figure E.3 of Appendix E. When you type a key on the keyboard, the corresponding eight-bit code is stored and made available to the computer. Where it is stored and how it gets into the computer is discussed in Chapter 8.
Most keys are associated with more than one code. For example, the ASCII code for the letter E is 01000101, and the ASCII code for the letter e is 01100101. Both are associated with the same key, although in one case the Shift key is also depressed while in the other case, it is not.
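The codes quoted above are easy to check in Python, whose built-in `ord` returns a character's code. The helper name `ascii_bits` is an assumption for this sketch.

```python
def ascii_bits(ch):
    """Return the ASCII code of a single character, expanded to eight bits."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "08b")


for ch in ["3", "2", "e", "E", "\r"]:
    print(repr(ch), ascii_bits(ch))
```

Comparing the lines for e and E shows that the two codes for the same key differ in a single bit position.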
In order to display a particular character on the monitor, the computer must transfer the ASCII code for that character to the electronics associated with the monitor. That, too, is discussed in Chapter 8.
2.7.4 Hexadecimal Notation
We have seen that information can be represented as 2’s complement integers, as bit vectors, in floating point format, or as an ASCII code. There are other representations also, but we will leave them for another book. However, before we leave this topic, we would like to introduce you to a representation that is used more as a convenience for humans than as a data type to support operations being performed by the computer. This is the hexadecimal notation. As we will see, it evolves nicely from the positional binary notation and is useful for dealing with long strings of binary digits without making errors.
It will be particularly useful in dealing with the LC-3 where 16-bit binary strings will be encountered often.
An example of such a binary string is 0011110101101110
Let’s try an experiment. Cover the preceding 16-bit binary string of 0s and 1s with one hand, and try to write it down from memory. How did you do? Hexadecimal notation is about being able to do this without making mistakes. We shall see how.
In general, a 16-bit binary string takes the form
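$$a_{15}\ a_{14}\ a_{13}\ a_{12}\ a_{11}\ a_{10}\ a_9\ a_8\ a_7\ a_6\ a_5\ a_4\ a_3\ a_2\ a_1\ a_0$$

where each of the bits $a_i$ is either 0 or 1.
If we think of this binary string as an unsigned integer, its value can be computed as

$$a_{15} \cdot 2^{15} + a_{14} \cdot 2^{14} + a_{13} \cdot 2^{13} + a_{12} \cdot 2^{12} + a_{11} \cdot 2^{11} + a_{10} \cdot 2^{10} + a_9 \cdot 2^9 + a_8 \cdot 2^8 + a_7 \cdot 2^7 + a_6 \cdot 2^6 + a_5 \cdot 2^5 + a_4 \cdot 2^4 + a_3 \cdot 2^3 + a_2 \cdot 2^2 + a_1 \cdot 2^1 + a_0 \cdot 2^0$$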
We can factor $2^{12}$ from the first four terms, $2^8$ from the second four terms, $2^4$ from the third set of four terms, and $2^0$ from the last four terms, yielding

$$2^{12}[a_{15} \cdot 2^3 + a_{14} \cdot 2^2 + a_{13} \cdot 2^1 + a_{12} \cdot 2^0] + 2^8[a_{11} \cdot 2^3 + a_{10} \cdot 2^2 + a_9 \cdot 2^1 + a_8 \cdot 2^0] + 2^4[a_7 \cdot 2^3 + a_6 \cdot 2^2 + a_5 \cdot 2^1 + a_4 \cdot 2^0] + 2^0[a_3 \cdot 2^3 + a_2 \cdot 2^2 + a_1 \cdot 2^1 + a_0 \cdot 2^0]$$

Note that the largest value inside a set of square brackets is 15, which would be the case if each of the four bits is 1. If we replace what is inside each square bracket by a symbol representing its value (from 0 to 15), and we replace $2^{12}$ by its equivalent $16^3$, $2^8$ by $16^2$, $2^4$ by $16^1$, and $2^0$ by $16^0$, we have

$$h_3 \cdot 16^3 + h_2 \cdot 16^2 + h_1 \cdot 16^1 + h_0 \cdot 16^0$$

where $h_3$, for example, is a symbol representing

$$a_{15} \cdot 2^3 + a_{14} \cdot 2^2 + a_{13} \cdot 2^1 + a_{12} \cdot 2^0$$

Since the symbols must represent values from 0 to 15, we assign symbols to these values as follows: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. That is, we represent 0000 with the symbol 0, 0001 with the symbol 1, ... 1001 with 9, 1010 with A, 1011 with B, ... 1111 with F. The resulting notation is hexadecimal, or base 16.
So, for example, if the hex digits E92F represent a 16-bit 2’s complement integer, is the value of that integer positive or negative? How do you know?
Now, then, what is this hexadecimal representation good for, anyway? It seems like just another way to represent a number without adding any benefit. Let’s return to the exercise where you tried to write from memory the string
0011110101101110
If we had first broken the string at four-bit boundaries
0011 1101 0110 1110
and then converted each four-bit string to its equivalent hex digit
3D6E
it would have been no problem to jot down (with the string covered) 3D6E.
In summary, hexadecimal notation is mainly used as a convenience for humans. It can be used to represent binary strings that are integers or floating point numbers or sequences of ASCII codes, or bit vectors. It simply reduces the number of digits by a factor of 4, where each digit is in hex (0, 1, 2, ... F) instead of binary (0, 1). The usual result is far fewer copying errors due to too many 0s and 1s.
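The four-bits-per-digit grouping can be sketched in Python as below. The helper names `bin_to_hex` and `hex_to_bin` are assumptions for this illustration, and padding a short string with leading 0s treats it as unsigned.

```python
HEX_DIGITS = "0123456789ABCDEF"


def bin_to_hex(bits):
    """Break a bit string at four-bit boundaries and map each group to a hex digit."""
    bits = bits.replace(" ", "")
    bits = "0" * (-len(bits) % 4) + bits   # pad on the left to a multiple of 4
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))


def hex_to_bin(digits):
    """Expand each hex digit back into its four-bit group."""
    return " ".join(format(int(d, 16), "04b") for d in digits)


print(bin_to_hex("0011110101101110"))
print(hex_to_bin("3D6E"))
```

Because each hex digit stands for exactly four bits, the conversion never needs arithmetic on the whole number, only a lookup per group.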
2.1 Given n bits, how many distinct combinations of the n bits exist?
2.2 There are 26 characters in the alphabet we use for writing English. What is the least number of bits needed to give each character a unique bit pattern? How many bits would we need to distinguish between upper- and lowercase versions of all 26 characters?
2.3 a. Assume that there are about 400 students in your class. If every student is to be assigned a unique bit pattern, what is the minimum number of bits required to do this?
b. How many more students can be admitted to the class without requiring additional bits for each student’s unique bit pattern?
2.4 Given n bits, how many unsigned integers can be represented with the n bits? What is the range of these integers?
2.5 Using 5 bits to represent each number, write the representations of 7 and -7 in 1’s complement, signed magnitude, and 2’s complement integers.
2.6 Write the 6-bit 2’s complement representation of -32.
2.7 Create a table showing the decimal values of all 4-bit 2’s complement numbers.
2.8 a. What is the largest positive number one can represent in an 8-bit 2’s complement code? Write your result in binary and decimal.
b. What is the greatest magnitude negative number one can represent in an 8-bit 2’s complement code? Write your result in binary and decimal.
c. What is the largest positive number one can represent in n-bit 2’s complement code?
d. What is the greatest magnitude negative number one can represent in n-bit 2’s complement code?
2.9 How many bits are needed to represent Avogadro’s number ($6.02 \cdot 10^{23}$) in 2’s complement binary representation?
2.10 Convert the following 2’s complement binary numbers to decimal.
a. 1010
b. 01011010
c. 11111110
d. 0011100111010011
2.11 Convert these decimal numbers to 8-bit 2’s complement binary numbers.
a. 102
b. 64
c. 33
d. -128
e. 127
Exercises 43
2.12 If the last digit of a 2’s complement binary number is 0, then the number is even. If the last two digits of a 2’s complement binary number are 00 (e.g., the binary number 01100), what does that tell you about the number?
2.13 Without changing their values, convert the following 2’s complement binary numbers into 8-bit 2’s complement numbers.
a. 1010
b. 011001
c. 1111111000
d. 01
2.14 Add the following bit patterns. Leave your results in binary form.
a. 1011 + 0001
b. 0000 + 1010
c. 1100 + 0011
d. 0101 + 0110
e. 1111 + 0001
2.15 It was demonstrated in Example 2.5 that shifting a binary number one bit to the left is equivalent to multiplying the number by 2. What operation is performed when a binary number is shifted one bit to the right?
2.16 Write the results of the following additions as both 8-bit binary and decimal numbers. For each part, use standard binary addition as described in Section 2.5.1.
a. Add the 1’s complement representation of 7 to the 1’s complement representation of -7.
b. Add the signed magnitude representation of 7 to the signed magnitude representation of -7.
c. Add the 2’s complement representation of 7 to the 2’s complement representation of -7.
2.17 Add the following 2’s complement binary numbers. Also express the answer in decimal.
a. 01 + 1011
b. 11 + 01010101
c. 0101 + 110
d. 01 + 10
2.18 Add the following unsigned binary numbers. Also, express the answer in decimal.
a. 01 + 1011
b. 11 + 01010101
c. 0101 + 110
d. 01 + 10
2.19 Express the negative value -27 as a 2’s complement integer, using eight bits. Repeat, using 16 bits. Repeat, using 32 bits. What does this illustrate with respect to the properties of sign extension as they pertain to 2’s complement representation?
2.20 The following binary numbers are 4-bit 2’s complement binary numbers. Which of the following operations generate overflow? Justify your answer by translating the operands and results into decimal.
a. 1100 + 0011
b. 1100 + 0100
c. 0111 + 0001
d. 1000 - 0001
e. 0111 + 1001
2.21 Describe what conditions indicate overflow has occurred when two 2’s complement numbers are added.
2.22 Create two 16-bit 2’s complement integers such that their sum causes an overflow.
2.23 Describe what conditions indicate overflow has occurred when two unsigned numbers are added.
2.24 Create two 16-bit unsigned integers such that their sum causes an overflow.
2.25 Why does the sum of a negative 2’s complement number and a positive 2’s complement number never generate an overflow?
2.26 You wish to express -64 as a 2’s complement number.
a. How many bits do you need (the minimum number)?
b. With this number of bits, what is the largest positive number you can represent? (Please give answer in both decimal and binary).
c. With this number of bits, what is the largest unsigned number you can represent? (Please give answer in both decimal and binary).
2.27 The LC-3, a 16-bit machine, adds the two 2’s complement numbers 0101010101010101 and 0011100111001111, producing 1000111100100100. Is there a problem here? If yes, what is the problem? If no, why not?
2.28 When is the output of an AND operation equal to 1?
2.29 Fill in the following truth table for a one-bit AND operation.

    X  Y  X AND Y
    0  0
    0  1
    1  0
    1  1

2.30 Compute the following. Write your results in binary.
a. 01010111 AND 11010111
b. 101 AND 110
c. 11100000 AND 10110100
d. 00011111 AND 10110100
e. (0011 AND 0110) AND 1101
f. 0011 AND (0110 AND 1101)
2.31 When is the output of an OR operation equal to 1?
2.32 Fill in the following truth table for a one-bit OR operation.

    X  Y  X OR Y
    0  0
    0  1
    1  0
    1  1

2.33 Compute the following:
a. 01010111 OR 11010111
b. 101 OR 110
c. 11100000 OR 10110100
d. 00011111 OR 10110100
e. (0101 OR 1100) OR 1101
f. 0101 OR (1100 OR 1101)
2.34 Compute the following:
a. NOT (1011) OR NOT (1100)
b. NOT (1000 AND (1100 OR 0101))
c. NOT (NOT (1101))
d. (0110 OR 0000) AND 1111
2.35 In Example 2.11, what are the masks used for?
2.36 Refer to Example 2.11 for the following questions.
a. What mask value and what operation would one use to indicate that machine 2 is busy?
b. What mask value and what operation would one use to indicate that machines 2 and 6 are no longer busy? (Note: This can be done with only one operation.)
c. What mask value and what operation would one use to indicate that all machines are busy?
d. What mask value and what operation would one use to indicate that all machines are idle?
e. Develop a procedure to isolate the status bit of machine 2 as the sign bit. For example, if the BUSYNESS pattern is 01011100, then the output of this procedure is 10000000. If the BUSYNESS pattern is 01110011, then the output is 00000000. In general, if the BUSYNESS pattern is:

    b7 b6 b5 b4 b3 b2 b1 b0

the output is:

    b2 0 0 0 0 0 0 0

Hint: What happens when you ADD a bit pattern to itself?
2.37 If n and m are both 4-bit 2’s complement numbers, and s is the 4-bit result of adding them together, how can we determine, using only the logical operations described in Section 2.6, if an overflow occurred during the addition? Develop a “procedure” for doing so. The inputs to the procedure are n, m, and s, and the output will be a bit pattern of all zeros (0000) if no overflow occurred and 1000 if an overflow did occur.
2.38 If n and m are both 4-bit unsigned numbers, and s is the 4-bit result of adding them together, how can we determine, using only the logical operations described in Section 2.6, if an overflow occurred during the addition? Develop a “procedure” for doing so. The inputs to the procedure are n, m, and s, and the output will be a bit pattern of all zeros (0000) if no overflow occurred and 1000 if an overflow did occur.
2.39 Write IEEE floating point representation of the following decimal numbers.
a. 3.75
b. -55~
c. 3.1415927
d. 64,000
2.40 Write the decimal equivalents for these IEEE floating point numbers.
a. 0 10000000 00000000000000000000000
b. 1 10000011 00010000000000000000000
c. 0 11111111 00000000000000000000000
d. 1 10000000 10010000000000000000000
2.41 a. What is the largest exponent the IEEE standard allows for a 32-bit floating point number?
b. What is the smallest exponent the IEEE standard allows for a 32-bit floating point number?
2.42 A computer programmer wrote a program that adds two numbers. The programmer ran the program and observed that when 5 is added to 8, the result is the character m. Explain why this program is behaving erroneously.
2.43 Translate the following ASCII codes into strings of characters by interpreting each group of eight bits as an ASCII character.
a. x48656c6c6f21
b. x68454c4c4f21
c. x436f6d70757465727321
d. x4c432d32
2.44 What operation(s) can be used to convert the binary representation for 3 (i.e., 0000 0011) into the ASCII representation for 3 (i.e., 0011 0011)? What about the binary 4 into the ASCII 4? What about any digit?
2.45 Convert the following unsigned binary numbers to hexadecimal.
a. 1101 0001 1010 1111
b. 001 1111
c. 1
d. 1110 1101 1011 0010
2.46 Convert the following hexadecimal numbers to binary.
a. x10
b. x801
c. xF731
d. x0F1E2D
e. xBCAD
2.47 Convert the following hexadecimal representations of 2’s complement binary numbers to decimal numbers.
a. xF0
b. x7FF
c. x16
d. x8000
2.48 Convert the following decimal numbers to hexadecimal representations of 2’s complement numbers.
a. 256
b. 111
c. 123,456,789
d. -44
2.49 Perform the following additions. The corresponding 16-bit binary numbers are in 2’s complement notation. Provide your answers in hexadecimal.
a. x025B + x26DE
b. x7D96 + xF0A0
c. xA397 + xA35D
d. x7D96 + x7412
e. What else can you say about the answers to parts c and d?
2.50 Perform the following logical operations. Express your answers in hexadecimal notation.
a. x5478 AND xFDEA
b. xABCD OR x1234
c. NOT((NOT(xDEFA)) AND (NOT(xFFFF)))
d. x00FF XOR x325C
2.51 What is the hexadecimal representation of the following numbers?
a. 25,675
b. 675.625 (that is, 675 5/8), in the IEEE 754 floating point standard
c. The ASCII string: Hello
2.52 Consider two hexadecimal numbers: x434F4D50 and x55544552. What values do they represent for each of the five data types shown?
Unsigned binary
1’s complement
2’s complement
IEEE 754 floating point
ASCII string
2.53 Fill in the truth table for the equations given. The first line is done as an example.
Q1 = NOT(A AND B)
Q2 = NOT(NOT(A) AND NOT(B))
Express Q2 another way.
2.54 Fill in the truth table for the equations given. The first line is done as an example.
Q1 = NOT(NOT(X) OR (X AND Y AND Z))
Q2 = NOT((Y OR Z) AND (X AND Y AND Z))
X Y Z Q1 Q2
0 0 0 0 1
2.55 We have represented numbers in base-2 (binary) and in base-16 (hex). We are now ready for unsigned base-4, which we will call quad numbers. A quad digit can be 0, 1, 2, or 3.
a. What is the maximum unsigned decimal value that one can represent with 3 quad digits?
b. What is the maximum unsigned decimal value that one can represent with n quad digits? (Hint: your answer should be a function of n.)
c. Add the two unsigned quad numbers: 023 and 221.
d. What is the quad representation of the decimal number 42?
e. What is the binary representation of the unsigned quad number 123.3?
f. Express the unsigned quad number 123.3 in IEEE floating point format.
g. Given a black box which takes m quad digits as input and produces one quad digit for output, what is the maximum number of unique functions this black box can implement?
2.56 Define a new 8-bit floating point format with 1 sign bit, 4 bits of exponent, using an excess-7 code (that is, the bias is 7), and 3 bits of fraction. If xE5 is the bit pattern for a number in this 8-bit floating point format, what value does it have? (Express as a decimal number.)
chapter 3
Digital Logic Structures
In Chapter 1, we stated that computers were built from very large numbers of very simple structures. For example, Intel’s Pentium IV microprocessor, first offered for sale in 2000, was made up of more than 42 million MOS transistors. The IBM Power PC 750 FX, released in 2002, consists of more than 38 million MOS transistors. In this chapter, we will explain how the MOS transistor works (as a logic element), show how these transistors are connected to form logic gates, and then show how logic gates are interconnected to form larger units that are needed to construct a computer. In Chapter 4, we will connect those larger units into a computer.
But first, the transistor.
3.1 The Transistor
Most computers today, or rather most microprocessors (which form the core of the computer) are constructed out of MOS transistors. MOS stands for metal-oxide semiconductor. The electrical properties of metal-oxide semiconductors are well beyond the scope of what we want to understand in this course. They are below our lowest level of abstraction, which means that if somehow transistors start misbehaving, we are at their mercy. It is unlikely that we will have any problems from the transistors.
However, it is useful to know that there are two types of MOS transistors: p-type and n-type. They both operate “logically,” very similar to the way wall switches work.
Figure 3.1 A simple electric circuit showing the use of a wall switch (120-volt power supply, wall switch, lamp)
Figure 3.2 The n-type MOS transistor: (a) by itself, with gate, source, and drain terminals; (b) in a circuit with a 2.9-volt battery (power supply); (c) shorthand notation
Figure 3.1 shows the most basic of electrical circuits: a power supply (in this case, the 120 volts that come into your house), a wall switch, and a lamp (plugged into an outlet in the wall). In order for the lamp to glow, electrons must flow; in order for electrons to flow, there must be a closed circuit from the power supply to the lamp and back to the power supply. The lamp can be turned on and off by simply manipulating the wall switch to make or break the closed circuit.
Instead of the wall switch, we could use an n-type or a p-type MOS transistor to make or break the closed circuit. Figure 3.2 shows a schematic rendering of an n-type transistor (a) by itself, and (b) in a circuit. Note (Figure 3.2a) that the transistor has three terminals. They are called the gate, the source, and the drain. The reasons for the names source and drain are not of interest to us in this course. What is of interest is the fact that if the gate of the n-type transistor is supplied with 2.9 volts, the connection from source to drain acts like a piece of wire. We say (in the language of electricity) that we have a closed circuit between the source and drain. If the gate of the n-type transistor is supplied with 0 volts, the connection between the source and drain is broken. We say that between the source and drain we have an open circuit.
Figure 3.2b shows the n-type transistor in a circuit with a battery and a bulb. When the gate is supplied with 2.9 volts, the transistor acts like a piece of wire, completing the circuit and causing the bulb to glow. When the gate is supplied with 0 volts, the transistor acts like an open circuit, breaking the circuit, and causing the bulb not to glow.
Figure 3.3 A p-type MOS transistor (gate, source, and drain terminals)
Figure 3.2c is a shorthand notation for describing the circuit of Figure 3.2b. Rather than always showing the power supply and the complete circuit, electrical engineers usually show only the terminals of the power supply. The fact that the power supply itself completes the circuit is well understood, and so is not usually shown.
The p-type transistor works in exactly the opposite fashion from the n-type transistor. Figure 3.3 shows the schematic representation of a p-type transistor. When the gate is supplied with 0 volts, the p-type transistor acts (more or less) like a piece of wire, closing the circuit. When the gate is supplied with 2.9 volts, the p-type transistor acts like an open circuit. Because the p-type and n-type transistors act in this complementary way, we refer to circuits that contain both p-type and n-type transistors as CMOS circuits, for complementary metal-oxide semiconductor.
3.2 Logic Gates
One step up from the transistor is the logic gate. That is, we construct basic logic structures out of individual MOS transistors. In Chapter 2, we studied the behavior of the AND, the OR, and the NOT functions. In this chapter we construct transistor circuits that implement each of these functions. The corresponding circuits are called AND, OR, and NOT gates.
3.2.1 The NOT Gate (Inverter)
Figure 3.4 shows the simplest logic structure that exists in a computer. It is constructed from two MOS transistors, one p-type and one n-type. Figure 3.4a is the schematic representation of that circuit. Figure 3.4b shows the behavior of the circuit if the input is supplied with 0 volts. Note that the p-type transistor conducts and the n-type transistor does not conduct. The output is, therefore, connected to 2.9 volts. On the other hand, if the input is supplied with 2.9 volts, the p-type transistor does not conduct, but the n-type transistor does conduct. The output in this case is connected to ground (i.e., 0 volts). The complete behavior of the circuit can be described by means of a table, as shown in Figure 3.4c. If we replace 0 volts by the symbol 0 and 2.9 volts by the symbol 1, we have the truth table (Figure 3.4d) for the complement or NOT function, which we studied in Chapter 2.
Figure 3.4 A CMOS inverter
(c)
In           Out
0 volts      2.9 volts
2.9 volts    0 volts
(d)
In  Out
0   1
1   0
In other words, we have just shown how to construct an electronic circuit that implements the NOT logic function discussed in Chapter 2. We call this circuit a NOT gate, or an inverter.
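The switch-like behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the original text: the function names are our own, and each transistor is treated as an ideal switch that responds to exactly 0 or 2.9 volts.

```python
V_HIGH = 2.9  # supply voltage used in the text
V_LOW = 0.0   # ground

def n_type_conducts(gate_volts):
    """An n-type transistor acts like a piece of wire when its gate is at 2.9 volts."""
    return gate_volts == V_HIGH

def p_type_conducts(gate_volts):
    """A p-type transistor acts like a piece of wire when its gate is at 0 volts."""
    return gate_volts == V_LOW

def inverter(in_volts):
    """CMOS inverter: the p-type connects Out to 2.9 volts, the n-type connects Out to ground."""
    pull_up = p_type_conducts(in_volts)
    pull_down = n_type_conducts(in_volts)
    # Exactly one of the two paths conducts, so the output is never ambiguous.
    assert pull_up != pull_down
    return V_HIGH if pull_up else V_LOW

for v in (V_LOW, V_HIGH):
    print(v, "volts in ->", inverter(v), "volts out")
```

The two printed lines correspond to the two rows of Figure 3.4c.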
3.2.2 OR and NOR Gates
Figure 3.5 illustrates a NOR gate. Figure 3.5a is a schematic of a circuit that implements a NOR gate. It contains two p-type and two n-type transistors.
Figure 3.5b shows the behavior of the circuit if A is supplied with 0 volts and B is supplied with 2.9 volts. In this case, the lower of the two p-type transistors produces an open circuit, and the output C is disconnected from the 2.9-volt power supply. However, the leftmost n-type transistor acts like a piece of wire, connecting the output C to 0 volts.
Note that if both A and B are supplied with 0 volts, the two p-type transistors conduct, and the output C is connected to 2.9 volts. Note further that there is no ambiguity here, since both n-type transistors act as open circuits, and so C is disconnected from ground.
If either A or B is supplied with 2.9 volts, the corresponding p-type transistor results in an open circuit. That is sufficient to break the connection from C to the 2.9-volt source. However, 2.9 volts supplied to the gate of one of the n-type transistors is sufficient to cause that transistor to conduct, resulting in C being connected to ground (i.e., 0 volts).
Figure 3.5 The NOR gate
(c)
A            B            C
0 volts      0 volts      2.9 volts
0 volts      2.9 volts    0 volts
2.9 volts    0 volts      0 volts
2.9 volts    2.9 volts    0 volts
(d)
A  B  C
0  0  1
0  1  0
1  0  0
1  1  0
Figure 3.5c summarizes the complete behavior of the circuit of Figure 3.5a. It shows the behavior of the circuit for each of the four pairs of voltages that can be supplied to A and B. That is,
A = 0 volts, B = 0 volts
A = 0 volts, B = 2.9 volts
A = 2.9 volts, B = 0 volts
A = 2.9 volts, B = 2.9 volts
If we replace the voltages with their logical equivalents, we have the truth table of Figure 3.5d. Note that the output C is exactly the opposite of the logical OR function that we studied in Chapter 2. In fact, it is the NOT-OR function, more typically abbreviated as NOR. We refer to the circuit that implements the NOR function as a NOR gate.
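As a rough check on this reasoning, the NOR circuit can be modeled at the switch level in Python. The sketch assumes what the description implies: the two p-type transistors sit in series between the 2.9-volt supply and C (both must conduct), and the two n-type transistors sit in parallel between C and ground (either one suffices). The function name is our own.

```python
V_HIGH, V_LOW = 2.9, 0.0

def nor_gate(a_volts, b_volts):
    """Switch-level model of the NOR circuit; inputs must be 0.0 or 2.9 volts."""
    # p-type transistors conduct when their gates are at 0 volts; in series, both must conduct.
    pull_up = (a_volts == V_LOW) and (b_volts == V_LOW)
    # n-type transistors conduct when their gates are at 2.9 volts; in parallel, one is enough.
    pull_down = (a_volts == V_HIGH) or (b_volts == V_HIGH)
    assert pull_up != pull_down  # C is connected to exactly one of supply or ground
    return V_HIGH if pull_up else V_LOW

for a in (V_LOW, V_HIGH):
    for b in (V_LOW, V_HIGH):
        print(a, b, nor_gate(a, b))
```

The four printed rows reproduce the voltage table of Figure 3.5c.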
If we augment the circuit of Figure 3.5a by adding an inverter at the output, as shown in Figure 3.6a, we have at the output D the logical function OR. Figure 3.6a is the circuit for an OR gate. Figure 3.6b describes the behavior of this circuit if the input variable A is set to 0 and the input variable B is set to 1. Figure 3.6c shows the circuit’s truth table.
Figure 3.6 The OR gate
(c)
A  B  C  D
0  0  1  0
0  1  0  1
1  0  0  1
1  1  0  1
3.2.3 AND and NAND Gates
Figure 3.7 shows an AND gate. Note that if either A or B is supplied with 0 volts, there is a direct connection from C to the 2.9-volt power supply. The fact that C is at 2.9 volts means the n-type transistor whose gate is connected to C provides a path from D to ground. Therefore, if either A or B is supplied with 0 volts, the output D of the circuit of Figure 3.7 is 0 volts.
Again, we note that there is no ambiguity. The fact that at least one of the two inputs A or B is supplied with 0 volts means that at least one of the two n-type transistors whose gates are connected to A or B is open, and that consequently, C is disconnected from ground. Furthermore, the fact that C is at 2.9 volts means the p-type transistor whose gate is connected to C is open-circuited. Therefore, D is not connected to 2.9 volts.
On the other hand, if both A and B are supplied with 2.9 volts, then both of their corresponding p-type transistors are open. However, their corresponding n-type transistors act like pieces of wire, providing a direct connection from C to ground. Because C is at ground, the rightmost p-type transistor acts like a closed circuit, forcing D to 2.9 volts.
Figure 3.7 The AND gate (the portion within the dashed lines, with output C, is the NAND gate)
(b)
A  B  C  D
0  0  1  0
0  1  1  0
1  0  1  0
1  1  0  1
Figure 3.7b summarizes in truth table form the behavior of the circuit of Figure 3.7a. Note that the circuit is an AND gate. The circuit shown within the dashed lines (i.e., having output C) is a NOT-AND gate, which we generally abbreviate as NAND.
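The same switch-level view applies here, with the roles of series and parallel exchanged. The sketch below is an assumption-laden model rather than the book's circuit netlist: it takes the p-type transistors to be in parallel (either input at 0 volts connects C to the supply) and the n-type transistors to be in series (both inputs at 2.9 volts are needed to connect C to ground), and it appends an inverter to obtain D. All names are illustrative.

```python
V_HIGH, V_LOW = 2.9, 0.0

def nand_gate(a_volts, b_volts):
    """Output C of the circuit inside the dashed lines; inputs must be 0.0 or 2.9 volts."""
    pull_up = (a_volts == V_LOW) or (b_volts == V_LOW)        # p-types in parallel
    pull_down = (a_volts == V_HIGH) and (b_volts == V_HIGH)   # n-types in series
    assert pull_up != pull_down
    return V_HIGH if pull_up else V_LOW

def inverter(in_volts):
    """CMOS inverter: 0 volts in gives 2.9 volts out, and vice versa."""
    return V_HIGH if in_volts == V_LOW else V_LOW

def and_gate(a_volts, b_volts):
    """Output D: the NAND output C followed by an inverter."""
    c = nand_gate(a_volts, b_volts)
    return inverter(c)

for a in (V_LOW, V_HIGH):
    for b in (V_LOW, V_HIGH):
        print(a, b, nand_gate(a, b), and_gate(a, b))
```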
The gates just discussed are very common in digital logic circuits and in digital computers. There are millions of inverters (NOT gates) in the Pentium IV microprocessor. As a convenience, we can represent each of these gates by standard symbols, as shown in Figure 3.8. The bubble shown in the inverter, NAND, and NOR gates signifies the complement (i.e., NOT) function.
From now on, we will not draw circuits showing the individual transistors. Instead, we will raise our level of abstraction and use the symbols shown in Figure 3.8.
Figure 3.8 Basic logic gates: (a) Inverter (b) AND gate (c) OR gate (d) NAND gate (e) NOR gate
3.2.4 DeMorgan’s Law
Figure 3.9 DeMorgan’s law
(c)
A  B  NOT(A)  NOT(B)  AND  C
0  0  1       1       1    0
0  1  1       0       0    1
1  0  0       1       0    1
1  1  0       0       0    1
Note (see Figure 3.9a) that one can complement an input before applying it to a gate. Consider the effect on the two-input AND gate if we apply the complements of A and B as inputs to the gate, and also complement the output of the AND gate. The “bubbles” at the inputs to the AND gate designate that the inputs A and B are complemented before they are used as inputs to the AND gate.
Figure 3.9b shows the behavior of this structure for the input combination A = 0, B = 1. For ease of representation, we have moved the bubbles away from the inputs and the output of the AND gate. That way, we can more easily see what happens to each value as it passes through a bubble.
Figure 3.9c summarizes by means of a truth table the behavior of the logic circuit of Figure 3.9a for all four combinations of input values. Note that the NOT of A is represented as $\overline{A}$.
We can describe the behavior of this circuit algebraically:
$\overline{\overline{A} \text{ AND } \overline{B}} = A \text{ OR } B$
We can also state this behavior in English:
“It is not the case that both A and B are false” is equivalent to saying “At
least one of A and B is true.”
This equivalence is known as DeMorgan’s law. Is there a similar result if one
inverts both inputs to an OR gate, and then inverts the output?
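Because there are only four input combinations, the equivalence stated above can be checked exhaustively. The following Python sketch does so; the helper names NOT, AND, and OR are our own and operate on the integers 0 and 1.

```python
from itertools import product

def NOT(x):
    return 1 - x

def AND(x, y):
    return x & y

def OR(x, y):
    return x | y

def demorgan_rows():
    """Return (A, B, NOT(NOT A AND NOT B), A OR B) for all four input combinations."""
    return [(a, b, NOT(AND(NOT(a), NOT(b))), OR(a, b))
            for a, b in product((0, 1), repeat=2)]

for a, b, lhs, rhs in demorgan_rows():
    print(a, b, lhs, rhs)
    assert lhs == rhs  # the two sides agree on every row
```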
3.2.5 Larger Gates
Before we leave the topic of logic gates, we should note that the notion of AND, OR, NAND, and NOR gates extends to larger numbers of inputs. One could build a three-input AND gate or a four-input OR gate, for example. An n-input AND gate has an output value of 1 only if ALL n input variables have values of 1. If any of the n inputs has a value of 0, the output of the n-input AND gate is 0. An n-input OR gate has an output value of 1 if ANY of the n input variables has a value of 1. That is, an n-input OR gate has an output value of 0 only if ALL n input variables have values of 0.
Figure 3.10 A three-input AND gate
(a)
A  B  C  OUT
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  0
1  0  0  0
1  0  1  0
1  1  0  0
1  1  1  1
Figure 3.10 illustrates a three-input AND gate. Figure 3.10a shows its truth table. Figure 3.10b shows the symbol for a three-input AND gate.
Can you draw a transistor-level circuit for a three-input AND gate? How about a four-input AND gate? How about a four-input OR gate?
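At the logic level (not the transistor level asked about above), the n-input definitions translate almost word for word into code. A minimal Python sketch, with function names of our own choosing:

```python
def and_n(*inputs):
    """n-input AND: 1 only if ALL inputs are 1."""
    return int(all(inputs))

def or_n(*inputs):
    """n-input OR: 1 if ANY input is 1."""
    return int(any(inputs))

# Reproduce the truth table of the three-input AND gate.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, and_n(a, b, c))
```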
3.3 Combinational Logic Circuits
Now that we understand the workings of the basic logic gates, the next step is to build some of the logic structures that are important components of the microarchitecture of a computer.
There are fundamentally two kinds of logic structures, those that include the storage of information and those that do not. In Sections 3.4, 3.5, and 3.6, we will deal with structures that store information. In this section, we will deal with those that do not. These structures are sometimes referred to as decision elements. Usually, they are referred to as combinational logic structures, because their outputs are strictly dependent on the combination of input values that are being applied to the structure right now. Their outputs are not at all dependent on any past history of information that is stored internally, since no information can be stored internally in a combinational logic circuit.
We will next examine a decoder, a mux, and a full adder.
3.3.1 Decoder
Figure 3.11 shows a logic gate description of a two-input decoder. A decoder has the property that exactly one of its outputs is 1 and all the rest are 0s. The one output that is logically 1 is the output corresponding to the input pattern that it is expected to detect. In general, decoders have n inputs and 2^n outputs. We say the output line that detects the input pattern is asserted. That is, that output line has the value 1, rather than 0 as is the case for all the other output lines. In Figure 3.11, note that for each of the four possible combinations of inputs A and B, exactly one output has the value 1 at any one time. In Figure 3.11b, the input to the decoder is 10, resulting in the third output line being asserted.
The decoder is useful in determining how to interpret a bit pattern. We will see in Chapter 5 that the work to be carried out by each instruction in the LC-3 is determined by a four-bit pattern, called an opcode, that is part of the instruction. A 4-to-16 decoder is a simple combinational logic structure for identifying what work is to be performed by each instruction.
Figure 3.11 A two-input decoder
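A decoder of any size can be sketched as one AND gate per input pattern, where each AND gate sees an input directly if the corresponding pattern bit is 1 and sees its complement otherwise. The Python below is such a sketch; the function name and the convention that the first argument is the high-order bit (so that outputs are listed in the order 00, 01, 10, 11) are our assumptions.

```python
def decoder(*inputs):
    """n-input decoder: returns 2**n outputs, exactly one of which is 1."""
    n = len(inputs)
    outputs = []
    for pattern in range(2 ** n):
        asserted = 1
        for position, bit in enumerate(inputs):
            expected = (pattern >> (n - 1 - position)) & 1
            # Use the input as-is where the pattern has a 1, its complement where it has a 0.
            asserted &= bit if expected else 1 - bit
        outputs.append(asserted)
    return outputs

print(decoder(1, 0))  # [0, 0, 1, 0]: the third output line is asserted
```

With four inputs, the same function models the 4-to-16 decoder mentioned above.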
3.3.2 Mux
Figure 3.12a shows a gate-level description of a two-input multiplexer, more commonly referred to as a mux. The function of a mux is to select one of the inputs and connect it to the output. The select signal (S in Figure 3.12) determines which input is connected to the output. The mux of Figure 3.12 works as follows: Suppose S = 0, as shown in Figure 3.12b. Since the output of an AND gate is 0 unless all inputs are 1, the output of the rightmost AND gate is 0. Also, the output of the leftmost AND gate is whatever the input A is. That is, if A = 0, then the output of the leftmost AND gate is 0, and if A = 1, then the output is 1. Since the output of the rightmost AND gate is 0, it has no effect on the OR gate. Consequently, the output at C is exactly the same as the output of the leftmost AND gate. The net result of all this is that if S = 0, the output C is identical to the input A.
On the other hand, if S = 1, it is B that is ANDed with 1, resulting in the output of the OR gate having the value of B.
In summary, the output C is always connected to either the input A or the input B; which one depends on the value of the select line S. We say S selects the source of the mux (either A or B) to be routed through to the output C. Figure 3.12c shows the standard representation for a mux.
In general, a mux consists of 2^n inputs and n select lines. Figure 3.13a shows a gate-level description of a four-input mux. It requires two select lines. Figure 3.13b shows the standard representation for a four-input mux.
Can you construct the gate-level representation for an eight-input mux? How many select lines must you have?
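The two-input mux described above can be written out gate by gate. In the Python sketch below, the function names are ours, and the mapping of select values to inputs in the four-input version (00 selects A, 01 selects B, 10 selects C, 11 selects D) is an assumption made for illustration.

```python
def mux2(a, b, s):
    """Two-input mux: C = A when S = 0, C = B when S = 1."""
    left = a & (1 - s)   # leftmost AND gate: passes A when S = 0
    right = b & s        # rightmost AND gate: passes B when S = 1
    return left | right  # OR gate

def mux4(a, b, c, d, s1, s0):
    """Four-input mux with two select lines (select encoding assumed, see above)."""
    n1, n0 = 1 - s1, 1 - s0
    return (a & n1 & n0) | (b & n1 & s0) | (c & s1 & n0) | (d & s1 & s0)

print(mux2(0, 1, 0), mux2(0, 1, 1))  # 0 1
```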
Figure 3.12 A 2-to-1 mux
Figure 3.13 A four-input mux
3.3.3 Full Adder
In Chapter 2, we discussed binary addition. Recall that a simple algorithm for binary addition is to proceed as you have always done in the case of decimal addition, from right to left, one column at a time, adding the two digits from the two values plus the carry in, and generating a sum digit and a carry to the next column. The only difference is you get a carry after 1, rather than after 9.
Figure 3.14 is a truth table that describes the result of binary addition on one column of bits within two n-bit operands. At each column, there are three values that must be added: one bit from each of the two operands and the carry from the previous column. We designate these three bits as a_i, b_i, and carry_i. There are two results, the sum bit (s_i) and the carryover to the next column, carry_{i+1}. Note that if only one of the three bits equals 1, we get a sum of 1, and no carry (i.e., carry_{i+1} = 0). If two of the three bits equal 1, we get a sum of 0, and a carry of 1. If all three bits equal 1, the sum is 3, which in binary addition corresponds to a sum of 1 and a carry of 1.
Figure 3.14 A truth table for a binary adder
a_i  b_i  carry_i  carry_{i+1}  s_i
0    0    0        0            0
0    0    1        0            1
0    1    0        0            1
0    1    1        1            0
1    0    0        0            1
1    0    1        1            0
1    1    0        1            0
1    1    1        1            1
Figure 3.15 is the gate-level description of the truth table of Figure 3.14. Note that each AND gate in Figure 3.15 produces an output 1 for exactly one of the eight input combinations of a_i, b_i, and carry_i. The output of the OR gate for C_{i+1} must be 1 in exactly those cases where the corresponding input combinations in Figure 3.14 produce an output 1. Therefore the inputs to the OR gate that generates C_{i+1} are the outputs of the AND gates corresponding to those input combinations. Similarly, the inputs to the OR gate that generates S_i are the outputs of the AND gates corresponding to the input combinations that require an output 1 for S_i in the truth table of Figure 3.14.
Figure 3.15 Gate-level description of a full adder
Figure 3.16 A circuit for adding two 4-bit binary numbers
Note that since the input combination 000 does not result in an output 1 for either C_{i+1} or S_i, its corresponding AND gate is not an input to either of the two OR gates.
We call the logic circuit of Figure 3.15 that provides three inputs (a_i, b_i, and carry_i) and two outputs (the sum bit s_i and the carryover to the next column carry_{i+1}) a full adder.
Figure 3.16 illustrates a circuit for adding two 4-bit binary numbers, using four of the full adder circuits of Figure 3.15. Note that the carryout of column i is an input to the addition performed in column i + 1.
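The full adder and the 4-bit adder built from it can be sketched directly from the truth table. In the Python below, each product term stands for one AND gate and each `|` chain for one OR gate; the function names and the convention of passing operands as lists ordered from bit 3 down to bit 0 are our own choices for illustration.

```python
def full_adder(a, b, carry_in):
    """One column of binary addition: returns (sum bit, carry out)."""
    na, nb, nc = 1 - a, 1 - b, 1 - carry_in
    # Sum is 1 for input combinations 001, 010, 100, 111.
    s = (na & nb & carry_in) | (na & b & nc) | (a & nb & nc) | (a & b & carry_in)
    # Carry out is 1 for input combinations 011, 101, 110, 111.
    carry_out = (na & b & carry_in) | (a & nb & carry_in) | (a & b & nc) | (a & b & carry_in)
    return s, carry_out

def add4(a_bits, b_bits):
    """Add two 4-bit numbers given as [bit3, bit2, bit1, bit0]; returns (sum bits, carry out)."""
    carry = 0                      # carry into the rightmost column is 0
    result = []
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        s, carry = full_adder(a, b, carry)   # carry out of column i feeds column i + 1
        result.insert(0, s)
    return result, carry

print(add4([0, 1, 0, 1], [0, 0, 1, 1]))  # 5 + 3 -> ([1, 0, 0, 0], 0)
```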
3.3.4 The Programmable Logic Array (PLA)
Figure 3.17 illustrates a very common building block for implementing any collection of logic functions one wishes to. The building block is called a programmable logic array (PLA). It consists of an array of AND gates (called an AND array) followed by an array of OR gates (called an OR array). The number of AND gates corresponds to the number of input combinations (rows) in the truth table. For n-input logic functions, we need a PLA with 2^n n-input AND gates. In Figure 3.17, we have 2^3 3-input AND gates. The number of OR gates corresponds to the number of output columns in the truth table. The implementation algorithm is simply to connect the output of an AND gate to the input of an OR gate if the corresponding row of the truth table produces an output 1 for that output column. Hence the notion of programmable. That is, we say we program the connections from AND gate outputs to OR gate inputs to implement our desired logic functions.
Figure 3.15 showed eight AND gates connected to two OR gates since our requirement was to implement two functions (sum and carry) of three input variables. Figure 3.17 shows a PLA that can implement any four functions of three variables one wishes to, by appropriately connecting AND gate outputs to OR gate inputs.
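The implementation algorithm can be sketched as follows. This Python model is illustrative only: `program_pla` and its argument format are hypothetical names of our own, and the AND array is modeled behaviorally (each AND gate outputs 1 exactly when the inputs match its truth-table row).

```python
from itertools import product

def program_pla(n_inputs, outputs):
    """Build a PLA evaluator.

    outputs maps each output name to the set of input rows (tuples of bits)
    for which that output column of the truth table is 1.
    """
    rows = list(product((0, 1), repeat=n_inputs))  # one AND gate per truth-table row
    # "Programming": record which AND gate outputs are wired to each OR gate.
    connections = {name: [rows.index(row) for row in ones]
                   for name, ones in outputs.items()}

    def evaluate(*inputs):
        and_array = [int(tuple(inputs) == row) for row in rows]
        return {name: int(any(and_array[i] for i in wired))
                for name, wired in connections.items()}

    return evaluate

# Program the PLA as a full adder; inputs are taken in the order (a, b, carry in).
adder = program_pla(3, {
    "sum":   {(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)},
    "carry": {(0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)},
})
print(adder(1, 0, 1))  # {'sum': 0, 'carry': 1}
```

Changing only the dictionary passed to `program_pla` yields a different set of functions from the same structure, which is the sense in which the array is programmable.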
Figure 3.17 A programmable logic array
3.3.5 Logical Completeness
Before we leave the topic of combinational logic circuits, it is worth noting an important property of building blocks for logic circuits: logical completeness. We showed in Section 3.3.4 that any logic function we wished to implement could be accomplished with a PLA. We saw that the PLA consists of only AND gates, OR gates, and inverters. That means that any logic function we wish to implement can be accomplished, provided that enough AND, OR, and NOT gates are available. We say that the set of gates {AND, OR, NOT} is logically complete because we can build a circuit to carry out the specification of any truth table we wish without using any other kind of gate. That is, the set of gates {AND, OR, NOT} is logically complete because a barrel of AND gates, a barrel of OR gates, and a barrel of NOT gates are sufficient to build a logic circuit that carries out the specification of any desired truth table. The barrels may have to be big ones, but the point is, we do not need any other kind of gate to do the job.
3.4 Basic Storage Elements
Recall our statement at the beginning of Section 3.3 that there are two kinds of logic structures, those that involve the storage of information and those that do not. We have discussed three examples of those that do not: the decoder, the mux, and the full adder. Now we are ready to discuss logic structures that do include the storage of information.
3.4.1 The R-S Latch
A simple example of a storage element is the R-S latch. It can store one bit of information. The R-S latch can be implemented in many ways, the simplest being the one shown in Figure 3.18. Two 2-input NAND gates are connected such that the output of each is connected to one of the inputs of the other. The remaining inputs S and R are normally held at a logic level 1.
The R-S latch works as follows: We start with what we call the quiescent (or quiet) state, where inputs S and R both have logic value 1. We consider first the case where the output a is 1. Since that means the input A equals 1 (and we know the input R equals 1 since we are in the quiescent state), the output b must be 0. That, in turn, means the input B must be 0, which results in the output a equal to 1. As long as the inputs S and R remain 1, the state of the circuit will not change. We say the R-S latch stores the value 1 (the value of the output a).
If, on the other hand, we assume the output a is 0, then the input A must be 0, and the output b must be 1. This, in turn, results in the input B equal to 1, and combined with the input S equal to 1 (again due to quiescence) results in the output a equal to 0. Again, as long as the inputs S and R remain 1, the state of the circuit will not change. In this case, we say the R-S latch stores the value 0.
The latch can be set to 1 by momentarily setting S to 0, provided we keep the value of R at 1. Similarly, the latch can be set to 0 by momentarily setting R to 0, provided we keep the value of S at 1. We use the term set to denote setting a variable to 0 or 1, as in “set to 0” or “set to 1.” In addition, we often use the term clear to denote the act of setting a variable to 0.
If we clear S, then a equals 1, which in turn causes A to equal 1. Since R is also 1, the output at b must be 0. This causes B to be 0, which in turn makes a equal to 1. If we now return S to 1, it does not affect a, since B is also 0, and only one input to a NAND gate must be 0 in order to guarantee that the output of the NAND gate is 1. Thus, the latch continues to store a 1 long after S returns to 1.
In the same way, we can clear the latch (set the latch to 0) by momentarily setting R to 0.
We should also note that in order for the R-S latch to work properly, one must take care that it is never the case that both S and R are allowed to be set to 0 at the same time. If that does happen, the outputs a and b are both 1, and the final state of the latch depends on the electrical properties of the transistors making up the gates and not on the logic being performed. How the electrical properties of the transistors will determine the final state in this case is a subject we will have to leave for a later semester.
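The behavior traced above can be imitated with a small simulation. The Python sketch below is a simplified model under stated assumptions: the class and method names are ours, the gates are evaluated one after the other until the outputs stop changing (real gates switch concurrently), and the forbidden case S = R = 0 is rejected outright because its outcome is not determined by the logic.

```python
def nand(x, y):
    return 1 - (x & y)

class RSLatch:
    """Two cross-coupled NAND gates: a = NAND(S, b), b = NAND(R, a)."""

    def __init__(self, stored=0):
        self.a = stored        # output a is the stored value
        self.b = 1 - stored    # output b is its complement in a stable state

    def apply(self, s, r):
        if s == 0 and r == 0:
            raise ValueError("S and R must not both be 0")
        for _ in range(4):     # a few passes are enough for the outputs to settle
            new_a = nand(s, self.b)
            new_b = nand(r, new_a)
            if (new_a, new_b) == (self.a, self.b):
                break
            self.a, self.b = new_a, new_b
        return self.a

latch = RSLatch()
print(latch.apply(0, 1))  # momentarily clear S: the latch is set to 1
print(latch.apply(1, 1))  # quiescent state: the latch still stores 1
print(latch.apply(1, 0))  # momentarily clear R: the latch is set to 0
print(latch.apply(1, 1))  # quiescent state: the latch still stores 0
```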
Figure 3.18 An R-S latch
3.4.2 The Gated D Latch
For a latch to be useful, we must be able to control when it is set and when it is cleared. A simple way to accomplish this is with the gated latch.
Figure 3.19 shows a logic circuit that implements a gated D latch. It consists of the R-S latch of Figure 3.18, plus two additional gates that allow the latch to be set to the value of D, but only when WE is asserted. WE stands for write enable. When WE is not asserted (i.e., when WE equals 0), the outputs S and R are both equal to 1. Since S and R are also inputs to the R-S latch, if they are kept at 1, the value stored in the latch remains unchanged, as we explained in Section 3.4.1. When WE is momentarily asserted (i.e., set to 1), exactly one of the outputs S or R is set to 0, depending on the value of D. If D equals 1, then S is set to 0. If D equals 0, then both inputs to the lower NAND gate are 1, resulting in R being set to 0. As we saw earlier, if S is set to 0, the R-S latch is set to 1. If R is set to 0, the R-S latch is set to 0. Thus, the R-S latch is set to 1 or 0 according to whether D is 1 or 0. When WE returns to 0, S and R return to 1, and the value stored in the R-S latch persists.
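A gated D latch can be sketched by placing two more NAND gates in front of the cross-coupled pair. The Python below is a hedged model, not the book's circuit: it assumes S = NAND(D, WE) and R = NAND(NOT D, WE), consistent with the description above, evaluates the gates sequentially for a fixed number of passes, and starts the latch at 0 (a real latch powers up in an unspecified state). The class name is our own.

```python
def nand(x, y):
    return 1 - (x & y)

class GatedDLatch:
    def __init__(self):
        self.a, self.b = 0, 1   # assumed initial state: latch stores 0

    def apply(self, d, we):
        s = nand(d, we)         # S goes to 0 only when D = 1 and WE = 1
        r = nand(1 - d, we)     # R goes to 0 only when D = 0 and WE = 1
        for _ in range(3):      # let the cross-coupled NAND gates settle
            self.a = nand(s, self.b)
            self.b = nand(r, self.a)
        return self.a

latch = GatedDLatch()
print(latch.apply(d=1, we=1))  # write 1
print(latch.apply(d=0, we=0))  # WE not asserted: D is ignored, 1 persists
print(latch.apply(d=0, we=1))  # write 0
print(latch.apply(d=1, we=0))  # WE not asserted: 0 persists
```

Because S and R are derived from D and its complement, they can never both be 0, so the problematic case of the R-S latch does not arise here.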
3.4.3 A Register
We have already seen in Chapter 2 that it is useful to deal with values consisting of more than one bit. In Chapter 5, we will introduce the LC-3 computer, where most values are represented by 16 bits. It is useful to be able to store these larger numbers of bits as self-contained units. The register is a structure that stores a number of bits, taken together as a unit. That number can be as large as is useful or as small as I. In the LC-3, we will need many 16-bit registers, and also a few one-bit registers. We will see in Figure 3.33, which describes the internal structure of the LC-3, that PC, IR, and MAR are all 16-bit registers, and that N, Z, and P are all one-bit registers.
Figure 3.20 shows a four-bit register made up of four gated D latches. The four-bit value stored in the register is Q3, Q2, Q1, Q0. The value D3, D2, D1, D0 can be written into the register when WE is asserted.
Figure 3.19 A gated D latch
Figure 3.20 A four-bit register
Note: A common shorthand notation to describe a sequence of bits that are numbered as just described is Q[3:0]. That is, each bit is assigned its own bit number. The rightmost bit is bit [0], and the numbering continues from right to left. If there are n bits, the leftmost bit is bit [n-1]. For example, in the following 16-bit pattern,
0011101100011110
bit [15] is 0, bit [14] is 0, bit [13] is 1, bit [12] is 1, and so on.
We can designate a subunit of this pattern with the notation Q[l:r], where l is the leftmost bit in the subunit and r is the rightmost bit in the subunit. We call such a subunit a field.
In this 16-bit pattern, if A[15:0] is the entire 16-bit pattern, then, for example:
A[15:12] is 0011
A[13:7] is 1110110
A[2:0] is 110
A[1:1] is 1
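The field notation can be checked mechanically. The short Python sketch below is only an illustration: the helper name field is our own, and it assumes the pattern is given as a string of 0s and 1s with bit [0] as the rightmost character, following the right-to-left numbering used here.

```python
def field(pattern, left, right):
    """Return bits [left:right] of a bit string whose rightmost bit is bit 0."""
    n = len(pattern)
    # Bit i sits at string index n - 1 - i, so the slice runs from
    # index n - 1 - left up to and including index n - 1 - right.
    return pattern[n - 1 - left : n - right]


A = "0011101100011110"
print(field(A, 15, 12))
print(field(A, 13, 7))
print(field(A, 2, 0))
print(field(A, 1, 1))
```

The four calls reproduce the four fields listed above for the same 16-bit pattern.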
We should also point out that the numbering scheme from right to left is purely arbitrary. We could just as easily have designated the leftmost bit as bit [0] and numbered them from left to right. Indeed, many people do. So, it is not important whether the numbering scheme is left to right or right to left. But it is important that the bit numbering be consistent in a given setting, that is, that it is always done the same way. In our work, we will always number bits from right to left.
3.5 The Concept of Memory
We now have all the tools we need to describe one of the most important structures in the electronic digital computer, its memory. We will see in Chapter 4 how memory fits into the basic scheme of computer processing, and you will see throughout the rest of the book and indeed the rest of your work with computers how important the concept of memory is to computing.
Memory is made up of a (usually large) number of locations, each uniquely identifiable and each having the ability to store a value. We refer to the unique identifier associated with each memory location as its address. We refer to the number of bits of information stored in each location as its addressability.
For example, an advertisement for a personal computer might say, “This computer comes with 16 megabytes of memory.” Actually, most ads generally use the abbreviation 16 MB. This statement means, as we will explain momentarily, that the computer system includes 16 million memory locations, each containing 1 byte of information.
3.5.1 Address Space
We refer to the total number of uniquely identifiable locations as the memory’s address space. A 16 MB memory, for example, refers to a memory that consists of 16 million uniquely identifiable memory locations.
Actually, the number 16 million is only an approximation, due to the way we identify memory locations. Since everything else in the computer is represented by sequences of 0s and 1s, it should not be surprising that memory locations are identified by binary addresses as well. With n bits of address, we can uniquely identify 2^n locations. Ten bits provide 1,024 locations, which is approximately 1,000. If we have 20 bits to represent each address, we have 2^20 uniquely identifiable locations, which is approximately 1 million. Thus 16 mega really corresponds to the number of uniquely identifiable locations that can be specified with 24 address bits. We say the address space is 2^24, which is exactly 16,777,216 locations, rather than 16,000,000, although we colloquially refer to it as 16 million.
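The arithmetic in this paragraph is easy to verify. The following Python sketch is illustrative only; the function names address_space and address_bits_needed are our own.

```python
def address_space(address_bits):
    """Number of uniquely identifiable locations with the given address bits."""
    return 2 ** address_bits


def address_bits_needed(locations):
    """Smallest number of address bits that can identify this many locations."""
    return max(1, (locations - 1).bit_length())


for bits in (10, 20, 24):
    print(bits, "address bits identify", address_space(bits), "locations")

print(address_bits_needed(16_777_216), "bits are needed for 16,777,216 locations")
```

Running it shows the gap between the colloquial "16 million" and the exact count that 24 address bits provide.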
3.5.2 Addressability
The number of bits stored in each memory location is the memory’s addressability. A 16 megabyte memory is a memory consisting of 16,777,216 memory locations, each containing 1 byte (i.e., 8 bits) of storage. Most memories are byte-addressable. The reason is historical; most computers got their start processing data, and one character stroke on the keyboard corresponds to one 8-bit ASCII character, as we learned in Chapter 2. If the memory is byte-addressable, then each ASCII code occupies one location in memory. Uniquely identifying each byte of memory allowed individual bytes of stored information to be changed easily.
Many computers that have been designed specifically to perform large scientific calculations are 64-bit addressable. This is due to the fact that numbers used in scientific calculations are often represented as 64-bit floating point quantities. Recall that we discussed the floating point data type in Chapter 2. Since scientific calculations are likely to use numbers that require 64 bits to represent them, it is reasonable to design a memory for such a computer that stores one such number in each uniquely identifiable memory location.
3.5.3 A 2^2-by-3-Bit Memory
Figure 3.21 illustrates a memory of size 2^2 by 3 bits. That is, the memory has an address space of four locations, and an addressability of 3 bits. A memory of size 2^2 requires 2 bits to specify the address. A memory of addressability 3 stores 3 bits of information in each memory location. Accesses of memory require decoding the address bits. Note that the address decoder takes as input A[1:0] and asserts exactly one of its four outputs, corresponding to the word line being addressed. In Figure 3.21, each row of the memory corresponds to a unique three-bit word; thus the term word line. Memory can be read by applying the address A[1:0], which asserts the word line to be read. Note that each bit of the memory is ANDed with its word line and then ORed with the corresponding bits of the other words. Since only one word line can be asserted at a time, this is effectively a mux with the output of the decoder providing the select function to each bit line. Thus, the appropriate word is read.
Figure 3.21 A 2^2-by-3-bit memory
Figure 3.22 shows the process of reading location 3. The code for 3 is 11. The address A[1:0] = 11 is decoded, and the bottom word line is asserted. Note that the three other decoder outputs are not asserted. That is, they have the value 0. The value stored in location 3 is 101. These three bits are each ANDed with their word line producing the bits 101, which are supplied to the three output OR gates. Note that all other inputs to the OR gates are 0, since they have been produced by ANDing with unasserted word lines. The result is that D[2:0] = 101. That is, the value stored in location 3 is output by the OR gates.
Figure 3.22 Reading location 3 in our 2^2-by-3-bit memory
Memory can be written in a similar fashion. The address specified by A[1:0] is presented to the address decoder, resulting in the correct word line being asserted. With WE asserted as well, the three bits Di[2:0] can be written into the three gated latches corresponding to that word line.
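The read and write paths of this small memory can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of the figure: the names decode and Memory4x3 are our own, each word is stored as a list [D2, D1, D0], and the contents of locations 0 through 2 are made up (only the value 101 in location 3 comes from the text).

```python
def decode(a1, a0):
    """2-to-4 decoder: exactly one of the four word lines is 1."""
    return [
        (1 - a1) & (1 - a0),  # word line 0
        (1 - a1) & a0,        # word line 1
        a1 & (1 - a0),        # word line 2
        a1 & a0,              # word line 3
    ]


class Memory4x3:
    """Sketch of a four-location, 3-bit-per-location memory."""

    def __init__(self, contents):
        self.words = [list(word) for word in contents]

    def read(self, a1, a0):
        word_lines = decode(a1, a0)
        out = []
        for bit in range(3):
            value = 0
            for loc in range(4):
                # AND each stored bit with its word line, then OR the results.
                value |= self.words[loc][bit] & word_lines[loc]
            out.append(value)
        return out

    def write(self, a1, a0, data, we):
        word_lines = decode(a1, a0)
        for loc in range(4):
            # A word is written only if its word line and WE are both 1.
            if word_lines[loc] & we:
                self.words[loc] = list(data)


# Locations 0-2 hold arbitrary example values; location 3 holds 101.
mem = Memory4x3([[0, 0, 0], [0, 1, 0], [1, 1, 0], [1, 0, 1]])
print(mem.read(1, 1))
mem.write(0, 1, [1, 1, 1], we=1)
print(mem.read(0, 1))
```

Because only one word line is ever 1, the inner OR simply passes through the bits of the selected word, which is the mux behavior noted in the text.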
3.6 Sequential Logic Circuits
In Section 3.3, we discussed digital logic structures that process information (decision structures, we call them) wherein the outputs depend solely on the values that are present on the inputs now. Examples are muxes, decoders, and full adders. We call these structures combinational logic circuits. In these circuits there is no sense of the past. Indeed, there is no capability for storing any information of anything that happened before the present time. In Sections 3.4 and 3.5, we described structures that do store information: in Section 3.4, some basic storage elements, and in Section 3.5, a simple 2^2-by-3-bit memory.
In this section, we discuss digital logic structures that can both process information (i.e., make decisions) and store information. That is, these structures base their decisions not only on the input values now present, but also (and this is very important) on what has happened before. These structures are usually called sequential logic circuits. They are distinguishable from combinational logic circuits because, unlike combinational logic circuits, they contain storage elements that allow them to keep track of prior history information. Figure 3.23 shows a block diagram of a sequential logic circuit. Note the storage elements. Note, also, that the output can be dependent on both the inputs now and the values stored in the storage elements. The values stored in the storage elements reflect the history of what has happened before.
Figure 3.23 Sequential logic circuit block diagram
Sequential logic circuits are used to implement a very important class of mechanisms called finite state machines. We use finite state machines in essentially all branches of engineering. For example, they are used as controllers of electrical systems, mechanical systems, aeronautical systems, and so forth. A traffic light controller that sets the traffic light to red, yellow, or green depends on the light that is currently on (history information) and input information from sensors such as trip wires on the road and optical devices that are monitoring traffic.
We will see in Chapter 4 when we introduce the von Neumann model of a computer that a finite state controller is at the heart of the computer. It controls the processing of information by the computer.
3.6.1 A Simple Example: The Combination Lock
A simple example shows the difference between combinational logic structures and sequential logic structures. Suppose one wishes to secure a bicycle with a lock, but does not want to carry a key. A common solution is the combination lock. The person memorizes a “combination” and uses this to open the lock. Two common types of locks are shown in Figure 3.24.
In Figure 3.24a, the lock consists of a dial, with the numbers from 0 to 30 equally spaced around its circumference. To open the lock, one needs to know the “combination.” One such combination could be: R13-L22-R3. If this were the case, one would open the lock by turning the dial two complete turns to the right, and then continuing until the dial points to 13, followed by one complete turn to the left, and then continuing until the dial points to 22, followed by turning the dial again to the right until it points to 3. At that point, the lock opens. What is important here is the sequence of the turns. The lock will not open, for example, if one performed two turns to the right, and then stopped on 20, followed by one complete turn to the left, ending on 22, followed by one turn to the right, ending on 3. That is, even though the final position of the dial is 3, the lock would not open. Why? Because the lock stores the previous rotations and makes its decision (open or don’t open) on the basis of the current input value (R3) and the history of the past operations. This mechanism is a simple example of a sequential structure.
Figure 3.24 Combination locks
Another type of lock is shown in Figure 3.24b. The mechanism consists of (usually) four wheels, each containing the digits 0 through 9. When the digits are lined up properly, the lock will open. In this case, the combination is the set of four digits. Whether or not this lock opens is totally independent of the past rotations of the four wheels. The lock does not care at all about past rotations. The only thing important is the current value of each of the four wheels. This is a simple example of a combinational structure.
It is curious that in our everyday speech, both mechanisms are referred to as “combination locks.” In fact, only the lock of Figure 3.24b is a combinational lock. The lock of Figure 3.24a would be better called a sequential lock!
3.6.2 The Concept of State
For the mechanism of Figure 3.24a to work properly, it has to keep track of the sequence of rotations leading up to the opening of the lock. In particular, it has to differentiate the correct sequence R13-L22-R3 from all other sequences. For example, R13-L29-R3 must not be allowed to open the lock. Likewise, R10-L22-R3 must also not be allowed to open the lock. The problem is that, at any one time, the only external input to the lock is the current rotation.
For the lock of Figure 3.24a to work, it must identify several relevant situations, as follows:
A. The lock is not open, and NO relevant operations have been performed.
B. The lock is not open, but the user has just completed the R13 operation.
C. The lock is not open, but the user has just completed R13, followed by L22.
D. The lock is open.
We have labeled these four situations A, B, C, and D. We refer to each of these situations as the state of the lock.
The notion of state is a very important concept in computer engineering, and actually, in just about all branches of engineering. The state of a mechanism (more generally, the state of a system) is a snapshot of that system in which all relevant items are explicitly expressed.
That is: The state of a system is a snapshot of all the relevant elements of the system at the moment the snapshot is taken.
In the case of the lock of Figure 3.24a, there are four states A, B, C, and D. Either the lock is open (State D), or if it is not open, we have already performed either zero (State A), one (State B), or two (State C) correct operations. This is the sum total of all possible states that can exist. Exercise: Why is that the case? That is, what would be the snapshot of a fifth state that describes a possible situation for the combination lock?
There are many common examples of systems that can be easily described by means of states.
The state of a game of basketball can be described by the scoreboard in the basketball arena. Figure 3.25 shows the state of the basketball game as Texas 73, Oklahoma 68, 7 minutes and 38 seconds left in the second half, 14 seconds left on the shot clock, Texas with the ball, and Texas and Oklahoma each with four team fouls. This is a snapshot of the basketball game. It describes the state of the basketball game right now. If, 12 seconds later, a Texas player were to score a two-point shot, the new state would be described by the updated scoreboard. That is, the score would then be Texas 75, Oklahoma 68, the time remaining in the game would be 7 minutes and 26 seconds, the shot clock would be back to 25 seconds, and Oklahoma would have the ball.
The game of tic-tac-toe can also be described in accordance with the notion of state. Recall that the game is played by two people (or, in our case, a person and the computer). The state is a snapshot of the game in progress each time the computer asks the person to make a move. The game is played as follows: There are nine locations on the diagram. The person and then the computer take turns placing an X (the person) and an O (the computer) in an empty location. The person goes first. The winner is the first to place three symbols (three Xs for the person, three Os for the computer) in a straight line, either vertically, horizontally, or diagonally.
Figure 3.25 An example of a state
Figure 3.26 Three states in a tic-tac-toe machine
The initial state, before either the person or computer has had a turn, is shown in Figure 3.26a. Figure 3.26b shows a possible state of the game when the person is prompted for a second move, if he/she put an X in the upper left corner as the first move. In the state shown, the computer put an O in the middle square as its first move. Figure 3.26c shows a possible state of the game when the person is being prompted for a third move if he/she put an X in the upper right corner on the second move (after putting the first X in the upper left corner). In the state shown, the computer put its second O in the upper middle location.
3.6.3 Finite State Machines
We have seen that a state is a snapshot of all relevant parts of a system at a particular point in time. At other times, that system can be in other states. The behavior of a system can often be best understood by describing it as a finite state machine.
A finite state machine consists of five elements:
1. a finite number of states
2. a finite number of external inputs
3. a finite number of external outputs
4. an explicit specification of all state transitions
5. an explicit specification of what determines each external output value.
The set of states represents all possible situations (or snapshots) that the system can be in. Each state transition describes what it takes to get from one state to another.
The State Diagram
A finite state machine can be conveniently represented by means of a state diagram. Figure 3.27 is an example of a state diagram. A state diagram is drawn as a set of circles, where each circle corresponds to one state, and a set of connections between some of the states, where each connection is drawn as an arrow. The more sophisticated term for “connection” is arc. Each arc identifies the transition from one state to another. The arrowhead on each arc specifies which state the system is coming from, and which state it is going to. We refer to the state the system is coming from as the current state, and the state it is going to as the next state. The finite state machine represented by the state diagram of Figure 3.27 consists of three states, with six state transitions. Note that there is no state transition from state Y to state X.
Figure 3.27 A state diagram
It is often the case that from a current state there are multiple transitions to next states. The state transition that occurs depends on the values of the external inputs. In Figure 3.27, if the current state is state X and the external input has value 0, the next state is state Y. If the current state is state X and the external input has the value 1, the next state is state Z. In short, the next state is determined by the combination of the current state and the current external input.
The output values of a system can be determined just by the current state of the system, or they can be determined by the combination of the current state and the values of the current external inputs. In all the cases we will study, the output values are specified by the current state of the system. In Figure 3.27, the output is 101 when the system is in state X, the output is 110 when the system is in state Y, and 001 when the system is in state Z.
Figure 3.28 is a state diagram of the combination lock of Figure 3.24a, for which the correct combination is R13, L22, R3. Note the four states, labeled A, B, C, D, identifying whether the lock is open, or, in the cases where it is not open, the number of correct rotations performed up to now. The external inputs are the possible rotation operations. The output is the condition “open” or “do not open.” The output is explicitly associated with each state. That is, in states A, B, and C, the output is “do not open.” In state D, the output is “open.” Note further that the “arcs” out of each state comprise all possible operations that one could perform when the mechanism is in that state. For example, when in state B, all possible rotations can be described as (1) L22 and (2) everything except L22. Note that there are two arrows emanating from state B in Figure 3.28, corresponding to these two cases.
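A state diagram of this kind translates naturally into a small table of transitions. The Python sketch below is one possible rendering, not a transcription of Figure 3.28. The names EXPECTED, next_state, output, and run are our own, and two choices are assumptions: every incorrect operation returns the lock to state A, and once the lock is open (state D) the next operation is treated exactly as it would be in state A.

```python
# For each state: the one operation that advances, and the state it leads to.
# The entry for state D is an assumption (the lock relocks and starts over).
EXPECTED = {
    "A": ("R13", "B"),
    "B": ("L22", "C"),
    "C": ("R3", "D"),
    "D": ("R13", "B"),
}


def next_state(state, operation):
    """Next state from the current state and the current external input."""
    expected, target = EXPECTED[state]
    # Assumption: any operation other than the expected one goes back to A.
    return target if operation == expected else "A"


def output(state):
    """The output depends only on the current state."""
    return "open" if state == "D" else "do not open"


def run(operations, state="A"):
    """Apply a sequence of rotation operations, starting in state A."""
    for operation in operations:
        state = next_state(state, operation)
    return state


print(output(run(["R13", "L22", "R3"])))
print(output(run(["R13", "L29", "R3"])))
print(output(run(["R10", "L22", "R3"])))
```

Note that next_state takes both the current state and the current input, while output takes the state alone, mirroring the two kinds of specification a finite state machine requires.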
Figure 3.28 State diagram of the combination lock of Figure 3.24
We could similarly draw a state diagram for the basketball game we described earlier, where each state would be one possible configuration of the scoreboard. A transition would occur if either the referee blew a whistle or the other team got the ball. We showed earlier the transition that would be caused by Texas scoring a two-point shot. Clearly, the number of states in the finite state machine describing a basketball game would be huge. Also clearly, the number of legitimate transitions from one state to another is small, compared to the number of arcs one could draw connecting arbitrary pairs of states. The input is the activity that occurred on the basketball court since the last transition. Some input values are: Texas scored two points, Oklahoma scored three points, Texas stole the ball, Oklahoma successfully rebounded a Texas shot, and so forth. The output is the final result of the game. The output has three values: Game still in progress, Texas wins, Oklahoma wins.
Can one have an arc from a state where the score is Texas 30, Oklahoma 28 to a state where the score is tied, Texas 30, Oklahoma 30? See Exercise 3.38.
Is it possible to have two states, one where Texas is ahead 30-28 and the other where the score is tied 30-30, but no arc between the two? See Exercise 3.39.
The Clock
There is still one important property of the behavior of finite state machines that we have not discussed: the mechanism that triggers the transition from one state to the next. In the case of the “sequential” combination lock, the mechanism is the completion of rotating the dial in one direction, and the start of rotating the dial in the opposite direction. In the case of the basketball game, the mechanism is triggered by the referee blowing a whistle, or someone scoring or the other team otherwise getting the ball.
Frequently, the mechanism that triggers the transition from one state to the next is a clock circuit. A clock circuit, or, more commonly, a clock, is a signal whose value alternates between 0 volts and some specified fixed voltage. In digital logic terms, a clock is a signal whose value alternates between 0 and 1. Figure 3.29 illustrates the value of the clock signal as a function of time. A clock cycle is one interval of the repeated sequence of intervals shown in Figure 3.29.
Figure 3.29 A clock signal
Figure 3.30 A traffic danger sign
In electronic circuit implementations of a finite state machine, the transition from one state to another occurs at the start of each clock cycle.
3.6.4 An Example: The Complete Implementation of a Finite State Machine
We conclude this section with the logic specification of a sequential logic circuit that implements a finite state machine. Our example is a controller for a traffic danger sign, as shown in Figure 3.30. Note the sign says, “Danger, Move Right.” The sign also contains five lights (labeled 1 through 5 in the figure).
Like many sequential logic circuits, the purpose of our controller is to direct the behavior of a system. In our case, the system is the set of lights on the traffic danger sign. The controller’s job is to have the five lights flash on and off as follows: During one cycle, all lights will be off. The next cycle, lights 1 and 2 will be on. The next cycle, lights 1, 2, 3, and 4 will be on. The next cycle, all five lights will be on. Then the sequence repeats: next cycle, no lights on, followed by 1 and 2 on, followed by 1, 2, 3, and 4 on, and so forth. Each cycle is to last 1⁄2 second.
Figure 3.31 State diagram for the traffic danger sign controller
Figure 3.31 is a finite state machine that describes the behavior of the traffic danger sign. Note that there are four states, one for each of the four relevant situations. Note the transitions from each state to the next state. If the switch is on (input = 1), the lights flash in the sequence described. If the switch is turned off, the state always transfers immediately to the “all off” state.
Figure 3.32 shows the implementation of a sequential logic circuit that implements the finite state machine of Figure 3.31. Figure 3.32a is a block diagram, similar to Figure 3.23. Note that there is one external input, a switch that determines whether or not the lights should flash. There are three external outputs, one to control when lights 1 and 2 are on, one to control when lights 3 and 4 are on, and one to control when light 5 is on. Note that there are two internal storage elements that are needed to keep track of which state the controller is in, which is determined by the past behavior of the traffic danger sign. Note finally that there is a clock signal that must have a cycle time of 1⁄2 second in order for the state transitions to occur every 1⁄2 second.
The only relevant history that must be retained is the state that we are transitioning from. Since there are only four states, we can uniquely identify them with two bits. Therefore, only two storage elements are needed. Figure 3.31 shows the two-bit code used to identify each of the four states.
Combinational Logic
Figure 3.32b shows the combinational logic circuit required to complete the implementation of the controller for the traffic danger sign. Two sets of outputs of the combinational logic circuit are required for the controller to work properly: a set of external outputs for the lights and a set of internal outputs to determine the inputs to the two storage elements that keep track of the state.
First, let us look at the outputs that control the lights. As we have said, there are only three outputs necessary to control the lights. Light 5 is controlled by the output of the AND gate labeled X, since the only time light 5 is on is if the switch is on, and the controller is in state 11. Lights 3 and 4 are controlled by the output of the OR gate labeled Y, since there are two states in which those lights are on, those labeled 10 and 11. Why are lights 1 and 2 controlled by the output of the OR gate labeled Z? See Exercise 3.42.
Figure 3.32 Sequential logic circuit implementation of Figure 3.30: (a) Block diagram, (b) The combinational logic circuit, (c) A storage element (a master-slave flip-flop)
Next, let us look at the internal outputs that control the storage elements. Storage element 1 should be set to 1 for the next clock cycle if the next state is to be 10 or 11. This is true only if the switch is on and the current state is either 01 or 10. Therefore the output signal that will make storage element 1 be 1 in the next clock cycle is the output of the OR gate labeled W. Why is the next state of storage element 2 controlled by the output of the OR gate labeled U? See Exercise 3.42.
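The behavior of the controller can be summarized as two small functions, one for the next state and one for the lights. The Python sketch below is a behavioral illustration only and does not reproduce the gates labeled U through Z in Figure 3.32b. The names next_state and lights are our own, s1 and s0 stand for storage elements 1 and 2, the next-state rule for s0 is read off the sequence 00, 01, 10, 11 described earlier, and gating lights 1 through 4 with the switch (as the text does for light 5) is an assumption.

```python
def next_state(s1, s0, switch):
    """Next values of storage elements 1 and 2 (all values are 0 or 1)."""
    # Element 1 becomes 1 only if the switch is on and the state is 01 or 10.
    n1 = switch & (((1 - s1) & s0) | (s1 & (1 - s0)))
    # Element 2 becomes 1 only if the switch is on and the state is 00 or 10.
    n0 = switch & (1 - s0)
    return n1, n0


def lights(s1, s0, switch):
    """Return (lights 1 and 2, lights 3 and 4, light 5) as 0 or 1."""
    lights_12 = switch & (s1 | s0)   # on in states 01, 10, 11 (assumed gated)
    lights_34 = switch & s1          # on in states 10, 11 (assumed gated)
    light_5 = switch & s1 & s0       # on only in state 11 with the switch on
    return lights_12, lights_34, light_5


state = (0, 0)
for cycle in range(5):
    print(cycle, state, lights(*state, switch=1))
    state = next_state(*state, switch=1)
```

With the switch held at 1 the printed states step through 00, 01, 10, 11 and back to 00; with the switch at 0, next_state returns (0, 0) from any state.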
Storage Elements
The last piece of logic needed for the traffic danger sign controller is the logic circuit for the two storage elements shown in Figure 3.32a. Why can’t we use the gated D latch discussed in Section 3.4, one might ask? The reason is as follows: During the current clock cycle the output of the storage element is an internal input to the combinational logic circuit, and the output of the combinational logic circuit is an input to the storage element that must not take effect until the start of the next clock cycle. If we used a gated D latch, the input would take effect immediately and overwrite the value in the storage element, instead of waiting for the start of the next cycle.
To prevent that from happening, a simple logic circuit for implementing the storage element is the master-slave flip-flop. A master-slave flip-flop can be constructed out of two gated D latches, as shown in Figure 3.32c. During the first half of the clock cycle, it is not possible to change the value stored in latch A. Thus, whatever is in latch A is passed to latch B, which is an internal input to the combinational logic circuit. During the second half of the clock cycle, it is not possible to change the value stored in latch B, so the value present during the first half of the clock cycle remains in latch B as the input to the combinational logic circuit for the entire cycle. However, during the second half of the clock cycle, it is possible to change the value stored in latch A. Thus the master-slave flip-flop allows the current state to remain intact for the entire cycle, while the next state is produced by the combinational logic to change latch A during the second half of the cycle so as to be ready to change latch B at the start of the next cycle.
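The two-phase behavior just described can be sketched behaviorally in Python. This is an illustration under assumptions rather than a model of the gates in Figure 3.32c: the class and method names are our own, each latch is reduced to a stored value that changes only when it is enabled, and the two halves of the clock cycle are represented as two separate method calls.

```python
class MasterSlaveFlipFlop:
    """Behavioral sketch of two gated D latches, A feeding B."""

    def __init__(self, value=0):
        self.latch_a = value
        self.latch_b = value

    def first_half(self):
        # Latch A cannot change; its value is passed on to latch B.
        self.latch_b = self.latch_a
        return self.latch_b  # seen by the combinational logic

    def second_half(self, d):
        # Latch B cannot change; latch A takes the new input d.
        self.latch_a = d
        return self.latch_b  # still the value from the first half


ff = MasterSlaveFlipFlop(0)
print(ff.first_half())     # the current state is presented
print(ff.second_half(1))   # next state is captured; the output is unchanged
print(ff.first_half())     # the new state appears at the start of the next cycle
```

The point of the sketch is the middle call: the next-state value has been stored, yet the value fed to the combinational logic has not moved, which is what a single gated D latch could not guarantee.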
3.7 The Data Path of the LC-3
In Chapter 5, we will specify a computer, which we call the LC-3, and you will have the opportunity to write computer programs to execute on it. We close out this chapter with Figure 3.33, which shows a block diagram of what we call the data path of the LC-3 and the finite state machine that controls all the LC-3 actions. The data path consists of all the logic structures that combine to process information in the core of the computer. Right now, Figure 3.33 is undoubtedly more than a little intimidating, and you should not be concerned by that. You are not ready to analyze it yet. That will come in Chapter 5. We have included it here, however, only to show you that you are already familiar with many of the basic structures that make up a computer. That is, you already know how most of the elements in the data path work, and furthermore, you know how those elements are constructed from gates. For example, PC, IR, MAR, and MDR are registers and store 16 bits of information each. Each wire that is labeled with a cross-hatch 16 represents 16 wires, each carrying one bit of information. N, Z, and P are one-bit registers. They could be implemented as master-slave flip-flops. There are five muxes, one supplying a 16-bit value to the PC register, one supplying an address to the MAR, one selecting one of two sources to the B input of the ALU, and two selecting inputs to a 16-bit adder. In Chapter 5, we will see why these elements must be connected as shown in order to execute the programs written for the LC-3 computer. For now, just enjoy the fact that the components look familiar. In Chapters 4 and 5, we will raise the level of abstraction again and put these components together into a working computer.
Figure 3.33 The data path of the LC-3 computer
Exercises
3.1 In the following table, write whether each type of transistor will act as an open circuit or a closed circuit.
            n-type    p-type
Gate = 1
Gate = 0
3.2 Replace the missing parts in the circuit below with either a wire or no wire to give the output OUT a logical value of 0 when the input IN is a logical 1.
(Circuit diagram: IN = 1, OUT = 0)
3.3 A two-input AND and a two-input OR are both examples of two-input logic functions. How many different two-input logic functions are possible?
3.4 Replace the missing parts in the circuit below with either a wire or no wire to give the output C a logical value of 1. Describe a set of inputs that give the output C a logical value of 0. Replace the missing parts with wires or no wires corresponding to that set of inputs.
(Circuit diagram: inputs A and B, two p-type and two n-type transistors, output C)
3.5 Complete a truth table for the transistor-level circuit in Figure 3.34.
Figure 3.34 Diagram for Exercise 3.5
3.6 For the transistor-level circuit in Figure 3.35, fill in the truth table. What is Z in terms of A and B?
A B C D Z
[Figure 3.35 Diagram for Exercise 3.6]
3.7 The circuit below has a major flaw. Can you identify it? Hint: Evaluate the circuit for all sets of inputs.
[Circuit diagram: inputs A and B, output OUT]
3.8 The transistor-level circuit below implements the logic equation given below. Label the inputs to all the transistors.
Y = NOT (A AND (B OR C))
[Transistor-level circuit diagram]
3.9 Fill in the truth table for the logical expression NOT(NOT(A) OR NOT(B)). What single logic gate has the same truth table?
A B NOT(NOT(A) OR NOT(B))
0 0
0 1
1 0
1 1
3.10 Fill in the truth table for a two-input NOR gate.
A B A NOR B
0 0
0 1
1 0
1 1
3.11 a. Draw a transistor-level diagram for a three-input AND gate and a three-input OR gate. Do this by extending the designs from Figures 3.6a and 3.7a.
b. Replace the transistors in your diagrams from part a with either a wire or no wire to reflect the circuit’s operation when the following inputs are applied.
(1) A = 1, B = 0, C = 0
(2) A = 0, B = 0, C = 0
(3) A = 1, B = 1, C = 1
3.12 Following the example of Figure 3.11a, draw the gate-level schematic of a three-input decoder. For each output of this decoder, write the input conditions under which that output will be 1.
3.13 How many output lines will a five-input decoder have?
3.14 How many output lines will a 16-input multiplexer have? How many select lines will this multiplexer have?
3.15 If A and B are four-bit unsigned binary numbers, 0111 and 1011, complete the table obtained when using a two-bit full adder from Figure 3.15 to calculate each bit of the sum, S, of A and B. Check your answer by adding the decimal value of A and B and comparing the sum with S. Are the answers the same? Why or why not?
Cin           0
A       0 1 1 1
B       1 0 1 1
S
Cout
3.16 Given the following truth table, generate the gate-level logic circuit, using the implementation algorithm referred to in Section 3.3.4.
A B C Z
0 0 0 1
0 0 1 0
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 0
3.17 a. Given four inputs, A, B, C, and D and one output, Z, create a truth table for a circuit with at least seven input combinations generating 1s at the output. (How many rows will this truth table have?)
b. Now that you have a truth table, generate the gate-level logic circuit that implements this truth table. Use the implementation algorithm referred to in Section 3.3.4.
3.18 Implement the following functions using AND, OR, and NOT logic gates. The inputs are A, B, and the output is F.
a. F has the value 1 only if A has the value 0 and B has the value 1.
b. F has the value 1 only if A has the value 1 and B has the value 0.
c. Use your answers from (a) and (b) to implement a 1-bit adder. The truth table for the 1-bit adder is given below.
A B Sum
0 0 0
0 1 1
1 0 1
1 1 0
d. Is it possible to create a 4-bit adder (a circuit that will correctly add two 4-bit quantities) using only four copies of the logic diagram from (c)? If not, what information is missing? Hint: When A = 1 and B = 1, a sum of 0 is produced. What information is lost?
3.19 Logic circuit 1 in Figure 3.36 has inputs A, B, C. Logic circuit 2 in Figure 3.37 has inputs A and B. Both logic circuits have an output D. There is a fundamental difference between the behavioral characteristics of these two circuits. What is it? Hint: What happens when the voltage at input A goes from 0 to 1 in both circuits?
[Figure 3.36 Logic circuit 1 for Exercise 3.19]
[Figure 3.37 Logic circuit 2 for Exercise 3.19]
3.20 Generate the gate-level logic that implements the following truth table. From the gate-level structure, generate a transistor diagram that implements the logic structure. Verify that the transistor diagram implements the truth table.
in0 in1 f(in0, in1)
0   0   1
0   1   0
1   0   1
1   1   1
3.21 You know a byte is 8 bits. We call a 4-bit quantity a nibble. If a byte-addressable memory has a 14-bit address, how many nibbles of storage are in this memory?
3.22 Implement a 4-to-1 mux using only 2-to-1 muxes, making sure to properly connect all of the terminals. Remember that you will have 4 inputs, 2 control signals, and 1 output. Write out the truth table for this circuit.
3.23 Given the logic circuit in Figure 3.38, fill in the truth table for the output value Z.
[Figure 3.38 Diagram for Exercise 3.23]
A B C Z
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
[Figure 3.39 Diagram for Exercise 3.24: four full adders with inputs A3 B3 C3 through A0 B0 C0, carry-ins, a wire X, and sum outputs S3 S2 S1 S0]
3.24 a. Figure 3.39 shows a logic circuit that appears in many of today’s processors. Each of the boxes is a full-adder circuit. What does the value on the wire X do? That is, what is the difference in the output of this circuit if X = 0 versus if X = 1?
b. Construct a logic diagram that implements an adder/subtracter. That is, the logic circuit will compute A + B or A - B depending on the value of X. Hint: Use the logic diagram of Figure 3.39 as a building block.
3.25 Say the speed of a logic structure depends on the largest number of logic gates through which any of the inputs must propagate to reach an output. Assume that a NOT, an AND, and an OR gate all count as one gate delay. For example, the propagation delay for a two-input decoder shown in Figure 3.11 is 2 because some inputs propagate through two gates.
a. What is the propagation delay for the two-input mux shown in Figure 3.12?
b. What is the propagation delay for the 1-bit full adder in Figure 3.15?
c. What is the propagation delay for the 4-bit adder shown in Figure 3.16?
d. What if the 4-bit adder were extended to 32 bits?
3.26 Recall that the adder was built with individual “slices” that produced a sum bit and carryout bit based on the two operand bits A and B and the carryin bit. We called such an element a full adder. Suppose we have a 3-to-8 decoder and two six-input OR gates, as shown below. Can we connect them so that we have a full adder? If so, please do. (Hint: If an input to an OR gate is not needed, we can simply put an input 0 on it and it will have no effect on anything. For example, see the figure below.)
[Figure: a 3-to-8 decoder with outputs 000 through 111 and two six-input OR gates]
3.27 For this question, refer to the figure below.
[Figure: logic circuit with input A, select line S, and output Z]
a. Describe the output of this logic circuit when the select line S is a logical 0. That is, what is the output Z for each value of A?
b. If the select line S is switched from a logical 0 to 1, what will the output be?
c. Is this logic circuit a storage element?
3.28 Having designed a binary adder, you are now ready to design a 2-bit by 2-bit unsigned binary multiplier. The multiplier takes two 2-bit inputs A[1:0] and B[1:0] and produces an output Y which is the product of A[1:0] and B[1:0]. The standard notation for this is:
Y = A[1:0] · B[1:0]
a. What is the maximum value that can be represented in 2 bits for A (A[1:0])?
b. What is the maximum value that can be represented in 2 bits for B (B[1:0])?
c. What is the maximum possible value of Y?
d. What is the number of required bits to represent the maximum value of Y?
e. Write a truth table for the multiplier described above. You will have a four-input truth table with the inputs being A[1], A[0], B[1], and B[0].
f. Implement the third bit of output, Y[2], from the truth table using only AND, OR, and NOT gates.
3.29 A 16-bit register contains a value. The value x75A2 is written into it. Can the original value be recovered?
3.30 A comparator circuit has two 1-bit inputs A and B and three 1-bit outputs G (greater), E (equal), and L (less than). Refer to Figures 3.40 and 3.41 for this problem.
G is 1 if A > B, 0 otherwise
E is 1 if A = B, 0 otherwise
L is 1 if A < B, 0 otherwise
[Figure 3.40 Diagram for Exercise 3.30]
[Figure 3.41 Diagram for Exercise 3.30]
a. Draw the truth table for a 1-bit comparator.
A B G E L
0 0
0 1
1 0
1 1
b. Implement G, E, and L using AND, OR, and NOT gates.
c. Using the 1-bit comparator as a basic building block, construct a four-bit equality checker, such that output EQUAL is 1 if A[3:0] = B[3:0], 0 otherwise.
3.31 If a computer has eight-byte addressability and needs three bits to access a location in memory, what is the total size of memory in bytes?
3.32 Distinguish between a memory address and the memory’s addressability.
3.33 Using Figure 3.21, the diagram of the 4-entry, 2^2-by-3-bit memory.
a. To read from the fourth memory location, what must the values of A[1:0] and WE be?
b. To change the number of entries in the memory from 4 to 60, how many address lines would be needed? What would the addressability of the memory be after this change was made?
c. Suppose the minimum width (in bits) of the program counter (the program counter is a special register within a CPU, and we will discuss it in detail in the next chapter) is the minimum number of bits needed to address all 60 locations in our memory from part (b). How many additional memory locations could be added to this memory without having to alter the width of the program counter?
3.34 For the memory shown in Figure 3.42:
a. What is the address space?
b. What is the addressability?
c. What is the data at address 2?
[Figure 3.42 Diagram for Exercise 3.34: memory with inputs A[1], A[0], WE, Di[3], Di[2], Di[1], Di[0] and outputs D[3], D[2], D[1], D[0]]
3.35 Given a memory that is addressed by 22 bits and is 3-bit addressable, how many bits of storage does the memory contain?
3.36 A combinational logic circuit has two inputs. The values of those two inputs during the past ten cycles were 01, 10, 11, 01, 10, 11, 01, 10, 11, and 01. The values of these two inputs during the current cycle are 10. Explain the effect on the current output due to the values of the inputs during the previous ten cycles.
3.37 In the case of the lock of Figure 3.24a, there are four states A, B, C, and D, as described in Section 3.6.2. Either the lock is open (State D), or if it is not open, we have already performed either zero (State A), one (State B), or two (State C) correct operations. This is the sum total of all possible states that can exist. Exercise: Why is that the case? That is, what would be the snapshot of a fifth state that describes a possible situation for the combination lock?
3.38 Recall Section 3.6.2. Can one have an arc from a state where the score is Texas 30, Oklahoma 28 to a state where the score is tied, Texas 30, Oklahoma 30? Draw an example of the scoreboards (like the one in Figure 3.25) for the two states.
3.39 Recall again Section 3.6.2. Is it possible to have two states, one where Texas is ahead 30-28 and the other where the score is tied 30-30, but no arc between the two? Draw an example of two scoreboards, one where the score is 30-28 and the other where the score is 30-30, but there can be no arc between the two. For each of the three output values, game in progress, Texas wins, Oklahoma wins, draw an example of a scoreboard that corresponds to a state that would produce that output.
3.40 Refer to Section 3.6.2. Draw a partial finite state machine for the game of tic-tac-toe.
3.41 The IEEE campus society office sells sodas for 35 cents. Suppose they install a soda controller that only takes the following three inputs: nickel, dime, and quarter. After you put in each coin, you push a pushbutton to register the coin. If at least 35 cents has been put in the controller, it will output a soda and proper change (if applicable). Draw a finite state machine that describes the behavior of the soda controller. Each state will represent how much money has been put in (Hint: There will be seven of these states). Once enough money has been put in, the controller will go to a final state where the person will receive a soda and proper change (Hint: There are five such final states). From the final state, the next coin that is put in will start the process again.
3.42 Refer to Figure 3.32b. Why are lights 1 and 2 controlled by the output of the OR gate labeled Z? Why is the next state of storage element 2 controlled by the output of the OR gate labeled U?
3.43 Shown in Figure 3.43 is an implementation of a finite state machine with an input X and output Z.
[Figure 3.43 Diagram for Exercise 3.43: input X, storage elements S0/D0 and S1/D1, and Clock]
a. Complete the rest of the following table. S1, S0 specifies the present state. D1, D0 specifies the next state.
S1 S0 X D1 D0 Z
0  0  0
0  0  1
0  1  0
0  1  1
1  0  0
1  0  1
1  1  0
1  1  1
b. Draw the state diagram for the truth table from part a.
3.44 Prove that the NAND gate, by itself, is logically complete (see Section 3.3.5) by constructing a logic circuit that performs the AND function, a logic circuit that performs the NOT function, and a logic circuit that performs the OR function. Use only NAND gates in these three logic circuits.
chapter
4 The von Neumann Model
We are now ready to raise our level of abstraction another notch. We will build on the logic structures that we studied in Chapter 3, both decision elements and storage elements, to construct the basic computer model first proposed by John von Neumann in 1946.
4.1 Basic Components
To get a task done by a computer, we need two things: a computer program that specifies what the computer must do to complete the task, and the computer itself that is to carry out the task.
A computer program consists of a set of instructions, each specifying a well-defined piece of work for the computer to carry out. The instruction is the smallest piece of work specified in a computer program. That is, the computer either carries out the work specified by an instruction or it does not. The computer does not have the luxury of carrying out a piece of an instruction.
John von Neumann proposed a fundamental model of a computer for processing computer programs in 1946. Figure 4.1 shows its basic components. We have taken a little poetic license and added a few of our own minor embellishments to von Neumann’s original diagram. The von Neumann model consists of five parts: memory, a processing unit, input, output, and a control unit. The computer program is contained in the computer’s memory. The control of the order in which the instructions are carried out is performed by the control unit.
We will describe each of the five parts of the von Neumann model.
[Figure 4.1 The von Neumann model, overall block diagram: MEMORY (MAR, MDR), PROCESSING UNIT (ALU, TEMP), CONTROL UNIT (PC, IR), INPUT (keyboard, mouse, scanner, card reader, disk), OUTPUT (monitor, printer, LED, disk)]
4.1.1 Memory
Recall that in Chapter 3 we examined a simple 2^2-by-3-bit memory that was constructed out of gates and latches. A more realistic memory for one of today’s computer systems is 2^28 by 8 bits. That is, a typical memory in today’s world of computers consists of 2^28 distinct memory locations, each of which is capable of storing 8 bits of information. We say that such a memory has an address space of 2^28 uniquely identifiable locations, and an addressability of 8 bits. We refer to such a memory as a 256-megabyte memory (abbreviated, 256 MB). The “256 mega” refers to the 2^28 locations, and the “byte” refers to the 8 bits stored in each location. The term byte is, by definition, the word used to describe 8 bits, much the way gallon describes four quarts.
We note (as we will note again and again) that with k bits, we can represent uniquely 2^k items. Thus, to uniquely identify 2^28 memory locations, each location must have its own 28-bit address. In Chapter 5, we will begin the complete definition of the instruction set architecture (ISA) of the LC-3 computer. We will see that the memory address space of the LC-3 is 2^16, and the addressability is 16 bits.
Recall from Chapter 3 that we access memory by providing the address from which we wish to read, or to which we wish to write. To read the contents of a memory location, we first place the address of that location in the memory’s address register (MAR), and then interrogate the computer’s memory. The information stored in the location having that address will be placed in the memory’s data register (MDR). To write (or store) a value in a memory location, we first write the address of the memory location in the MAR, and the value to be stored in the MDR. We then interrogate the computer’s memory with the Write Enable signal asserted. The information contained in the MDR will be written into the memory location whose address is in the MAR.
000
001
010
011
100  00000110
101
110  00000100
111
Figure 4.2 Location 6 contains the value 4; location 4 contains the value 6
Before we leave the notion of memory for the moment, let us again emphasize the two characteristics of a memory location: its address and what is stored there. Figure 4.2 shows a representation of a memory consisting of eight locations. Its addresses are shown at the left, numbered in binary from 0 to 7. Each location contains 8 bits of information. Note that the value 6 is stored in the memory location whose address is 4, and the value 4 is stored in the memory location whose address is 6. These represent two very different situations.
Finally, an analogy comes to mind: the post office boxes in your local post office. The box number is like the memory location’s address. Each box number is unique. The information stored in the memory location is like the letters contained in the post office box. As time goes by, what is contained in the post office box at any particular moment can change. But the box number remains the same. So, too, with each memory location. The value stored in that location can be changed, but the location’s memory address remains unchanged.
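The read and write sequences described in this section can be sketched in a few lines of Python. This is only an illustrative model, not a description of any real memory: the class name `Memory`, the method `interrogate`, and the `write_enable` parameter are names we have made up for the sketch. It loads the eight-location memory of Figure 4.2 and keeps the address of a location separate from the value stored there.

```python
class Memory:
    """Toy memory model with an address register (MAR) and a data register (MDR).

    The class and method names are hypothetical, chosen for this sketch only.
    """

    def __init__(self, address_bits, addressability):
        self.address_bits = address_bits      # k address bits give 2**k locations
        self.addressability = addressability  # number of bits stored per location
        self.cells = [0] * (2 ** address_bits)
        self.mar = 0
        self.mdr = 0

    def interrogate(self, write_enable=False):
        """Access the location whose address is in MAR.

        With write_enable asserted, MDR is stored into that location;
        otherwise the contents of the location are copied into MDR.
        """
        if write_enable:
            self.cells[self.mar] = self.mdr & (2 ** self.addressability - 1)
        else:
            self.mdr = self.cells[self.mar]

    def read(self, address):
        self.mar = address            # place the address in MAR
        self.interrogate()            # memory places the contents in MDR
        return self.mdr

    def write(self, address, value):
        self.mar = address            # address of the location in MAR
        self.mdr = value              # value to be stored in MDR
        self.interrogate(write_enable=True)


# The memory of Figure 4.2: 3 address bits (8 locations), 8 bits per location.
memory = Memory(address_bits=3, addressability=8)
memory.write(0b100, 6)   # location 4 contains the value 6
memory.write(0b110, 4)   # location 6 contains the value 4

value_at_4 = memory.read(0b100)
value_at_6 = memory.read(0b110)
total_bits = len(memory.cells) * memory.addressability
```

The same arithmetic scales to the larger memories mentioned above: the number of locations is 2 raised to the number of address bits, and the total storage in bits is that count multiplied by the addressability.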
4.1.2 Processing Unit
The actual processing of information in the computer is carried out by the processing unit. The processing unit in a modern computer can consist of many sophisticated complex functional units, each performing one particular operation (divide, square root, etc.). The simplest processing unit, and the one normally thought of when discussing the basic von Neumann model, is the ALU. ALU is the abbreviation for Arithmetic and Logic Unit, so called because it is usually capable of performing basic arithmetic functions (like ADD and SUBTRACT) and basic logic operations (like bit-wise AND, OR, and NOT that we have already studied in Chapter 2). As we will see in Chapter 5, the LC-3 has an ALU, which can perform ADD, AND, and NOT operations.
The size of the quantities normally processed by the ALU is often referred to as the word length of the computer, and each element is referred to as a word. In the LC-3, the ALU processes 16-bit quantities. We say the LC-3 has a word length of 16 bits. Each ISA has its own word length, depending on the intended use of the computer. Most microprocessors today that are used in PCs or workstations have a word length of either 32 bits (as is the case with Intel’s Pentium IV) or 64 bits (as is the case with Sun’s SPARC-V9 processors and Intel’s Itanium processor). For some applications, like the microprocessors used in pagers, VCRs, and cellular telephones, 8 bits are usually enough. Such microprocessors, we say, have a word length of 8 bits.
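To make the idea of word length concrete, here is a minimal sketch of an ALU that supports the three LC-3 operations named above (ADD, AND, and NOT) on 16-bit words. The function name `alu` and the use of strings to select the operation are assumptions made for this illustration; they are not how the LC-3 hardware selects an operation.

```python
WORD_LENGTH = 16
WORD_MASK = (1 << WORD_LENGTH) - 1   # 0xFFFF keeps only the low 16 bits


def alu(operation, a, b=0):
    """Apply one operation to 16-bit operands and return a 16-bit result.

    The string operation names are hypothetical labels for this sketch.
    """
    if operation == "ADD":
        result = a + b
    elif operation == "AND":
        result = a & b
    elif operation == "NOT":
        result = ~a                  # b is ignored for the one-operand NOT
    else:
        raise ValueError("unsupported operation: " + operation)
    return result & WORD_MASK        # discard anything beyond the word length


sum_result = alu("ADD", 0xFFFF, 0x0001)
and_result = alu("AND", 0x0F0F, 0x00FF)
not_result = alu("NOT", 0x00FF)
```

Because every result is masked to 16 bits, adding 1 to xFFFF wraps around to x0000. A machine with a different word length would behave the same way with a different mask.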
It is almost always the case that a computer provides some small amount of storage very close to the ALU to allow results to be temporarily stored if they will be needed to produce additional results in the near future. For example, if a computer is to calculate (A + B) · C, it could store the result of A + B in memory, and then subsequently read it in order to multiply that result by C. However, the time it takes to access memory is long compared to the time it takes to perform the ADD or MULTIPLY. Almost all computers, therefore, have temporary storage for storing the result of A + B in order to avoid the unnecessarily longer access time that would be necessary when it came time to multiply. The most common form of temporary storage is a set of registers, like the register described in Section 3.4.3. Typically, the size of each register is identical to the size of values processed by the ALU, that is, they each contain one word. The LC-3 has eight registers (R0, R1, … R7), each containing 16 bits. The SPARC-V9 ISA has 32 registers (R0, R1, … R31), each containing 64 bits.
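The benefit of registers in the (A + B) · C example can be seen by counting memory accesses. The sketch below is a rough illustration under simplifying assumptions: memory is modeled as a Python dictionary, the names `product_via_memory`, `product_via_registers`, and `TEMP` are invented for the example, and only accesses to memory are counted.

```python
def product_via_memory(memory):
    """Compute (A + B) * C, parking the intermediate sum in memory."""
    accesses = 0
    a, b = memory["A"], memory["B"]
    accesses += 2                        # read A and B
    memory["TEMP"] = a + b
    accesses += 1                        # write A + B back to memory
    temp, c = memory["TEMP"], memory["C"]
    accesses += 2                        # read the sum again, then read C
    return temp * c, accesses


def product_via_registers(memory):
    """Compute (A + B) * C, keeping the intermediate sum in a register."""
    registers = [0] * 8                  # eight registers, as in the LC-3
    accesses = 0
    registers[0] = memory["A"]
    registers[1] = memory["B"]
    accesses += 2                        # read A and B
    registers[2] = registers[0] + registers[1]   # A + B never leaves the registers
    registers[3] = memory["C"]
    accesses += 1                        # read C
    return registers[2] * registers[3], accesses


slow_result = product_via_memory({"A": 2, "B": 3, "C": 4})
fast_result = product_via_registers({"A": 2, "B": 3, "C": 4})
```

Both versions produce the same product, but the register version touches memory three times instead of five.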
4.1.3 Input and Output
In order for a computer to process information, the information must get into the computer. In order to use the results of that processing, those results must be displayed in some fashion outside the computer. Many devices exist for the purposes of input and output. They are generically referred to in computer jargon as peripherals because they are in some sense accessories to the processing function. Nonetheless, they are no less important.
In the LC-3 we will have the two most basic of input and output devices. For input, we will use the keyboard; for output, we will use the monitor.
There are, of course, many other input and output devices in computer systems today. For input we have among other things the mouse, digital scanners, and floppy disks. For output we have among other things printers, LED displays, and disks. In the old days, much input and output was carried out by punched cards. Fortunately, for those who would have to lug boxes of cards around, the use of punched cards has largely disappeared.
4.1.4 Control Unit
The control unit is like the conductor of an orchestra; it is in charge of making all the other parts play together. As we will see when we describe the step-by-step process of executing a computer program, it is the control unit that keeps track of both where we are within the process of executing the program and where we are in the process of executing each instruction.
To keep track of which instruction is being executed, the control unit has an instruction register to contain that instruction. To keep track of which instruction is to be processed next, the control unit has a register that contains the next instruction’s address. For historical reasons, that register is called the program counter (abbreviated PC), although a better name for it would be the instruction pointer, since the contents of this register are, in some sense, “pointing” to the next instruction to be processed. Curiously, Intel does in fact call that register the instruction pointer, but the simple elegance of that name has not caught on.
4.2 The LC-3: An Example von Neumann Machine
In Chapter 5, we will introduce in detail the LC-3, a simple computer that we will study extensively. We have already shown you its data path in Chapter 3 (Figure 3.33) and identified several of its structures in Section 4.1. In this section, we will pull together all the parts of the LC-3 we need to describe it as a von Neumann computer (see Figure 4.3). We constructed Figure 4.3 by starting with the LC-3’s full data path (Figure 3.33) and removing all elements that are not essential to pointing out the five basic components of the von Neumann model.
Note that there are two kinds of arrowheads in Figure 4.3: filled-in and not-filled-in. Filled-in arrowheads denote data elements that flow along the corresponding paths. Not-filled-in arrowheads denote control signals that control the processing of the data elements. For example, the box labeled ALU in the processing unit processes two 16-bit values and produces a 16-bit result. The two sources and the result are all data, and are designated by filled-in arrowheads. The operation performed on those two 16-bit data elements (it is labeled ALUK) is part of the control and is therefore designated by a not-filled-in arrowhead.
MEMORY consists of the storage elements, along with the MAR for addressing individual locations and the MDR for holding the contents of a memory location on its way to/from the storage. Note that the MAR contains 16 bits, reflecting the fact that the memory address space of the LC-3 is 2^16 memory locations. The MDR contains 16 bits, reflecting the fact that each memory location contains 16 bits; that is, the LC-3 is 16-bit addressable.
INPUT/OUTPUT consists of a keyboard and a monitor. The simplest keyboard requires two registers, a data register (KBDR) for holding the ASCII codes of keys struck, and a status register (KBSR) for maintaining status information about the keys struck. The simplest monitor also requires two registers, one (DDR) for holding the ASCII code of something to be displayed on the screen, and one (DSR) for maintaining associated status information. These input and output registers will be discussed in more detail in Chapter 8.
THE PROCESSING UNIT consists of a functional unit that can perform arithmetic and logic operations (ALU) and eight registers (R0, … R7) for storing temporary values that will be needed in the near future as operands for subsequent instructions. The LC-3 ALU can perform one arithmetic operation (addition) and two logical operations (bitwise AND and bitwise complement).
[Figure 4.3 The LC-3 as an example of the von Neumann model: MEMORY (MAR, MDR), INPUT (KBDR, KBSR), OUTPUT, PROCESSING UNIT (ALU, REG FILE), and CONTROL UNIT (PC, IR, FINITE STATE MACHINE) connected by the PROCESSOR BUS]
THE CONTROL UNIT consists of all the structures needed to manage the processing that is carried out by the computer. Its most important structure is the finite state machine, which directs all the activity. Recall the finite state machines in Section 3.6. Processing is carried out step by step, or rather, clock cycle by clock cycle. Note the CLK input to the finite state machine in Figure 4.3. It specifies how long each clock cycle lasts. The instruction register (IR) is also an input to the finite state machine since what LC-3 instruction is being processed determines what activities must be carried out. The program counter (PC) is also a part of the control unit; it keeps track of the next instruction to be executed after the current instruction finishes.
Note that all the external outputs of the finite state machine in Figure 4.3 have arrowheads that are not filled in. These outputs control the processing throughout the computer. For example, one of these outputs (two bits) is ALUK, which controls the operation performed in the ALU (add, and, or not) during the current clock cycle. Another output is GateALU, which determines whether or not the output of the ALU is provided to the processor bus during the current clock cycle.
The complete description of the data path, control, and finite state machine for one implementation of the LC-3 is the subject of Appendix C.
4.3 Instruction Processing
The central idea in the von Neumann model of computer processing is that the program and data are both stored as sequences of bits in the computer’s memory, and the program is executed one instruction at a time under the direction of the control unit.
4.3.1 The Instruction
The most basic unit of computer processing is the instruction. It is made up of two parts, the opcode (what the instruction does) and the operands (who it is to do it to). In Chapter 5, we will see that each LC-3 instruction consists of 16 bits (one word), numbered from left to right, bit [15] to bit [0]. Bits [15:12] contain the opcode. This means there are at most 2^4 distinct opcodes. Bits [11:0] are used to figure out where the operands are.
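The division of a 16-bit instruction into an opcode field and operand bits can be expressed with a shift and two masks. The sketch below assumes nothing about what any particular opcode means; the function name `split_instruction` and the sample bit pattern are arbitrary choices for illustration.

```python
def split_instruction(instruction):
    """Split a 16-bit instruction into its opcode and its remaining bits."""
    opcode = (instruction >> 12) & 0xF     # bits [15:12]
    operand_bits = instruction & 0x0FFF    # bits [11:0]
    return opcode, operand_bits


# An arbitrary 16-bit pattern, written with the opcode field first.
sample = 0b0001_0110_0100_0010
opcode, operand_bits = split_instruction(sample)
max_opcodes = 2 ** 4                       # four opcode bits allow 16 opcodes
```

For this sample pattern the opcode field is 0001 and the remaining twelve bits are 0110 0100 0010.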
The ADD instruction. The ADD instruction requires [...] operands [...] (the data that is to be a[...]
The LDR instruction. The LDR instruction [...] location [...], read the value that is contained there, and store it in one of the registers. The two operands that are required are the value to be read from memory and the destination register, which will contain that value after the instruction is processed. The R in LDR identifies the mechanism that will be used to calculate the address of the memory location to be read. That mechanism is called the addressing mo[...]
[Figure 4.4 An abbreviated state diagram of the LC-3. FETCH: State 1 (MAR <- PC, PC <- PC + 1), State 2 (MDR <- M[MAR]), State 3 (IR <- MDR). DECODE: State 4. After DECODE: the first state for the LDR instruction, the first state for the JMP instruction (PC <- Register), State 63, and the last states to carry out the ADD and LDR instructions, each returning to state 1.]
the IR to be latched at the end of the clock cycle, concluding the FETCH phase of the instruction.
The DECODE phase takes one cycle. In state 4, using the external input IR, and in particular the opcode bits of the instruction, the finite state machine can go to the appropriate next state for processing instructions depending on the particular opcode in IR[15:12]. Processing continues cycle by cycle until the instruction completes execution, and the next state logic returns the finite state machine to state 1.
As we mentioned earlier in this section, it is sometimes necessary not to execute the next sequential instruction but rather to jump to another location to find the next instruction to execute. As we have said, instructions that change the flow of instruction processing in this way are called control instructions. This can be done very easily by loading the PC during the EXECUTE phase of the control instruction, as in state 63 of Figure 4.4, for example.
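The first four states of Figure 4.4 can be traced with a small state-by-state loop. This is a simplified sketch, not the LC-3 control logic itself: it assumes each state completes in one step, models memory as a Python dictionary, and uses a made-up starting address and instruction word. The function name `fetch_and_decode` is also our own.

```python
def fetch_and_decode(pc, memory):
    """Step through states 1 to 4 and report what the control unit has learned.

    Returns the updated PC, the contents of IR, the opcode from IR[15:12],
    and the list of states visited.
    """
    state = 1
    mar = mdr = ir = 0
    opcode = None
    visited = []
    while opcode is None:
        visited.append(state)
        if state == 1:                    # MAR <- PC, PC <- PC + 1
            mar = pc
            pc = (pc + 1) & 0xFFFF
            state = 2
        elif state == 2:                  # MDR <- M[MAR]
            mdr = memory[mar]
            state = 3
        elif state == 3:                  # IR <- MDR
            ir = mdr
            state = 4
        else:                             # state 4: DECODE on IR[15:12]
            opcode = (ir >> 12) & 0xF
    return pc, ir, opcode, visited


# Hypothetical contents: one instruction word stored at address x3000.
next_pc, ir, opcode, visited = fetch_and_decode(0x3000, {0x3000: 0x1642})
```

After state 4, a full implementation would branch to a different sequence of states for each opcode and eventually return to state 1. A control instruction would simply overwrite the PC in one of those later states, so the next pass through state 1 fetches from the new address.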
Appendix C contains a full description of the implementation of the LC-3, including its full state diagram and data path. We will not go into that level of detail in this chapter. Our objective here is to show you that there is nothing magic about the processing of the instruction cycle, and that a properly completed state diagram would be able to control, clock cycle by clock cycle, all the steps required to execute all the phases of every instruction cycle. Since each instruction cycle ends by returning to state 1, the finite state machine can process, cycle by cycle, a complete computer program.
4.5 Stopping the Computer
From everything we have said, it appears that the computer will continue processing instructions, carrying out the instruction cycle again and again, ad nauseam. Since the computer does not have the capacity to be bored, must this continue until someone pulls the plug and disconnects power to the computer?
Usually, user programs execute under the control of an operating system. UNIX, DOS, MacOS, and Windows NT are all examples of operating systems. Operating systems are just computer programs themselves. So as far as the computer is concerned, the instruction cycle continues whether a user program is being processed or the operating system is being processed. This is fine as far as user programs are concerned since each user program terminates with a control instruction that changes the PC to again start processing the operating system, often to initiate the execution of another user program.
But what if we actually want to stop this potentially infinite sequence of instruction cycles? Recall our analogy to the conductor's baton, beating at the rate of millions of machine cycles per second. Stopping the instruction sequencing requires stopping the conductor's baton. We have pointed out many times that there is, inside the computer, a component that corresponds very closely to the conductor's baton. It is called the clock, and it defines the machine cycle. It enables the finite state machine to continue on to the next machine cycle, whether that machine cycle is the next step of the current phase or the first step of the next phase of the instruction cycle. Stopping the instruction cycle requires stopping the clock.
Figure 4.5a shows a block diagram of the clock circuit, consisting primarily of a clock generator and a RUN latch. The clock generator is a crystal oscillator, a piezoelectric device that you may have studied in your physics or chemistry class. For our purposes, the crystal oscillator is a black box (recall our definition of black box in Section 1.4) that produces the oscillating voltage shown in Figure 4.5b. Note the resemblance of that voltage to the conductor's baton. Every machine cycle, the voltage rises to 2.9 volts and then drops back to 0 volts.
Figure 4.5 The clock circuit and its control. (a) Block diagram: the clock generator and the RUN latch together produce the clock output. (b) Clock generator output versus time: the voltage alternates between 0 volts and 2.9 volts once per machine cycle.
If the RUN latch is in the 1 state (i.e., Q = 1), the output of the clock circuit is the same as the output of the clock generator. If the RUN latch is in the 0 state (i.e., Q = 0), the output of the clock circuit is 0.
Thus, stopping the instruction cycle requires only clearing the RUN latch. Every computer has some mechanism for doing that. In some older machines, it is done by executing a HALT instruction. In the LC-3, as in many other machines, it is done under control of the operating system, as we will see in Chapter 9.
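The gating behavior of the RUN latch can be modeled in a few lines of Python. This is a minimal sketch, not a description of the actual circuit: it assumes the generator output and Q are treated as logic levels 0 and 1 (the 2.9-volt level becomes 1) and that the gating acts as a logical AND, which is consistent with the two cases stated above. The name clock_circuit_output is hypothetical.

```python
def clock_circuit_output(generator_level: int, run_latch_q: int) -> int:
    """Gate the clock generator's output with the RUN latch (logical AND)."""
    return generator_level & run_latch_q

# Three machine cycles of the generator: high (1), then low (0), repeated.
generator = [1, 0, 1, 0, 1, 0]

# RUN latch set (Q = 1): the clock passes through unchanged.
running = [clock_circuit_output(level, 1) for level in generator]

# RUN latch cleared (Q = 0): the output stays at 0, so the instruction
# cycle can no longer advance.
halted = [clock_circuit_output(level, 0) for level in generator]

print(running)  # [1, 0, 1, 0, 1, 0]
print(halted)   # [0, 0, 0, 0, 0, 0]
```

With Q cleared, the finite state machine never sees another clock edge, which is why clearing the RUN latch is enough to stop the computer.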
Question: If a HALT instruction can clear the RUN latch, thereby stopping the instruction cycle, what instruction is needed to set the RUN latch, thereby reinitiating the instruction cycle?
Exercises
4.1 Name the five components of the von Neumann model. For each component, state its purpose.
4.2 Briefly describe the interface between the memory and the processing unit. That is, describe the method by which the memory and the processing unit communicate.
4.3 What is misleading about the name program counter? Why is the name instruction pointer more insightful?
4.4 What is the word length of a computer? How does the word length of a computer affect what the computer is able to compute? That is, is it a valid argument, in light of what you learned in Chapter 1, to say that a computer with a larger word size can process more information and therefore is capable of computing more than a computer with a smaller word size?
4.5 The following table represents a small memory. Refer to this table for the following questions.
Address   Data
0000      0001 1110 0100 0011
0001      1111 0000 0010 0101
0010      0110 1111 0000 0001
0011      0000 0000 0000 0000
0100      0000 0000 0110 0101
0101      0000 0000 0000 0110
0110      1111 1110 1101 0011
0111      0000 0110 1101 1001
a. What binary value does location 3 contain? Location 6?
b. The binary value within each location can be interpreted in many ways. We have seen that binary values can represent unsigned numbers, 2's complement signed numbers, floating point numbers, and so forth.
(1) Interpret location 0 and location 1 as 2's complement integers.
(2) Interpret location 4 as an ASCII value.
(3) Interpret locations 6 and 7 as an IEEE floating point number. Location 6 contains number [15:0]. Location 7 contains number [31:16].
(4) Interpret location 0 and location 1 as unsigned integers.
c. In the von Neumann model, the contents of a memory location can also be an instruction. If the binary pattern in location 0 were interpreted as an instruction, what instruction would it represent?
d. A binary value can also be interpreted as a memory address. Say the value stored in location 5 is a memory address. To which location does it refer? What binary value does that location contain?
4.6 What are the two components of an instruction? What information do these two components contain?
4.7 Suppose a 32-bit instruction takes the following format:
| OPCODE | SR | DR | IMM |
If there are 60 opcodes and 32 registers, what is the range of values that can be represented by the immediate (IMM)? Assume IMM is a 2's complement value.
4.8 Suppose a 32-bit instruction takes the following format:
| OPCODE | DR | SR1 | SR2 | UNUSED |
If there are 225 opcodes and 120 registers,
a. What is the minimum number of bits required to represent the OPCODE?
b. What is the minimum number of bits required to represent the Destination Register (DR)?
c. What is the maximum number of UNUSED bits in the instruction encoding?
4.9 The FETCH phase of the instruction cycle does two important things. One is that it loads the instruction to be processed next into the IR. What is the other important thing?
4.10 Examples 4.1, 4.2, and 4.5 illustrate the processing of the ADD, LDR, and JMP instructions. The PC, IR, MAR, and MDR are written in various phases of the instruction cycle, depending on the opcode of the particular instruction. In each location in the table below, enter the opcodes which write to the corresponding register (row) during the corresponding phase (column) of the instruction cycle.
      | Fetch Instruction | Decode | Evaluate Address | Fetch Data | Execute | Store Result
PC    |
IR    |
MAR   |
MDR   |
4.11 State the phases of the instruction cycle and briefly describe what operations occur in each phase.
4.12 For the instructions ADD, LDR, and JMP, write the operations that occur in each phase of the instruction cycle.
4.13 Say it takes 100 cycles to read from or write to memory and only one cycle to read from or write to a register. Calculate the number of cycles it takes for each phase of the instruction cycle for both the IA-32 instruction "ADD [eax], edx" (refer to Example 4.3) and the LC-3 instruction "ADD R6, R2, R6." Assume each phase (if required) takes one cycle, unless a memory access is required.
4.14 Describe the execution of the JMP instruction if R3 contains x369C (refer to Example 4.5).
4.15 If a HALT instruction can clear the RUN latch, thereby stopping the instruction cycle, what instruction is needed to set the RUN latch, thereby reinitiating the instruction cycle?
4.16 a. If a machine cycle is 2 nanoseconds (i.e., 2 × 10^-9 seconds), how many machine cycles occur each second?
b. If the computer requires on the average eight cycles to process each instruction, and the computer processes instructions one at a time from beginning to end, how many instructions can the computer process in 1 second?
c. Preview of future courses: In today's microprocessors, many features are added to increase the number of instructions processed each second. One such feature is the computer's equivalent of an assembly line. Each phase of the instruction cycle is implemented as one or more separate pieces of logic. Each step in the processing of an instruction picks up where the previous step left off in the previous machine cycle. Using this feature, an instruction can be fetched from memory every machine cycle and handed off at the end of the machine cycle to the decoder, which performs the decoding function during the next machine cycle while the next instruction is being fetched. Ergo, the assembly line. Assuming instructions are located at sequential addresses in memory, and nothing breaks the sequential flow, how many instructions can the microprocessor execute each second if the assembly line is present? (The assembly line is called a pipeline, which you will encounter in your advanced courses. There are many reasons why the assembly line cannot operate at its maximum rate, a topic you will consider at length in some of these courses.)
chapter 5
The LC-3
In Chapter 4, we discussed the basic components of a computer: its memory, its processing unit, including the associated temporary storage (usually a set of registers), input and output devices, and the control unit that directs the activity of all the units (including itself!). We also studied the six phases of the instruction cycle: FETCH, DECODE, ADDRESS EVALUATION, OPERAND FETCH, EXECUTE, and STORE RESULT. We are now ready to introduce a "real" computer, the LC-3. To be more nearly exact, we are ready to introduce the instruction set architecture (ISA) of the LC-3. We have already teased you with a few facts about the LC-3 and a few of its instructions. Now we will examine the ISA of the LC-3 in a more comprehensive way.
Recall from Chapter 1 that the ISA is the interface between what the software commands and what the hardware actually carries out. In this chapter and in Chapters 8 and 9, we will point out the important features of the ISA of the LC-3. You will need these features to write programs in the LC-3's own language, that is, in the LC-3's machine language.
A complete description of the ISA of the LC-3 is contained in Appendix A.
5.1 The ISA: Overview
The ISA specifies all the information about the computer that the software has to be aware of. In other words, the ISA specifies everything in the computer that is available to a programmer when he/she writes programs in the computer's own machine language. Thus, the ISA also specifies everything in the computer that is available to someone who wishes to translate programs written in a high-level language like C or Pascal or Fortran or COBOL into the machine language of the computer.
The ISA specifies the memory organization, register set, and instruction set, including opcodes, data types, and addressing modes.
5.1.1 Memory Organization
The LC-3 memory has an address space of 2^16 (i.e., 65,536) locations, and an addressability of 16 bits. Not all 65,536 addresses are actually used for memory locations, but we will leave that discussion for Chapter 8. Since the normal unit of data that is processed in the LC-3 is 16 bits, we refer to 16 bits as one word, and we say the LC-3 is word-addressable.
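The arithmetic behind these numbers can be checked directly. The following is a small sketch; the constant names are hypothetical, and the total is an upper bound because, as noted above, not every address corresponds to a memory location.

```python
ADDRESS_BITS = 16      # width of an LC-3 address
ADDRESSABILITY = 16    # bits stored at each location (one word)

locations = 2 ** ADDRESS_BITS              # size of the address space
total_bits = locations * ADDRESSABILITY    # upper bound on storage, in bits
highest_address = locations - 1

print(locations)             # 65536
print(total_bits)            # 1048576
print(hex(highest_address))  # 0xffff
```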
5.1.2 Registers
Since it usually takes far more than one machine cycle to obtain data from memory, the LC-3 provides (like almost all computers) additional temporary storage locations that can be accessed in a single machine cycle.
The most common type of temporary storage locations and the one used in the LC-3 is the general purpose register set. Each register in the set is called a general purpose register (GPR). Registers have the same property as memory locations in that they are used to store information that can be retrieved later. The number of bits stored in each register is usually one word. In the LC-3, this means 16 bits.
Registers must be uniquely identifiable. The LC-3 specifies eight GPRs, each identified by a 3-bit register number. They are referred to as R0, R1, ..., R7. Figure 5.1 shows a snapshot of the LC-3's register set, sometimes called a register file, with the eight values 1, 3, 5, 7, -2, -4, -6, and -8 stored in R0, ..., R7, respectively.
Figure 5.1 The register file before the ADD instruction
Register 0 (R0)   0000000000000001
Register 1 (R1)   0000000000000011
Register 2 (R2)   0000000000000101
Register 3 (R3)   0000000000000111
Register 4 (R4)   1111111111111110
Register 5 (R5)   1111111111111100
Register 6 (R6)   1111111111111010
Register 7 (R7)   1111111111111000
Figure 5.2 The register file after the ADD instruction
Register 0 (R0)   0000000000000001
Register 1 (R1)   0000000000000011
Register 2 (R2)   0000000000000100
Register 3 (R3)   0000000000000111
Register 4 (R4)   1111111111111110
Register 5 (R5)   1111111111111100
Register 6 (R6)   1111111111111010
Register 7 (R7)   1111111111111000
Recall that the instruction to ADD the contents of R0 to R1 and store the result in R2 is specified as
Bits:    [15:12]  [11:9]  [8:6]  [5]  [4:3]  [2:0]
Value:   0001     010     000    0    00     001
Field:   ADD      R2      R0                 R1
where the two sources of the ADD instruction are specified in bits [8:6] and bits [2:0]. The destination of the ADD result is specified in bits [11:9]. Figure 5.2 shows the contents of the register file of Figure 5.1 AFTER the instruction ADD R2, R1, R0 is executed.
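The field layout just described can be exercised with a short Python sketch. It assumes the encoding 0001 010 000 0 00 001 (destination R2, sources R0 and R1) and the register contents of Figure 5.1; the helper name field is hypothetical, and the 16-bit masking stands in for the fixed width of the hardware registers.

```python
def field(instruction: int, high: int, low: int) -> int:
    """Return bits [high:low] of a 16-bit instruction as an unsigned value."""
    width = high - low + 1
    return (instruction >> low) & ((1 << width) - 1)

# Register file of Figure 5.1: R0..R7 hold 1, 3, 5, 7, -2, -4, -6, -8,
# stored as 16-bit 2's complement patterns.
regs = [value & 0xFFFF for value in (1, 3, 5, 7, -2, -4, -6, -8)]

instruction = 0b0001_010_000_0_00_001   # ADD, DR = R2, sources R0 and R1

opcode = field(instruction, 15, 12)
dr = field(instruction, 11, 9)
sr1 = field(instruction, 8, 6)
sr2 = field(instruction, 2, 0)

# Register-mode ADD: 2's complement addition, truncated to 16 bits.
regs[dr] = (regs[sr1] + regs[sr2]) & 0xFFFF

print(f"opcode={opcode:04b} DR=R{dr} SR1=R{sr1} SR2=R{sr2}")
print(f"R2 = {regs[2]:016b}")   # R2 = 0000000000000100
```

Only R2 changes; the other seven registers keep the values shown in Figure 5.1.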
5.1.3 The Instruction Set
An instruction is made up of two things, its opcode (what the instruction is asking the computer to do) and its operands (who the computer is expected to do it to). The instruction set of an ISA is defined by its set of opcodes, data types, and addressing modes. The addressing modes determine where the operands are located.
You have just seen an example of one opcode (ADD) and one addressing mode (register mode). The operation the instruction is asking the computer to perform is 2's complement integer addition, and the locations where the computer is expected to find the operands are the general purpose registers.
5.1.4 Opcodes
Some ISAs have a very large set of opcodes, one for each of a large number of tasks that a program may wish to carry out. Other ISAs have a very small set of opcodes. Some ISAs have specific opcodes to help with processing scientific calculations. For example, the Hewlett Packard Precision Architecture has an instruction that performs a multiply, followed by an add, (A × B) + C, on three source operands. Other ISAs have instructions that process video images obtained from the World Wide Web. The Intel x86 ISA added a number of instructions Intel calls MMX instructions because they eXtend the ISA to assist with MultiMedia applications that use the Web. Still other ISAs have specific opcodes to help with handling the tasks of the operating system. For example, the VAX architecture, popular in the 1980s, had an opcode to save all the information associated with one program that was running prior to switching to another program. Almost all computers prefer to use a long sequence of instructions to ask the computer to carry out the task of saving all that information. Although that sounds counterintuitive, there is a rationale for it. Unfortunately, the topic will have to wait for a later semester. The decision as to which instructions to include or leave out of an ISA is usually a hotly debated topic in a company when a new ISA is being specified.
The LC-3 ISA has 15 instructions, each identified by its unique opcode. The opcode is specified by bits [15:12] of the instruction. Since four bits are used to specify the opcode, 16 distinct opcodes are possible. However, the LC-3 ISA specifies only 15 opcodes. The code 1101 has been left unspecified, reserved for some future need that we are not able to anticipate today.
There are three different types of instructions, which means three different types of opcodes: operates, data movement, and control. Operate instructions process information. Data movement instructions move information between memory and the registers and between registers/memory and input/output devices. Control instructions change the sequence of instructions that will be executed. That is, they enable the execution of an instruction other than the one that is stored in the next sequential location in memory.
Figure 5.3 lists all the instructions of the LC-3, the bit encoding [15:12] for each opcode, and the format of each instruction. The use of these formats will be further explained in Sections 5.2, 5.3, and 5.4.
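Extracting the opcode and sorting it into one of the three instruction types can be sketched as follows. The opcode values for the operate and data movement instructions are the ones quoted in Sections 5.2 and 5.3; the function names are hypothetical, and treating every remaining specified opcode as a control instruction is an assumption of this sketch.

```python
OPERATE = {0b0001: "ADD", 0b0101: "AND", 0b1001: "NOT"}
DATA_MOVEMENT = {
    0b0010: "LD", 0b0011: "ST", 0b0110: "LDR", 0b0111: "STR",
    0b1010: "LDI", 0b1011: "STI", 0b1110: "LEA",
}
RESERVED = 0b1101

def opcode_of(instruction: int) -> int:
    """Return bits [15:12] of a 16-bit instruction."""
    return (instruction >> 12) & 0xF

def instruction_type(instruction: int) -> str:
    opcode = opcode_of(instruction)
    if opcode in OPERATE:
        return "operate"
    if opcode in DATA_MOVEMENT:
        return "data movement"
    if opcode == RESERVED:
        return "reserved"
    # Assumption: the remaining opcodes are all control instructions.
    return "control"

# Four opcode bits give 16 codes; one is reserved, leaving 15.
specified = [code for code in range(16) if code != RESERVED]

print(len(specified))                         # 15
print(instruction_type(0b0001010000000001))   # operate
print(instruction_type(0b0010010110101111))   # data movement
```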
5.1.5 Data Types
A data type is a representation of information such that the ISA has opcodes that operate on that representation. There are many ways to represent the same information in a computer. That should not surprise us. In our daily lives, we regularly represent the same information in many different ways. For example, a child, when asked how old he is, might hold up three fingers, signifying he is 3 years old. If the child is particularly precocious, he might write the decimal digit 3 to indicate his age. Or, if he is a CS or CE major at the university, he might write 0000000000000011, the 16-bit binary representation for 3. If he is a chemistry major, he might write 3.0 × 10^0. All four represent the same entity: 3.
If the ISA has an opcode that operates on information represented by a data type, then we say the ISA supports that data type. In Chapter 2, we introduced the only data type supported by the ISA of the LC-3: 2's complement integers.
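Since 16-bit 2's complement integers are the only supported data type, it helps to be able to move between a bit pattern and the integer it represents. The following is a minimal sketch; the function names to_signed16 and to_bits16 are hypothetical.

```python
def to_signed16(pattern: int) -> int:
    """Interpret a 16-bit pattern as a 2's complement integer."""
    pattern &= 0xFFFF
    # Bit 15 set means the value is negative: subtract 2^16.
    return pattern - 0x10000 if pattern & 0x8000 else pattern

def to_bits16(value: int) -> int:
    """Return the 16-bit 2's complement pattern for an integer."""
    return value & 0xFFFF

print(to_signed16(0b0000000000000011))   # 3
print(to_signed16(0b1111111111111110))   # -2
print(f"{to_bits16(-8):016b}")           # 1111111111111000
```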
5.1.6 Addressing Modes
An addressing mode is a mechanism for specifying where the operand is located. An operand can generally be found in one of three places: in memory, in a register, or as a part of the instruction. If the operand is a part of the instruction, we refer to it as a literal or as an immediate operand. The term literal comes from the fact that the bits of the instruction literally form the operand. The term immediate comes from the fact that we have the operand immediately, that is, we don't have to look elsewhere for it.
Figure 5.3 Formats of the entire LC-3 instruction set (ADD+, AND+, BR, JMP, JSR, JSRR, LD+, LDI+, LDR+, LEA+, NOT+, RET, RTI, ST, STI, STR, TRAP, and the reserved opcode 1101). NOTE: + indicates instructions that modify condition codes
The LC-3 supports five addressing modes: immediate (or literal), register, and three memory addressing modes: PC-relative, indirect, and Base+offset. We will see in Section 5.2 that operate instructions use two addressing modes: register and immediate. We will see in Section 5.3 that data movement instructions use all five modes.
5.1.7 Condition Codes
One final item will complete our overview of the ISA of the LC-3: condition codes. Almost all ISAs allow the instruction sequencing to change on the basis of a previously generated result. The LC-3 has three single-bit registers that are set (set to 1) or cleared (set to 0) each time one of the eight general purpose registers is written. The three single-bit registers are called N, Z, and P, corresponding to their meaning: negative, zero, and positive. Each time a GPR is written, the N, Z, and P registers are individually set to 0 or 1, corresponding to whether the result written to the GPR is negative, zero, or positive. That is, if the result is negative, the N register is set, and Z and P are cleared. If the result is zero, Z is set and N and P are cleared. Finally, if the result is positive, P is set and N and Z are cleared.
Each of the three single-bit registers is referred to as a condition code because the condition of that bit can be used by one of the control instructions to change the execution sequence. The x86 and SPARC are two examples of ISAs that use condition codes to do this. We show how the LC-3 does it in Section 5.4.
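The rule for setting N, Z, and P can be written out as a short Python sketch. It assumes the value written to the GPR is available as a 16-bit pattern; the function name condition_codes and the dictionary layout are hypothetical.

```python
def condition_codes(result: int) -> dict:
    """Return N, Z, P for a 16-bit value just written to a GPR."""
    result &= 0xFFFF
    n = 1 if result & 0x8000 else 0   # bit 15 set: negative
    z = 1 if result == 0 else 0       # all bits clear: zero
    p = 1 if not n and not z else 0   # otherwise: positive
    return {"N": n, "Z": z, "P": p}

print(condition_codes(0b1111111111110100))   # {'N': 1, 'Z': 0, 'P': 0}
print(condition_codes(0))                    # {'N': 0, 'Z': 1, 'P': 0}
print(condition_codes(5))                    # {'N': 0, 'Z': 0, 'P': 1}
```

Exactly one of the three bits is set after every write, matching the three cases described above.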
5.2 Operate Instructions
Operate instructions process data. Arithmetic operations (like ADD, SUB, MUL, and DIV) and logical operations (like AND, OR, NOT, XOR) are common examples. The LC-3 has three operate instructions: ADD, AND, and NOT.
The NOT (opcode = 1001) instruction is the only operate instruction that performs a unary operation, that is, the operation requires one source operand. The NOT instruction bit-wise complements a 16-bit source operand and stores the result of this operation in a destination. NOT uses the register addressing mode for both its source and destination. Bits [8:6] specify the source register and bits [11:9] specify the destination register. Bits [5:0] must contain all 1s.
If R5 initially contains 0101000011110000, after executing the following instruction:
Bits:    [15:12]  [11:9]  [8:6]  [5:0]
Value:   1001     011     101    111111
Field:   NOT      R3      R5
R3 will contain 1010111100001111.
Figure 5.4 Data path relevant to the execution of NOT R3, R5. R5 (0101000011110000) drives the A input of the ALU; the 16-bit ALU output (1010111100001111) is written to R3.
Figure 5.4 shows the key parts of the data path that are used to perform the NOT instruction shown here. Since NOT is a unary operation, only the A input of the ALU is relevant. It is sourced from R5. The control signal to the ALU directs the ALU to perform the bit-wise complement operation. The output of the ALU (the result of the operation) is stored into R3.
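The NOT example can be reproduced in Python as a check on the bit fields. This is a sketch under the assumption that the register file is modeled as a list of eight 16-bit integers (the name regs is hypothetical); the mask after the complement keeps the result to 16 bits, since Python integers are not fixed width.

```python
instruction = 0b1001_011_101_111111   # NOT R3, R5

dr = (instruction >> 9) & 0b111       # bits [11:9]: destination register
sr = (instruction >> 6) & 0b111       # bits [8:6]: source register

regs = [0] * 8
regs[5] = 0b0101000011110000

# Bit-wise complement of the source, kept to 16 bits.
regs[dr] = ~regs[sr] & 0xFFFF

print(f"R{dr} = {regs[dr]:016b}")     # R3 = 1010111100001111
```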
The ADD (opcode = 0001) and AND (opcode = 0101) instructions both perform binary operations; they require two 16-bit source operands. The ADD instruction performs a 2's complement addition of its two source operands. The AND instruction performs a bit-wise AND of each pair of bits in its two 16-bit operands. Like the NOT, the ADD and AND use the register addressing mode for one of the source operands and for the destination operand. Bits [8:6] specify the source register and bits [11:9] specify the destination register (where the result will be written).
The second source operand for both ADD and AND instructions can be specified by either register mode or as an immediate operand. Bit [5] determines which is used. If bit [5] is 0, then the second source operand uses a register, and bits [2:0] specify which register. In that case, bits [4:3] are set to 0 to complete the specification of the instruction.
For example, if R4 contains the value 6 and R5 contains the value -18, then after the following instruction is executed
Bits:    [15:12]  [11:9]  [8:6]  [5]  [4:3]  [2:0]
Value:   0001     001     100    0    00     101
Field:   ADD      R1      R4                 R5
R1 will contain the value -12.
If bit [5] is 1, the second source operand is contained within the instruction. In fact, the second source operand is obtained by sign-extending bits [4:0] to 16 bits before performing the ADD or AND. Figure 5.5 shows the key parts of the data path that are used to perform the instruction ADD R1, R4, #-2.
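Both forms of ADD can be sketched with one small Python function that tests bit [5] to choose the second operand. The names sext, execute_add, and signed are hypothetical helpers, and the register file is again assumed to be a list of eight 16-bit integers.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 16-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):     # sign bit of the short field
        value -= 1 << bits
    return value & 0xFFFF

def signed(pattern: int) -> int:
    """Read a 16-bit pattern as a 2's complement integer."""
    return pattern - 0x10000 if pattern & 0x8000 else pattern

def execute_add(instruction: int, regs: list) -> None:
    dr = (instruction >> 9) & 0b111
    sr1 = (instruction >> 6) & 0b111
    if (instruction >> 5) & 1:
        operand2 = sext(instruction & 0b11111, 5)   # immediate mode
    else:
        operand2 = regs[instruction & 0b111]        # register mode
    regs[dr] = (regs[sr1] + operand2) & 0xFFFF

regs = [0] * 8
regs[4] = 6
regs[5] = -18 & 0xFFFF

execute_add(0b0001_001_100_0_00_101, regs)   # ADD R1, R4, R5
print(signed(regs[1]))                       # -12

execute_add(0b0001_001_100_1_11110, regs)    # ADD R1, R4, #-2
print(signed(regs[1]))                       # 4
```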
Since the immediate operand in an ADD or AND instruction must fit in bits [4:0] of the instruction, not all 2's complement integers can be immediate operands. Which integers are OK (i.e., which integers can be used as immediate operands)?
Figure 5.5 Data path relevant to the execution of ADD R1, R4, #-2. The IR holds 0001 001 100 1 11110. R4 (0000000000000110) drives the A input of the ALU; IR[4:0], sign-extended (SEXT) to 1111111111111110, is selected by bit [5] as the B input; the 16-bit sum (0000000000000100) is written to R1.
Example 5.1
Example 5.2
5.3 Data Movement Instructions
Data movement instructions move information between the general purpose registers and memory, and between the registers and the input/output devices. We will ignore for now the business of moving information from input devices to registers and from registers to output devices. This will be the major topic of Chapter 8 and an important part of Chapter 9 as well. In this chapter, we will confine ourselves to moving information between memory and the general purpose registers.
Example 5.3
The process of moving information from memory to a register is called a load, and the process of moving information from a register to memory is called a store. In both cases, the information in the location containing the source operand remains unchanged. In both cases, the location of the destination operand is overwritten with the source operand, destroying the prior value in the destination location in the process.
The LC-3 contains seven instructions that move information: LD, LDR, LDI, LEA, ST, STR, and STI.
The format of the load and store instructions is as follows:
Bits:    [15:12]  [11:9]    [8:0]
Field:   opcode   DR or SR  Addr Gen bits
Data movement instructions require two operands, a source and a destination. The source is the data to be moved; the destination is the location where it is moved to. One of these locations is a register, the second is a memory location or an input/output device. As we said earlier, in this chapter the second operand will be assumed to be in memory. We will save for Chapter 8 the cases where the second operand specifies an input or output device.
Bits [11:9] specify one of these operands, the register. If the instruction is a load, DR refers to the destination register that will contain the value after it is read from memory (at the completion of the instruction cycle). If the instruction is a store, SR refers to the register that contains the value that will be written to memory.
Bits [8:0] contain the address generation bits. That is, bits [8:0] encode information that is used to compute the 16-bit address of the second operand. In the case of the LC-3’s data movement instructions, there are four ways to interpret bits [8:0]. They are collectively called addressing modes. The opcode specifies how to interpret bits [8:0]. That is, the LC-3’s opcode specifies which addressing mode should be used to obtain the operand from bits [8:0] of the instruction.
5.3.1 PC-Relative Mode
LD (opcode = 0010) and ST (opcode = 0011) specify the PC-relative addressing mode. This addressing mode is so named because bits [8:0] of the instruction specify an offset relative to the PC. The memory address is computed by sign-extending bits [8:0] to 16 bits, and adding the result to the incremented PC. The incremented PC is the contents of the program counter after the FETCH phase; that is, after the PC has been incremented. If a load, the memory location corresponding to the computed memory address is read, and the result loaded into the register specified by bits [11:9] of the instruction.
If the instruction
Bits:    [15:12]  [11:9]  [8:0]
Value:   0010     010     110101111
Field:   LD       R2      x1AF
is located at x4018, it will cause the contents of x3FC8 to be loaded into R2.
Figure 5.6 Data path relevant to execution of LD R2, x1AF. The IR holds 0010 010 110101111 and the incremented PC holds 0100 0000 0001 1001. Step 1: IR[8:0], sign-extended (SEXT) to 1111111110101111, is added to the PC and the sum is loaded into the MAR. Step 2: memory is read into the MDR. Step 3: the MDR is loaded into R2 (0000000000000101).
Figure 5.6 shows the relevant parts of the data path required to execute this instruction. The three steps of the LD instruction are identified. In step 1, the incremented PC (x4019) is added to the sign-extended value contained in IR[8:0] (xFFAF), and the result (x3FC8) is loaded into the MAR. In step 2, memory is read and the contents of x3FC8 are loaded into the MDR. Suppose the value stored in x3FC8 is 5. In step 3, the value 5 is loaded into R2, completing the instruction cycle.
Note that the address of the memory operand is limited to a small range of the total memory. That is, the address can only be within +256 or -255 locations of the LD or ST instruction since the PC is incremented before the offset is added. This is the range provided by the sign-extended value contained in bits [8:0] of the instruction.
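The three steps of the LD example, and the address range just described, can be checked with a short Python sketch. It assumes memory is modeled as a dictionary from address to 16-bit value; sext and pc_relative_address are hypothetical helper names.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 16-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & 0xFFFF

def pc_relative_address(instruction_address: int, instruction: int) -> int:
    incremented_pc = (instruction_address + 1) & 0xFFFF
    return (incremented_pc + sext(instruction & 0x1FF, 9)) & 0xFFFF

memory = {0x3FC8: 5}                  # value assumed in the text
regs = [0] * 8
instruction = 0b0010_010_110101111    # LD R2, x1AF

mar = pc_relative_address(0x4018, instruction)   # step 1
mdr = memory[mar]                                # step 2
regs[(instruction >> 9) & 0b111] = mdr           # step 3

print(hex(mar))    # 0x3fc8
print(regs[2])     # 5

# Extremes of the 9-bit offset, relative to the instruction at x4018.
lowest = pc_relative_address(0x4018, 0b0010_010_100000000)
highest = pc_relative_address(0x4018, 0b0010_010_011111111)
print(lowest - 0x4018, highest - 0x4018)   # -255 256
```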
5.3.2 Indirect Mode
LDI (opcode = 1010) and STI (opcode = 1011) specify the indirect addressing mode. An address is first formed exactly the same way as with LD and ST. However, instead of this address being the address of the operand to be loaded or stored, it contains the address of the operand to be loaded or stored. Hence the name indirect. Note that the address of the operand can be anywhere in the computer’s memory, not just within the range provided by bits [8:0] of the instruction as is the case for LD and ST. The destination register for the LDI and the source register for STI, like all the other loads and stores, are specified in bits [11:9] of the instruction.
If the instruction
Bits:    [15:12]  [11:9]  [8:0]
Value:   1010     011     111001100
Field:   LDI      R3      x1CC
is in x4A1B, and the contents of x49E8 is x2110, execution of this instruction results in the contents of x2110 being loaded into R3.
Figure 5.7 shows the relevant parts of the data path required to execute this instruction. As is the case with the LD and ST instructions, the first step consists of adding the incremented PC (x4A1C) to the sign-extended value contained in IR[8:0] (xFFCC), and the result (x49E8) is loaded into the MAR. In step 2, memory is read and the contents of x49E8 (x2110) is loaded into the MDR. In step 3, since x2110 is not the operand, but the address of the operand, it is loaded into the MAR. In step 4, memory is again read, and the MDR again loaded. This time the MDR is loaded with the contents of x2110. Suppose the value -1 is stored in memory location x2110. In step 5, the contents of the MDR (i.e., -1) are loaded into R3, completing the instruction cycle.
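Figure 5.7 Data path relevant to the execution of LDI R3, x1CC. The IR holds 1010 011 111001100 and the incremented PC holds 0100 1010 0001 1100. IR[8:0] is sign-extended to xFFCC and added to the PC; the MAR is loaded first with that sum and then with x2110, the value read from memory into the MDR; R3 finally receives 1111111111111111.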
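The five steps of the LDI example amount to two memory reads in a row, which the following Python sketch makes explicit. As before, memory is assumed to be a dictionary from address to 16-bit value, and sext is a hypothetical helper; the value -1 at x2110 is the one supposed in the text.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 16-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & 0xFFFF

memory = {0x49E8: 0x2110, 0x2110: 0xFFFF}   # xFFFF is -1 in 16 bits
regs = [0] * 8
instruction_address = 0x4A1B
instruction = 0b1010_011_111001100          # LDI R3, x1CC

# Step 1: PC-relative address goes into the MAR.
mar = (instruction_address + 1 + sext(instruction & 0x1FF, 9)) & 0xFFFF
# Step 2: first read; the MDR now holds an address, not the operand.
mdr = memory[mar]
first_mar = mar
# Step 3: that address is loaded into the MAR.
mar = mdr
# Step 4: second read fetches the operand itself.
mdr = memory[mar]
# Step 5: the operand is written to the destination register.
regs[(instruction >> 9) & 0b111] = mdr

print(hex(first_mar), hex(mar))   # 0x49e8 0x2110
print(f"{regs[3]:016b}")          # 1111111111111111
```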
5.3.3 Base+offset Mode
LDR (opcode = 0110) and STR (opcode = 0111) specify the Base+offset addressing mode. The Base+offset mode is so named because the address of the operand is obtained by adding a sign-extended 6-bit offset to a base register. The 6-bit offset is literally taken from the instruction, bits [5:0]. The base register is specified by bits [8:6] of the instruction.
The Base+offset addressing uses the 6-bit value as a 2’s complement integer between -32 and +31. Thus it must first be sign-extended to 16 bits before it is added to the base register.
If R2 contains the 16-bit quantity x2345, the instruction
Bits:    [15:12]  [11:9]  [8:6]  [5:0]
Value:   0110     001     010    011101
Field:   LDR      R1      R2     x1D
loads R1 with the contents of x2362.
Figure 5.8 shows the relevant parts of the data path required to execute this instruction. First the contents of R2 (x2345) are added to the sign-extended value contained in IR[5:0] (x001D), and the result (x2362) is loaded into the MAR. Second, memory is read, and the contents of x2362 are loaded into the MDR. Suppose the value stored in memory location x2362 is x0F0F. Third, and finally, the contents of the MDR (in this case, x0F0F) are loaded into R1.
Figure 5.8 Data path relevant to the execution of LDR R1, R2, x1D. The IR holds 0110 001 010 011101. IR[5:0], sign-extended to x001D, is added to R2 (0010001101000101); the sum is loaded into the MAR, memory is read into the MDR, and R1 receives 0000111100001111.
Note that the Base+offset addressing mode also allows the address of the operand to be anywhere in the computer’s memory.
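The Base+offset computation in the LDR example can be sketched the same way. The memory dictionary, the regs list, and the helper sext are modeling assumptions of this sketch; the contents of R2 and of x2362 are the values given in the text.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 16-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & 0xFFFF

memory = {0x2362: 0x0F0F}
regs = [0] * 8
regs[2] = 0x2345
instruction = 0b0110_001_010_011101   # LDR R1, R2, x1D

dr = (instruction >> 9) & 0b111       # bits [11:9]
base = (instruction >> 6) & 0b111     # bits [8:6]
offset = sext(instruction & 0x3F, 6)  # bits [5:0], range -32 to +31

mar = (regs[base] + offset) & 0xFFFF  # base register plus offset
regs[dr] = memory[mar]

print(hex(mar))        # 0x2362
print(hex(regs[1]))    # 0xf0f
```

Because the base register holds a full 16-bit value, the computed address is not confined to the neighborhood of the instruction.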
5.3.4 Immediate Mode
The fourth and last addressing mode used by the data movement instructions is the immediate (or, literal) addressing mode. It is used only with the load effective address (LEA) instruction. LEA (opcode = 1110) loads the register specified by bits [11:9] of the instruction with the value formed by adding the incremented program counter to the sign-extended bits [8:0] of the instruction. The immediate addressing mode is so named because the operand to be loaded into the destination register is obtained immediately, that is, without requiring any access of memory.
The LEA instruction is useful to initialize a register with an address that is very close to the address of the instruction doing the initializing. If memory location x4018 contains the instruction LEA R5, #-3, and the PC contains x4018,
Bits:    [15:12]  [11:9]  [8:0]
Value:   1110     101     111111101
Field:   LEA      R5      -3
R5 will contain x4016 after the instruction at x4018 is executed.
Figure 5.9 shows the relevant parts of the data path required to execute the LEA instruction. Note that no access to memory is required to obtain the value to be loaded.
Figure 5.9 Data path relevant to the execution of LEA R5, #-3. The IR holds 1110 101 111111101 and the incremented PC holds 0100 0000 0001 1001. IR[8:0], sign-extended to 1111111111111101, is added to the PC, and the sum (0100000000010110) is written to R5.
Again, LEA is the only load instruction that does not access memory to obtain the information it will load into the DR. It loads into the DR the address formed from the incremented PC and the address generation bits of the instruction.
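The LEA example needs no memory model at all, which a short Python sketch makes visible: the only inputs are the instruction and its address. The helper sext and the list regs are hypothetical names for this sketch.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 16-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & 0xFFFF

regs = [0] * 8
instruction_address = 0x4018
instruction = 0b1110_101_111111101    # LEA R5, #-3

incremented_pc = (instruction_address + 1) & 0xFFFF
dr = (instruction >> 9) & 0b111

# The computed address itself is the value loaded; memory is never read.
regs[dr] = (incremented_pc + sext(instruction & 0x1FF, 9)) & 0xFFFF

print(hex(regs[5]))    # 0x4016
```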
5.3.5 An Example
We conclude our study of addressing modes with a comprehensive example. Assume the contents of memory locations x30F6 through x30FC are as shown in Figure 5.10, and the PC contains x30F6. We will examine the effects of carrying out the instruction cycle seven consecutive times.
Figure 5.10 Addressing mode example
Address   Instruction (bits 15 to 0)   Operation
x30F6     1110 001 111111101           R1 <- PC-3
x30F7     0001 010 001 1 01110         R2 <- R1+14
x30F8     0011 010 111111011           M[x30F4] <- R2
x30F9     0101 010 010 1 00000         R2 <- 0
x30FA     0001 010 010 1 00101         R2 <- R2+5
x30FB     0111 010 001 001110          M[R1+14] <- R2
x30FC     1010 011 111110111           R3 <- M[M[x30F4]]
The PC points initially to location x30F6. That is, the content of the PC is the address x30F6. Therefore, the first instruction to be executed is the one stored in location x30F6. The opcode of that instruction is 1110, which identifies the load effective address instruction (LEA). LEA loads the register specified by bits [11:9] with the address formed by sign-extending bits [8:0] of the instruction and adding the result to the incremented PC. The 16-bit value obtained by sign-extending bits [8:0] of the instruction is xFFFD. The incremented PC is x30F7. Therefore, at the end of execution of the LEA instruction, R1 contains x30F4, and the PC contains x30F7.
The second instruction to be executed is the one stored in location x30F7. The opcode 0001 identifies the ADD instruction, which stores the result of adding the contents of the register specified in bits [8:6] to the sign-extended immediate in bits [4:0] (since bit [5] is 1) in the register specified by bits [11:9]. Since the previous instruction loaded x30F4 into R1, and the sign-extended immediate value is x000E, the value to be loaded into R2 is x3102. At the end of execution of this instruction, R2 contains x3102, and the PC contains x30F8. R1 still contains x30F4.
The third instruction to be executed is stored in x30F8. The opcode 0011 specifies the ST instruction, which stores the contents of the register specified by bits [11:9] of the instruction into the memory location whose address is computed using the PC-relative addressing mode. That is, the address is computed by adding the incremented PC to the 16-bit value obtained by sign-extending bits [8:0] of the instruction. The 16-bit value obtained by sign-extending bits [8:0] of the instruction is xFFFB. The incremented PC is x30F9. Therefore, at the end of execution of the ST instruction, memory location x30F4 contains x3102, and the PC contains x30F9.
At x30F9, we find the opcode 0101, which represents the AND instruction. After execution, R2 contains the value 0, and the PC contains x30FA.
At x30FA, we find the opcode 0001, signifying the ADD instruction. After execution, R2 contains the value 5, and the PC contains x30FB.
At x30FB, we find the opcode 0111, signifying the STR instruction. The STR instruction (like the LDR instruction) uses the Base+offset addressing mode. The memory address is obtained by adding the contents of the register specified by bits [8:6] (the BASE register) to the sign-extended offset contained in bits [5:0]. In this case, bits [8:6] specify R1. The contents of R1 are still x30F4. The 16-bit sign-extended offset is x000E. Since x30F4 + x000E is x3102, the memory address is x3102. The STR instruction stores into x3102 the contents of the register specified by bits [11:9], that is, R2. Recall that the previous instruction (at x30FA) stored the value 5 into R2. Therefore, at the end of execution of this instruction, location x3102 contains the value 5, and the PC contains x30FC.
At x30FC, we find the opcode 1010, signifying the LDI instruction. The LDI instruction (like the STI instruction) uses the indirect addressing mode. The memory address is obtained by first forming an address as is done in the PC-relative addressing mode. In this case, the 16-bit value obtained by sign-extending bits [8:0] of the instruction is xFFF7. The incremented PC is x30FD. Their sum is x30F4, which is the address of the operand address. Memory location x30F4 contains x3102. Therefore, x3102 is the operand address. The LDI instruction loads the value found at this address (in this case 5) into the register identified by bits [11:9] of the instruction (in this case R3). At the end of execution of this instruction, R3 contains the value 5 and the PC contains x30FD.
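The seven steps traced above can be checked with a small interpreter. The following Python sketch is an assumption-laden illustration rather than a full LC-3 simulator: it handles only the opcodes that appear in this example (and only the immediate forms of ADD and AND), it ignores condition codes, and the names `sext` and `run` are invented here.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def run(memory, pc, steps):
    """Execute `steps` instructions; supports only the opcodes of this example."""
    reg = [0] * 8
    for _ in range(steps):
        ir = memory[pc]
        pc = (pc + 1) & 0xFFFF                  # FETCH increments the PC
        op, dr, base = ir >> 12, (ir >> 9) & 7, (ir >> 6) & 7
        pc_rel = (pc + sext(ir, 9)) & 0xFFFF    # PC-relative address
        if op in (0b0001, 0b0101):              # ADD / AND, immediate form only
            assert ir & 0x20, "register form not handled in this sketch"
            imm = sext(ir, 5)
            result = reg[base] + imm if op == 0b0001 else reg[base] & imm
            reg[dr] = result & 0xFFFF
        elif op == 0b1110:                      # LEA
            reg[dr] = pc_rel
        elif op == 0b0011:                      # ST (bits [11:9] name the source)
            memory[pc_rel] = reg[dr]
        elif op == 0b0111:                      # STR
            memory[(reg[base] + sext(ir, 6)) & 0xFFFF] = reg[dr]
        elif op == 0b1010:                      # LDI
            reg[dr] = memory[memory[pc_rel]]
        else:
            raise ValueError(f"opcode {op:04b} not handled")
    return reg, pc


program = [
    0b1110_001_111111101,    # x30F6: R1 <- PC-3
    0b0001_010_001_1_01110,  # x30F7: R2 <- R1+14
    0b0011_010_111111011,    # x30F8: M[x30F4] <- R2
    0b0101_010_010_1_00000,  # x30F9: R2 <- 0
    0b0001_010_010_1_00101,  # x30FA: R2 <- R2+5
    0b0111_010_001_001110,   # x30FB: M[R1+14] <- R2
    0b1010_011_111110111,    # x30FC: R3 <- M[M[x30F4]]
]
memory = {0x30F6 + i: word for i, word in enumerate(program)}
reg, pc = run(memory, 0x30F6, 7)
print(f"R1=x{reg[1]:04X} R2={reg[2]} R3={reg[3]} PC=x{pc:04X}")
```

Memory is modeled as a dictionary so that only the locations the example actually touches (x30F4 and x3102) need to exist.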
5.4 Control Instructions
Control instructions change the sequence of the instructions that are executed. If there were no control instructions, the next instruction fetched after the current instruction finishes would be the instruction located in the next sequential memory location. As you know, this is because the PC is incremented in the FETCH phase of each instruction. We will see momentarily that it is often useful to be able to break that sequence.
The LC-3 has five opcodes that enable this sequential flow to be broken: conditional branch, unconditional jump, subroutine (sometimes called function) call, TRAP, and return from interrupt. In this section, we will deal almost exclusively with the most common control instruction, the conditional branch. We will also introduce the unconditional jump and the TRAP instruction. The TRAP instruction is particularly useful because, among other things, it allows a programmer to get information into and out of the computer without fully understanding the intricacies of the input and output devices. However, most of the discussion of the TRAP instruction and all of the discussion of the subroutine call and the return from interrupt we will leave for Chapters 9 and 10.
5.4.1 Conditional Branches
The format of the conditional branch instruction (opcode= 0000) is as follows:
15 14 13 12 | 11 | 10 | 9 | 8 7 6 5 4 3 2 1 0
 0  0  0  0 |  N |  Z | P |     PCoffset
Bits [11], [10], and [9] correspond to the three condition codes discussed in Section 5.1.7. Recall that in the LC-3, all instructions that write values into the general purpose registers set the three condition codes (i.e., the single-bit registers N, Z, P) in accordance with whether the value written is negative, zero, or positive. These instructions are ADD, AND, NOT, LD, LDI, LDR, and LEA.
The condition codes are used by the conditional branch instruction to determine whether to change the instruction flow; that is, whether to depart from the usual sequential execution of instructions that we get as a result of incrementing PC during the FETCH phase of each instruction.
The instruction cycle is as follows: FETCH and DECODE are the same for all instructions. The PC is incremented during FETCH. The EVALUATE ADDRESS phase is the same as that for LD and ST: the address is computed by adding the incremented PC to the 16-bit value formed by sign-extending bits [8:0] of the instruction.
During the EXECUTE phase, the processor examines the condition codes whose corresponding bits in the instruction are 1. That is, if bit [11] is 1, condition code N is examined. If bit [10] is 1, condition code Z is examined. If bit [9] is 1, condition code P is examined. If any of bits [11:9] are 0, the corresponding condition codes are not examined. If any of the condition codes that are examined are in state 1, then the PC is loaded with the address obtained in the EVALUATE ADDRESS phase. If none of the condition codes that are examined are in state 1, the PC is left unchanged. In that case, in the next instruction cycle, the next sequential instruction will be fetched.
For example, if the last value loaded into a general purpose register was 0, then the current instruction (located at x4027) shown here
15 14 13 12 | 11 10 9 | 8 7 6 5 4 3 2 1 0
 0  0  0  0 |  0  1 0 | 0 1 1 0 1 1 0 0 1
     BR     |  n  z p |      x0D9
would load the PC with x4101, and the next instruction executed would be the one at x4101, rather than the instruction at x4028.
Figure 5.11 shows the data path elements that are required to execute this instruction. Note the logic required to determine whether the sequential instruction flow should be broken. In this case the answer is yes, and the PC is loaded with x4101, replacing x4028, which had been loaded during the FETCH phase of the conditional branch instruction.
Figure 5.11 Data path relevant to the execution of BRz x0D9
If all three bits [11:9] are 1, then all three condition codes are examined. In this case, since the last result stored into a register had to be either negative, zero, or positive (there are no other choices), one of the three condition codes must be in state 1. Since all three are examined, the PC is loaded with the address obtained in the EVALUATE ADDRESS phase. We call this an unconditional branch since the instruction flow is changed unconditionally, that is, independent of the data that is being processed.
For example, if the following instruction,
15 14 13 12 | 11 10 9 | 8 7 6 5 4 3 2 1 0
 0  0  0  0 |  1  1 1 | 1 1 0 0 0 0 1 0 1
     BR     |  n  z p |      x185
located at x507B, is executed, the PC is loaded with x5001.
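The branch decision can be written out as a short Python sketch. It follows the FETCH, EVALUATE ADDRESS, and EXECUTE description above; the function names and the way the condition codes are passed in (three separate 0/1 arguments) are choices made for this illustration only.

```python
def sext9(value):
    """Sign-extend bits [8:0] of an instruction."""
    value &= 0x1FF
    return value - 0x200 if value & 0x100 else value


def branch(instruction, pc, n, z, p):
    """Return the PC after executing a BR instruction fetched from address pc.

    n, z, p are the current condition codes (exactly one is normally 1).
    """
    pc = (pc + 1) & 0xFFFF                       # incremented during FETCH
    target = (pc + sext9(instruction)) & 0xFFFF  # EVALUATE ADDRESS
    test_n = (instruction >> 11) & 1
    test_z = (instruction >> 10) & 1
    test_p = (instruction >> 9) & 1
    taken = (test_n and n) or (test_z and z) or (test_p and p)
    return target if taken else pc


# BRz x0D9 at x4027 with Z set: the branch is taken
print(f"x{branch(0b0000_010_011011001, 0x4027, 0, 1, 0):04X}")
# The same instruction with P set instead: execution falls through
print(f"x{branch(0b0000_010_011011001, 0x4027, 0, 0, 1):04X}")
# BRnzp x185 at x507B: taken whichever condition code is set
print(f"x{branch(0b0000_111_110000101, 0x507B, 1, 0, 0):04X}")
```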
What happens if all three bits [11:9] in the BR instruction are 0?
5.4.2 An Example
We are ready to show by means of a simple example the value of having control instructions in the instruction set.
Suppose we know that the 12 locations x3100 to x310B contain integers, and we wish to compute the sum of these 12 integers.
Figure 5.12 An algorithm for adding 12 integers (flowchart: R1 <- x3100, R3 <- 0, R2 <- 12; test R2 ?= 0; if Yes, done; if No, R4 <- M[R1], R3 <- R3 + R4, increment R1, decrement R2, and repeat the test)
A flowchart for an algorithm to solve the problem is shown in Figure 5.12.
First, as in all algorithms, we must initialize our variables. That is, we must set up the initial values of the variables that the computer will use in executing the program that solves the problem. There are three such variables: the address of the next integer to be added (assigned to R1), the running sum (assigned to R3), and the number of integers left to be added (assigned to R2). The three variables are initialized as follows: The address of the first integer to be added is put in R1. R3, which will keep track of the running sum, is initialized to 0. R2, which will keep track of the number of integers left to be added, is initialized to 12. Then the process of adding begins.
The program repeats the process of loading into R4 one of the 12 integers, and adding it to R3. Each time we perform the ADD, we increment R1 so it will point to (i.e., contain the address of) the next number to be added and decrement R2 so we will know how many numbers still need to be added. When R2 becomes zero, the Z condition code is set, and we can detect that we are done.
The 10-instruction program shown in Figure 5.13 accomplishes the task.
The details of the program execution are as follows: The program starts with PC = x3000. The first instruction (at location x3000) loads R1 with the address x3100. (The incremented PC is x3001; the sign-extended PCoffset is x00FF.)
The instruction at x3001 clears R3. R3 will keep track of the running sum, so it must start off with the value 0. As we said previously, this is called initializing the SUM to zero.
The instructions at x3002 and x3003 set the value of R2 to 12, the number of integers to be added. R2 will keep track of how many numbers remain to be added. This will be done (by the instruction contained in x3008) by decrementing R2 after each addition takes place.
The instruction at x3004 is a conditional branch instruction. Note that bit [10] is a 1. That means that the Z condition code will be examined. If it is set, we know R2 must have just been decremented to 0. That means there are no more numbers to be added and we are done. If it is clear, we know we still have work to do and we continue.
Figure 5.13 A program that implements the algorithm of Figure 5.12
The instruction at x3005 loads the contents of x3100 (i.e., the first integer) into R4, and the instruction at x3006 adds it to R3.
The instructions at x3007 and x3008 perform the necessary bookkeeping. The instruction at x3007 increments R1, so R1 will point to the next location in memory containing an integer to be added (in this case, x3101). The instruction at x3008 decrements R2, which is keeping track of the number of integers still to be added, as we have already explained, and sets the N, Z, and P condition codes.
The instruction at x3009 is an unconditional branch, since bits [11:9] are all 1. It loads the PC with x3004. It also does not affect the condition codes, so the next instruction to be executed (the conditional branch at x3004) will be based on the instruction executed at x3008.
This is worth saying again. The conditional branch instruction at x3004 follows the instruction at x3009, which does not affect condition codes, which in turn follows the instruction at x3008. Thus, the conditional branch instruction at x3004 will be based on the condition codes set by the instruction at x3008. The instruction at x3008 sets the condition codes depending on the value produced by decrementing R2. As long as there are still integers to be added, the ADD instruction at x3008 will produce a value greater than zero and therefore clear the Z condition code. The conditional branch instruction at x3004 examines the Z condition code. As long as Z is clear, the PC will not be affected, and the next instruction cycle will start with an instruction fetch from x3005.
The conditional branch instruction causes the execution sequence to follow: x3000, x3001, x3002, x3003, x3004, x3005, x3006, x3007, x3008, x3009, x3004, x3005, x3006, x3007, x3008, x3009, x3004, x3005, and so on until the value in R2 becomes 0. The next time the conditional branch instruction at x3004 is executed, the PC is loaded with x300A, and the program continues at x300A with its next activity.
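The execution sequence just described can be reproduced with a register-level transliteration of the program into Python. This is a sketch, not a simulator: each Python statement stands in for the instruction at the address named in its comment, the test `r2 == 0` plays the role of the Z condition code, and the function name and the sample data are invented for illustration.

```python
def add_with_counter(memory, start=0x3100, count=12):
    """Register-level version of the counter-controlled loop.

    Returns the sum and the list of instruction addresses executed.
    """
    trace = [0x3000, 0x3001, 0x3002, 0x3003]   # initialization instructions
    r1, r3, r2 = start, 0, count
    while True:
        trace.append(0x3004)                   # BRz x300A
        if r2 == 0:                            # Z set: leave the loop
            break
        trace += [0x3005, 0x3006, 0x3007, 0x3008, 0x3009]
        r4 = memory[r1]                        # x3005: R4 <- M[R1]
        r3 = (r3 + r4) & 0xFFFF                # x3006: R3 <- R3+R4
        r1 += 1                                # x3007: R1 <- R1+1
        r2 -= 1                                # x3008: R2 <- R2-1, sets N, Z, P
    return r3, trace


# Hypothetical data: the integers 1 through 12 stored at x3100 to x310B
memory = {0x3100 + i: i + 1 for i in range(12)}
total, trace = add_with_counter(memory)
print(total, len(trace))
print(" ".join(f"x{a:04X}" for a in trace[:11]))
```

The trace begins x3000 through x3009 and then returns to x3004, matching the sequence listed in the text, and it ends with the final execution of the branch at x3004.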
Figure 5.13 (program listing)
Address   Instruction (bits 15 to 0)   Operation
x3000     1110 001 011111111           R1 <- x3100
x3001     0101 011 011 1 00000         R3 <- 0
x3002     0101 010 010 1 00000         R2 <- 0
x3003     0001 010 010 1 01100         R2 <- 12
x3004     0000 010 000000101           BRz x300A
x3005     0110 100 001 000000          R4 <- M[R1]
x3006     0001 011 011 0 00 100        R3 <- R3+R4
x3007     0001 001 001 1 00001         R1 <- R1+1
x3008     0001 010 010 1 11111         R2 <- R2-1
x3009     0000 111 111111010           BRnzp x3004
Finally, it is worth noting that we could have written a program to add these 12 integers without any control instructions. We still would have needed the LEA instruction in x3000 to initialize R1. We would not have needed the instruction at x3001 to initialize the running sum, nor the instructions at x3002 and x3003 to initialize the number of integers left to be added. We could have loaded the contents of x3100 directly into R3, and then repeatedly (by incrementing R1, loading the next integer into R4, and adding R4 to the running sum in R3) added the remaining 11 integers. After the addition of the twelfth integer, we would go on to the next task, as does the example of Figure 5.13 with the branch instruction in x3004.
Unfortunately, instead of a 10-instruction program, we would have had a 35-instruction program. Moreover, if we had wished to add 100 integers without any control instructions instead of 12, we would have had a 299-instruction program instead of 10. The control instructions in the example of Figure 5.13 permit the reuse of sequences of code by breaking the sequential instruction execution flow.
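The figures 35 and 299 follow from a simple count, sketched below in Python. The breakdown assumed here is the one described in the text: one LEA, one load of the first integer into R3, and then three instructions (increment R1, load R4, add) for each remaining integer. The function name is illustrative.

```python
def straight_line_length(n):
    """Instructions needed to add n integers with no control instructions."""
    setup = 2                  # LEA to initialize R1, load first integer into R3
    per_remaining_integer = 3  # increment R1, load into R4, add to R3
    return setup + per_remaining_integer * (n - 1)


for n in (12, 100):
    print(n, straight_line_length(n))
```

The loop version stays at 10 instructions for any count that fits in the 5-bit immediate field of the ADD instruction that initializes R2.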
5.4.3 Two Methods for Loop Control
We use the term loop to describe a sequence of instructions that get executed again and again under some controlling mechanism. The example of adding 12 integers contains a loop. Each time the body of the loop executes, one more integer is added to the running total, and the counter is decremented so we can detect whether there are any more integers left to add. Each time the loop body executes is called one iteration of the loop.
There are two common methods for controlling the number of iterations of a loop. One method we just examined: the use of a counter. If we know we wish to execute a loop n times, we simply set a counter to n, then after each execution of the loop, we decrement the counter and check to see if it is zero. If it is not zero, we set the PC to the start of the loop and continue with another iteration.
A second method for controlling the number of executions of a loop is to use a sentinel. This method is particularly effective if we do not know ahead of time how many iterations we will want to perform. Each iteration is usually based on processing a value. We append to our sequence of values to be processed a value that we know ahead of time can never occur (i.e., the sentinel). For example, if we are adding a sequence of numbers, a sentinel could be a # or a *, that is, something that is not a number. Our loop test is simply a test for the occurrence of the sentinel. When we find it, we know we are done.
5.4.4 Example: Adding a Column of Numbers Using a Sentinel
Suppose in our example of Section 5.4.2, we know the values stored in locations x3100 to x310B are all positive. Then we could use any negative number as a sentinel. Let's say the sentinel stored at memory address x310C is -1. The resulting flowchart for the program is shown in Figure 5.14 and the resulting program is shown in Figure 5.15.
As before, the instruction at x3000 loads R1 with the address of the first value to be added, and the instruction at x3001 initializes R3 (which keeps track of the sum) to 0.
Figure 5.14 An algorithm showing the use of a sentinel for loop control (flowchart: R1 <- x3100, R3 <- 0, R4 <- M[R1]; test R4 ?= sentinel; if Yes, done; if No, R3 <- R3 + R4, increment R1, R4 <- M[R1], and repeat the test)
Figure 5.15 A program that implements the algorithm of Figure 5.14
Address   Instruction (bits 15 to 0)   Operation
x3000     1110 001 011111111           R1 <- x3100
x3001     0101 011 011 1 00000         R3 <- 0
x3002     0110 100 001 000000          R4 <- M[R1]
x3003     0000 100 000000100           BRn x3008
x3004     0001 011 011 0 00 100        R3 <- R3+R4
x3005     0001 001 001 1 00001         R1 <- R1+1
x3006     0110 100 001 000000          R4 <- M[R1]
x3007     0000 111 111111011           BRnzp x3003
At x3002, we load the contents of the next memory location into R4. If the sentinel is loaded, the N condition code is set.
The conditional branch at x3003 examines the N condition code, and if it is set, loads the PC with x3008, where the program moves on to the next task to be done. If the N condition code is clear, R4 must contain a valid number to be added. In this case, the number is added to R3 (x3004), R1 is incremented to point to the next memory location (x3005), R4 is loaded with the contents of the next memory location (x3006), and the PC is loaded with x3003 to begin the next iteration (x3007).
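The sentinel-controlled loop can also be sketched at register level in Python. As before, this is an illustration under stated assumptions: memory words are modeled as unsigned 16-bit values, so the N condition code corresponds to bit [15] being 1, and the function name and sample data are made up for the example.

```python
def add_until_sentinel(memory, start=0x3100):
    """Sum 16-bit words starting at `start` until a negative word is loaded."""
    r1, r3 = start, 0              # x3000, x3001
    r4 = memory[r1]                # x3002: R4 <- M[R1]
    while not (r4 & 0x8000):       # x3003: BRn leaves the loop when N is set
        r3 = (r3 + r4) & 0xFFFF    # x3004: R3 <- R3+R4
        r1 += 1                    # x3005: R1 <- R1+1
        r4 = memory[r1]            # x3006: R4 <- M[R1]
    return r3                      # x3007 branches back to the test at x3003


# Hypothetical data: twelve positive integers followed by the sentinel -1 (xFFFF)
values = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
memory = {0x3100 + i: v for i, v in enumerate(values)}
memory[0x310C] = 0xFFFF
print(add_until_sentinel(memory))
```

Unlike the counter version, nothing in this routine mentions the number 12; appending more positive values before the sentinel requires no change to the code.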
5.4.5 The JMP Instruction
The conditional branch instruction, for all its capability, does have one unfortunate limitation. The next instruction executed must be within the range of addresses that can be computed by adding the incremented PC to the sign-extended offset obtained from bits [8:0] of the instruction. Since bits [8:0] specify a 2's complement integer, the next instruction executed after the conditional branch can be at most +256 or -255 locations from the branch instruction itself. What if we would like to execute next an instruction that is 1,000 locations from the current instruction? We cannot fit the value 1,000 into the 9-bit field; ergo, the conditional branch instruction does not work.
The LC-3 ISA does provide an instruction JMP (opcode = 1100) that can do the job. An example follows:
15 14 13 12 | 11 10 9 | 8 7 6 | 5 4 3 2 1 0
 1  1  0  0 |  0  0 0 | 0 1 0 | 0 0 0 0 0 0
    JMP     |         |  R2   |
The JMP instruction loads the PC with the contents of the register specified by bits [8:6] of the instruction. If this JMP instruction is located at address x4000, R2 contains the value x6600, and the PC contains x4000, then the instruction at x4000 (the JMP instruction) will be executed, followed by the instruction located at x6600. Since registers contain 16 bits, the full address space of memory, the JMP instruction has no limitation on where the next instruction to be executed must reside.
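The +256 and -255 limits follow directly from the range of a 9-bit 2's complement offset applied to the incremented PC. A short Python check, with a function name invented for this sketch and with wrap-around at the ends of the address space ignored, makes the arithmetic concrete.

```python
def reachable_by_br(branch_address, target):
    """True if a BR at branch_address can reach target with a 9-bit PCoffset.

    Assumes no wrap-around at the ends of the 16-bit address space.
    """
    offset = target - (branch_address + 1)   # relative to the incremented PC
    return -256 <= offset <= 255


here = 0x4000
print(reachable_by_br(here, here + 256))   # farthest forward target
print(reachable_by_br(here, here + 257))   # one location too far
print(reachable_by_br(here, here - 255))   # farthest backward target
print(reachable_by_br(here, here + 1000))  # the case that calls for JMP
print(reachable_by_br(here, 0x6600))       # the JMP R2 example above
```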
5.4.6 The TRAP Instruction
Finally, because it will be useful long before Chapter 9 to get data into and out of the computer, we introduce the TRAP instruction now. The TRAP (opcode = 1111) instruction changes the PC to a memory address that is part of the operating system so that the operating system will perform some task on behalf of the program that is executing. In the language of operating system jargon, we say the TRAP instruction invokes an operating system SERVICE CALL. Bits [7:0] of the TRAP instruction form the trapvector, which identifies the service call that the program wishes the operating system to perform. Table A.2 contains the trapvectors for all the service calls that we will use with the LC-3 in this book.
15 14 13 12 | 11 10 9 8 | 7 6 5 4 3 2 1 0
 1  1  1  1 |  0  0 0 0 |   trapvector
Once the operating system is finished performing the service call, the program counter is set to the address of the instruction following the TRAP instruction, and the program continues. In this way, a program can, during its execution, request services from the operating system and continue processing after each such service is performed. The services we will require for now are
* Input a character from the keyboard (trapvector = x23).
* Output a character to the monitor (trapvector = x21).
* Halt the program (trapvector = x25).
Exactly how the LC-3 carries out the interaction between operating system and executing programs is an important topic for Chapter 9.
5.5 Another Example: Counting Occurrences of a Character
We will finish our introduction to the ISA of the LC-3 with another example program. We would like to be able to input a character from the keyboard and then count the number of occurrences of that character in a file. Finally, we would like to display that count on the monitor. We will simplify the problem by assuming that the number of occurrences of any character that we would be interested in is small. That is, there will be at most nine occurrences. This simplification allows us to not have to worry about complex conversion routines between the binary count and the ASCII display on the monitor, a subject we will get into in Chapter 10, but not today.
Figure 5.16 is a flowchart of the algorithm that solves this problem. Note that each step is expressed both in English and also (in parentheses) in terms of an LC-3 implementation.
The first step is (as always) to initialize all the variables. This means providing starting values (called initial values) for R0, R1, R2, and R3, the four registers the computer will use to execute the program that will solve the problem. R2 will keep track of the number of occurrences; in Figure 5.16, it is referred to as count. It is initialized to zero. R3 will point to the next character in the file that is being examined. We refer to it as pointer since it contains the address of the location where the next character of the file that we wish to examine resides. The pointer is initialized with the address of the first character in the file. R0 will hold the character that is being counted; we will input that character from the keyboard and put it in R0. R1 will hold, in turn, each character that we get from the file being examined.
We should also note that there is no requirement that the file we are examining be close to or far away from the program we are developing. For example, it is perfectly reasonable for the program we are developing to start at x3000, and the file we are examining to start at x9000. If that were the case, in the initialization process, R3 would be initialized to x9000.
The next step is to count the number of occurrences of the input character. This is done by processing, in turn, each character in the file being examined, until the file is exhausted. Processing each character requires one iteration of a loop. Recall from Section 5.4.3 that there are two common methods for keeping track of iterations of a loop. We will use the sentinel method, using the ASCII code for EOT (End of Text) (00000100) as the sentinel. A table of ASCII codes is in Appendix E.
In each iteration of the loop, the contents of R1 are first compared to the ASCII code for EOT. If they are equal, the loop is exited, and the program moves on to the final step, displaying on the screen the number of occurrences. If not, there is work to do. R1 (the current character under examination) is compared to R0 (the character input from the keyboard). If they match, R2 is incremented. In either case, we get the next character, that is, R3 is incremented, the next character is loaded into R1, and the program returns to the test that checks for the sentinel at the end of the file.
When the end of the file is reached, all the characters have been examined, and the count is contained as a binary number in R2. In order to display the count on the monitor, it is necessary to first convert it to an ASCII code. Since we have assumed the count is less than 10, we can do this by putting a leading 0011 in front of the 4-bit binary representation of the count. Note in Figure E.2 the relationship between the binary value of each decimal digit between 0 and 9 and its corresponding ASCII code. Finally, the count is output to the monitor, and the program terminates.
Figure 5.16 An algorithm to count occurrences of a character (flowchart: Count <- 0 (R2 <- 0); Initialize pointer (R3 <- M[x3012]); Input char from keyboard (TRAP x23); Get char from file (R1 <- M[R3]); test Done (R1 ?= EOT); if Yes, Prepare output (R0 <- R2 + x30), Output (TRAP x21), Stop (TRAP x25); if No, test Match (R1 ?= R0) and, if Yes, Increment count (R2 <- R2 + 1); in either case Get char from file (R3 <- R3 + 1, R1 <- M[R3]) and repeat the Done test)
Figure 5.17 A machine language program that implements the algorithm of Figure 5.16
Address   Instruction (bits 15 to 0)   Operation
x3000     0101 010 010 1 00000         R2 <- 0
x3001     0010 011 000010000           R3 <- M[x3012]
x3002     1111 0000 00100011           TRAP x23
x3003     0110 001 011 000000          R1 <- M[R3]
x3004     0001 100 001 1 11100         R4 <- R1-4
x3005     0000 010 000001000           BRz x300E
x3006     1001 001 001 111111          R1 <- NOT R1
x3007     0001 001 001 1 00001         R1 <- R1+1
x3008     0001 001 001 0 00 000        R1 <- R1+R0
x3009     0000 101 000000001           BRnp x300B
x300A     0001 010 010 1 00001         R2 <- R2+1
x300B     0001 011 011 1 00001         R3 <- R3+1
x300C     0110 001 011 000000          R1 <- M[R3]
x300D     0000 111 111110110           BRnzp x3004
x300E     0010 000 000000100           R0 <- M[x3013]
x300F     0001 000 000 0 00 010        R0 <- R0+R2
x3010     1111 0000 00100001           TRAP x21
x3011     1111 0000 00100101           TRAP x25
x3012     (starting address of file)
x3013     0000 0000 0011 0000          ASCII TEMPLATE
Figure 5.17 is a machine language program that implements the flowchart of Figure 5.16.
First the initialization steps. The instruction at x3000 clears R2 by ANDing it with x0000; the instruction at x3001 loads the value stored in x3012 into R3. This is the address of the first character in the file that is to be examined for occurrences of our character. Again, we note that this file can be anywhere in memory. Prior to starting execution at x3000, some sequence of instructions must have stored the first address of this file in x3012. Location x3002 contains the TRAP instruction, which requests the operating system to perform a service call on behalf of this program. The function requested, as identified by the 8-bit trapvector 00100011 (or, x23), is to input a character from the keyboard and load it into R0. Table A.2 lists trapvectors for all operating system service calls that can be performed on behalf of a user program. Note (from Table A.2) that x23 directs the operating system to perform the service call that reads the next character struck and loads it into R0. The instruction at x3003 loads the character pointed to by R3 into R1.
Then the process of examining characters begins. We start (x3004) by subtracting 4 (the ASCII code for EOT) from R1, and storing it in R4. If the result is zero, the end of the file has been reached, and it is time to output the count. The instruction at x3005 conditionally branches to x300E, where the process of outputting the count begins.
If R4 is not equal to zero, the character in R1 is legitimate and must be examined. The sequence of instructions at locations x3006, x3007, and x3008 determines if the contents of R1 and R0 are identical. The sequence of instructions performs the following operation:
R0 + (NOT(R1) + 1)
This produces all zeros only if the bit patterns of R1 and R0 are identical. If the bit patterns are not identical, the conditional branch at x3009 branches to x300B, that is, it skips the instruction at x300A, which increments R2, the counter.
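The comparison works because NOT(R1) + 1 is the 2's complement negation of R1, so the sum is R0 - R1 computed in 16 bits. A small Python sketch of the three instructions, with a function name chosen for this illustration, shows the effect on a few values.

```python
def same_bits(r0, r1):
    """Mimic x3006 to x3008: compute R0 + (NOT(R1) + 1) in 16 bits."""
    r1 = ~r1 & 0xFFFF          # x3006: R1 <- NOT R1
    r1 = (r1 + 1) & 0xFFFF     # x3007: R1 <- R1+1
    r1 = (r1 + r0) & 0xFFFF    # x3008: R1 <- R1+R0
    return r1 == 0             # Z is set only when the result is all zeros


print(same_bits(ord("a"), ord("a")))
print(same_bits(ord("a"), ord("b")))
print(same_bits(0x0000, 0x0000))
print(same_bits(0xFFFF, 0x0001))
```

Masking with 0xFFFF after each step stands in for the fact that LC-3 registers hold exactly 16 bits.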
The instruction at x300B increments R3, so it will point to the next character in the file being examined, the instruction at x300C loads that character into R1, and the instruction at x300D unconditionally takes us back to x3004 to start processing that character.
When the sentinel (EOT) is finally detected, the process of outputting the count begins (at x300E). The instruction at x300E loads 00110000 into R0, and the instruction at x300F adds the count to R0. This converts the binary representation of the count (in R2) to the ASCII representation of the count (in R0). The instruction at x3010 invokes a TRAP to the operating system to output the contents of R0 on the monitor. When that is done and the program resumes execution, the instruction at x3011 invokes a TRAP instruction to terminate the program.
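Putting the pieces together, the whole algorithm can be sketched at register level in Python. This is an illustrative stand-in, not the LC-3 program itself: the file is modeled as a bytes object that must end with the EOT sentinel, the keyboard input is passed as an argument instead of arriving through TRAP x23, and the returned character stands for what TRAP x21 would display. The names are assumptions of this sketch.

```python
EOT = 0x04  # ASCII End of Text, used as the sentinel


def count_occurrences(file_bytes, target):
    """Count how often `target` occurs before the EOT sentinel.

    Returns the single ASCII digit that would be sent to the monitor.
    """
    r2 = 0                       # count
    r3 = 0                       # pointer (an index here, an address on the LC-3)
    r0 = ord(target)             # character typed at the keyboard
    r1 = file_bytes[r3]          # first character of the file
    while r1 != EOT:             # sentinel test
        if r1 == r0:             # match test
            r2 += 1
        r3 += 1                  # advance pointer and get next character
        r1 = file_bytes[r3]
    assert r2 <= 9, "the text assumes at most nine occurrences"
    return chr(0x30 + r2)        # add the ASCII template x30 to the count


print(count_occurrences(b"banana\x04", "a"))
```

Adding x30 yields a valid digit only because the count is assumed to be at most nine; larger counts need the conversion routines that the text defers to Chapter 10.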
5.6 The Data Path Revisited
Before we leave Chapter 5, let us revisit the data path diagram that we first encountered in Chapter 3 (Figure 3.33). Now we are ready to examine all the structures that are needed to implement the LC-3 ISA. Many of them we have seen earlier in this chapter in Figures 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, and 5.11. We reproduce this diagram as Figure 5.18. Note at the outset that there are two kinds of arrows in the data path, those with arrowheads filled in, and those with arrowheads not filled in. Filled-in arrowheads designate information that is processed. Unfilled-in arrowheads designate control signals. Control signals emanate from the block labeled "Control." The connections from Control to most control signals have been left off Figure 5.18 to reduce unnecessary clutter in the diagram.
5.6.1 Basic Components of the Data Path
The Global Bus
You undoubtedly first notice the heavy black structure with arrowheads at both ends. This represents the data path's global bus. The LC-3 global bus consists of 16 wires and associated electronics. It allows one structure to transfer up to 16 bits of information to another structure by making the necessary electronic connections on the bus. Exactly one value can be transferred on the bus at one time. Note that each structure that supplies values to the bus has a triangle just
Figure 5.18 The data path of the LC-3
R1 < R2.
6.5 Which of the two algorithms for multiplying two numbers is preferable and why? 88 · 3 = 88 + 88 + 88 OR 3 + 3 + 3 + 3 + ... + 3?
6.6 Use your answers from Exercises 6.3 and 6.4 to develop a program that efficiently multiplies two integers and places the result in R3. Show the complete systematic decomposition, from the problem statement to the final program.
6.7 What does the following LC-3 program do?
x3001 1110 0000 0000 1100
x3002 1110 0010 0001 0000
x3003 0101 0100 1010 0000
x3004 0010 0100 0001 0011
x3005 0110 0110 0000 0000
x3006 0110 1000 0100 0000
x3007 0001 0110 1100 0100
x3008 0111 0110 0000 0000
x3009 0001 0000 0010 0001
x300A 0001 0010 0110 0001
x300B 0001 0100 1011 1111
x300C 0000 0011 1111 1000
x300D 1111 0000 0010 0101
x300E 0000 0000 0000 0101
x300F 0000 0000 0000 0100
x3010 0000 0000 0000 0011
x3011 0000 0000 0000 0110
x3012 0000 0000 0000 0010
x3013 0000 0000 0000 0100
x3014 0000 0000 0000 0111
x3015 0000 0000 0000 0110
x3016 0000 0000 0000 1000
x3017 0000 0000 0000 0111
x3018 0000 0000 0000 0101
6.8 Why is it necessary to initialize R2 in the character counting example in Section 6.1.4? In other words, in what manner might the program behave incorrectly if the R2 <- 0 step were removed from the routine?
6.9 Using the iteration construct, write an LC-3 machine language routine that displays exactly 100 Zs on the screen.
6.10 Using the conditional construct, write an LC-3 machine language routine that determines if a number stored in R2 is odd.
6.11 Write an LC-3 machine language routine to increment each of the numbers stored in memory location A through memory location B. Assume these locations have already been initialized with meaningful numbers. The addresses A and B can be found in memory locations x3100 and x3101.
6.12 a. Write an LC-3 machine language routine that echoes the last character typed at the keyboard. If the user types an R, the program then immediately outputs an R on the screen.
b. Expand the routine from part a such that it echoes a line at a time. For example, if the user types:
The quick brown fox jumps over the lazy dog.
then the program waits for the user to press the Enter key (the ASCII code for which is x0A) and then outputs the same line.
6.13 Notice that we can shift a number to the left by one bit position by adding it to itself. For example, when the binary number 0011 is added to itself, the result is 0110. Shifting a number one bit position to the right is not as easy. Devise a routine in LC-3 machine code to shift the contents of memory location x3100 to the right by one bit.
6.14 Consider the following machine language program:
x3000 0101 0100 1010 0000
x3001 0001 0010 0111 1111
x3002 0001 0010 0111 1111
x3003 0001 0010 0111 1111
x3004 0000 1000 0000 0010
x3005 0001 0100 1010 0001
x3006 0000 1111 1111 1010
x3007 1111 0000 0010 0101
What are the possible initial values of R1 that cause the final value in R2 to be 3?
6.15 Shown below are the contents of memory and registers before and after the LC-3 instruction at location x3010 is executed. Your job: Identify the instruction stored in x3010. Note: There is enough information below to uniquely specify the instruction at x3010.
         Before   After
R0:      x3208    x3208
R1:      x2d7c    x2d7c
R2:      xe373    xe373
R3:      x2053    x2053
R4:      x33ff    x33ff
R5:      x3f1f    x3f1f
R6:      xf4a2    xf4a2
R7:      x5220    x5220
...
x3400:   x3001    x3001
x3401:   x7a00    x7a00
x3402:   x7a2b    x7a2b
x3403:   xa700    xa700
x3404:   xf011    xf011
x3405:   x2003    x2003
x3406:   x31ba    xe373
x3407:   xc100    xc100
x3408:   xefef    xefef
...
6.16 An LC-3 program is located in memory locations x3000 to x3006. It starts executing at x3000. If we keep track of all values loaded into the MAR as the program executes, we will get a sequence that starts as follows. Such a sequence of values is referred to as a trace.
MAR Trace
x3000
x3005
x3001
x3002
x3006
x4001
x3003
x0021
We have shown below some of the bits stored in locations x3000 to x3006. Your job is to fill in each blank space with a 0 or a 1, as appropriate.
x3000 0 0 1 0 0 0 0
x3001 0 0 0 1 0 0 0
0 0 0 1 0 0
0 0 1
x3002 x3003 x3004 x3005 x3006
1 0 1 1 0 0 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 1 1
0 1 0 1 0 0 0 0
6.17 Shown below are the contents of registers before and after the LC-3 instruction at location x3210 is executed. Your job: Identify the instruction stored in x3210. Note: There is enough information below to uniquely specify the instruction at x3210.
       Before   After
R0:    xFF1D    xFF1D
R1:    x301C    x301C
R2:    x2F11    x2F11
R3:    x5321    x5321
R4:    x331F    x331F
R5:    x1F22    x1F22
R6:    x01FF    x01FF
R7:    x341F    x3211
PC:    x3210    x3220
N:     0        0
Z:     1        1
P:     0        0
6.18 The LC-3 has no Divide instruction. A programmer needing to divide two numbers would have to write a routine to handle it. Show the systematic decomposition of the process of dividing two positive integers. Write an LC-3 machine language program starting at location x3000 which divides the number in memory location x4000 by the number in memory location x4001 and stores the quotient at x5000 and the remainder at x5001.
6.19 It is often necessary to encrypt messages to keep them away from prying eyes. A message can be represented as a string of ASCII characters, one per memory location, in consecutive memory locations. Bits [15:8] of each location contain 0, and the location immediately following the string contains x0000.
A student who has not taken this course has written the following LC-3 machine language program to encrypt the message starting at location x4000 by adding 4 to each character and storing the resulting message at x5000. For example, if the message at x4000 is "Matt," then the encrypted message at x5000 is "Qexx." However, there are four bugs in his code. Find and correct these errors so that the program works correctly.
x3000 1110 0000 0000 1010
x3001 0010 0010 0000 1010
x3002 0110 0100 0000 0000
x3003 0000 0100 0000 0101
x3004 0001 0100 1010 0101
x3005 0111 0100 0100 0000
x3006 0001 0000 0010 0001
x3007 0001 0010 0110 0001
x3008 0000 1001 1111 1001
x3009 0110 0100 0100 0000
x300A 1111 0000 0010 0101
x300B 0100 0000 0000 0000
x300C 0101 0000 0000 0000
6.20 Redo Exercise 6.18 for all integers, not just positive integers.
chapter 7
Assembly Language
By now, you are probably a little tired of 1s and 0s and keeping track of 0001 meaning ADD and 1001 meaning NOT. Also, wouldn't it be nice if we could refer to a memory location by some meaningful symbolic name instead of memorizing its 16-bit address? And wouldn't it be nice if we could represent each instruction in some more easily comprehensible way, instead of having to keep track of which bit of the instruction conveys which individual piece of information about the instruction? It turns out that help is on the way.
In this chapter, we introduce assembly language, a mechanism that does all that, and more.
7.1 Assembly Language Programming - Moving Up a Level
Recall the levels of transformation identified in Figure 1.6 of Chapter 1. Algorithms are transformed into programs described in some mechanical language. This mechanical language can be, as it is in Chapter 5, the machine language of a particular computer. Recall that a program is in a computer's machine language if every instruction in the program is from the ISA of that computer.
On the other hand, the mechanical language can be more user-friendly. We generally partition mechanical languages into two classes, high-level and low-level. Of the two, high-level languages are much more user-friendly. Examples are C, C++, Java, Fortran, COBOL, Pascal, plus more than a thousand others. Instructions in a high-level language almost (but not quite) resemble statements in a natural language such as English. High-level languages tend to be ISA independent. That is, once you learn how to program in C (or Fortran or Pascal) for one ISA, it is a small step to write programs in C (or Fortran or Pascal) for another ISA.
Before a program written in a high-level language can be executed, it must be translated into a program in the ISA of the computer on which it is expected to execute. It is usually the case that each statement in the high-level language specifies several instructions in the ISA of the computer. In Chapter 11, we will introduce the high-level language C, and in Chapters 12 through 19, we will show the relationship between various statements in C and their corresponding translations in LC-3 code. In this chapter, however, we will only move up a small notch from the ISA we dealt with in Chapter 5.
A small step up from the ISA of a machine is that ISA's assembly language. Assembly language is a low-level language. There is no confusing an instruction in a low-level language with a statement in English. Each assembly language instruction usually specifies a single instruction in the ISA. Unlike high-level languages, which are usually ISA independent, low-level languages are very much ISA dependent. In fact, it is usually the case that each ISA has only one assembly language.
The purpose of assembly language is to make the programming process more user-friendly than programming in machine language (i.e., the ISA of the computer with which we are dealing), while still providing the programmer with detailed control over the instructions that the computer can execute. So, for example, while still retaining control over the detailed instructions the computer is to carry out, we are freed from having to remember what opcode is 0001 and what opcode is 1001, or what is being stored in memory location 0011111100001010 and what is being stored in location 0011111100000101. Assembly languages let us use mnemonic devices for opcodes, such as ADD and NOT, and they let us give meaningful symbolic names to memory locations, such as SUM or PRODUCT, rather than use their 16-bit addresses. This makes it easier to differentiate which memory location is keeping track of a SUM and which memory location is keeping track of a PRODUCT. We call these names symbolic addresses.
We will see, starting in Chapter 11, that when we take the larger step of moving up to a higher-level language (such as C), programming will be even more user-friendly, but we will relinquish control of exactly which detailed instructions are to be carried out on behalf of a high-level language statement.
7.2 An Assembly Language Program
We will begin our study of the LC-3 assembly language by means of an example. The program in Figure 7.1 multiplies the integer initially stored in NUMBER by 6 by adding the integer to itself six times. For example, if the integer is 123, the program computes the product by adding 123 + 123 + 123 + 123 + 123 + 123.
01  ;
02  ; Program to multiply an integer by the constant 6.
03  ; Before execution, an integer must be stored in NUMBER.
04  ;
05          .ORIG   x3050
06          LD      R1,SIX
07          LD      R2,NUMBER
08          AND     R3,R3,#0     ; Clear R3. It will
09                               ; contain the product.
0A  ; The inner loop
0B  ;
0C  AGAIN   ADD     R3,R3,R2
0D          ADD     R1,R1,#-1    ; R1 keeps track of
0E          BRp     AGAIN        ; the iterations
0F  ;
10          HALT
11  ;
12  NUMBER  .BLKW   1
13  SIX     .FILL   x0006
14  ;
15          .END
Figure 7.1 An assembly language program
The program consists of 21 lines of code. We have added a line number to each line of the program in order to be able to refer to individual lines easily. This is a common practice. These line numbers are not part of the program. Ten lines start with a semicolon, designating that they are strictly for the benefit of the human reader. More on this momentarily. Seven lines (06, 07, 08, 0C, 0D, 0E, and 10) specify assembly language instructions to be translated into machine language instructions of the LC-3, which will actually be carried out when the program runs. The remaining four lines (05, 12, 13, and 15) contain pseudo-ops, which are messages from the programmer to the translation program to help in the translation process. The translation program is called an assembler (in this case the LC-3 assembler), and the translation process is called assembly.
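To see what the seven instructions of Figure 7.1 accomplish, it can help to mirror them in a language where each step is easy to trace. The following is only a sketch in Python, under the assumption that ordinary unbounded integers stand in for the 16-bit registers (so overflow is ignored); the function name multiply_by_six is ours and is not part of the LC-3 program.

```python
def multiply_by_six(number: int) -> int:
    """Mirror the instructions of Figure 7.1, one Python statement each."""
    r1 = 6                # LD   R1,SIX      ; iteration counter
    r2 = number           # LD   R2,NUMBER   ; the integer to be multiplied
    r3 = 0                # AND  R3,R3,#0    ; clear the running product
    while True:
        r3 = r3 + r2      # AGAIN ADD R3,R3,R2
        r1 = r1 - 1       #       ADD R1,R1,#-1
        if r1 <= 0:       #       BRp AGAIN  ; loop only while R1 is positive
            break
    return r3             # the value in R3 when HALT is reached


print(multiply_by_six(123))  # 738
```

Note that the loop body executes before the counter is tested, just as in the assembly language program, where BRp examines the condition codes set by the decrement of R1.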
7.2.1 Instructions
Instead of an instruction being 16 0s and 1s, as is the case in the LC-3 ISA, an instruction in assembly language consists of four parts, as follows:
LABEL OPCODE OPERANDS ; COMMENTS
Two of the parts (LABEL and COMMENTS) are optional. More on that momentarily.
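The four-part format suggests how a translator might take a line apart. The sketch below, in Python, is one plausible way to do so and is not the LC-3 assembler's actual method. The mnemonic set is typed in by hand for this illustration (including the convenience names HALT and RET), the names parse_line and is_operation are hypothetical, and a .STRINGZ operand containing a comma or semicolon is deliberately not handled.

```python
# Assumption: mnemonics listed by hand for this sketch only.
OPCODES = {
    "ADD", "AND", "NOT", "LD", "LDI", "LDR", "LEA", "ST", "STI", "STR",
    "JMP", "JSR", "JSRR", "RET", "RTI", "TRAP", "HALT",
}
PSEUDO_OPS = {".ORIG", ".FILL", ".BLKW", ".STRINGZ", ".END"}
BRANCH_SUFFIXES = {"", "N", "Z", "P", "NZ", "NP", "ZP", "NZP"}


def is_operation(word: str) -> bool:
    """True if word is an opcode, a pseudo-op, or a BR variant such as BRnp."""
    upper = word.upper()
    if upper in OPCODES or upper in PSEUDO_OPS:
        return True
    return upper.startswith("BR") and upper[2:] in BRANCH_SUFFIXES


def parse_line(line: str) -> dict:
    """Split one source line into LABEL, OPCODE, OPERANDS, and COMMENTS."""
    code, _, comment = line.partition(";")   # everything after ';' is comment
    tokens = code.split()
    label = None
    if tokens and not is_operation(tokens[0]):
        label = tokens.pop(0)                # a leading non-opcode is a label
    opcode = tokens.pop(0) if tokens else None
    operands = "".join(tokens).split(",") if tokens else []
    return {"label": label, "opcode": opcode,
            "operands": operands, "comment": comment.strip()}


print(parse_line("AGAIN ADD R3,R3,R2"))
print(parse_line("      AND R3,R3,#0 ; Clear R3."))
```

The design choice worth noticing is that the label is recognized by elimination: the first word on a line is a label exactly when it is not a known operation, which is why LABEL can be optional without any special marker.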
Opcodes and Operands
Two of the parts (OPCODE and OPERANDS) are mandatory. An instruction must have an OPCODE (the thing the instruction is to do), and the appropriate number of OPERANDS (the things it is supposed to do it to). Not surprisingly, this was exactly what we encountered in Chapter 5 when we studied the LC-3 ISA.
The OPCODE is a symbolic name for the opcode of the corresponding LC-3 instruction. The idea is that it is easier to remember an operation by the symbolic name ADD, AND, or LDR than by the 4-bit quantity 0001, 0101, or 0110. Figure 5.3 (also Figure A.2) lists the OPCODES of the 15 LC-3 instructions. Pages 526 through 541 show the assembly language representations for the 15 LC-3 instructions.
The number of operands depends on the operation being performed. For example, the ADD instruction (line 0C) requires three operands (two sources to obtain the numbers to be added, and one destination to designate where the result is to be placed). All three operands must be explicitly identified in the instruction.
AGAIN ADD R3,R3,R2
The operands to be added are obtained from register 2 and from register 3. The result is to be placed in register 3. We represent each of the registers 0 through 7 as R0, R1, R2, ... , R7.
The LD instruction (line 07) requires two operands (the memory location from which the value is to be read and the destination register that is to contain the value after the instruction completes its execution). We will see momentarily that memory locations will be given symbolic addresses called labels. In this case, the location from which the value is to be read is given the label NUMBER. The destination into which the value is to be loaded is register 2.
LD R2,NUMBER
As we discussed in Section 5.1.6, operands can be obtained from registers, from memory, or they may be literal (i.e., immediate) values in the instruction. In the case of register operands, the registers are explicitly represented (such as R2 and R3 in line 0C). In the case of memory operands, the symbolic name of the memory location is explicitly represented (such as NUMBER in line 07 and SIX in line 06). In the case of immediate operands, the actual value is explicitly represented (such as the value 0 in line 08).
AND R3, R3, #0 ; Clear R3. It will contain the product.
A literal value must contain a symbol identifying the representation base of the number. We use # for decimal, x for hexadecimal, and b for binary. Sometimes there is no ambiguity, such as in the case 3F0A, which is a hex number. Nonetheless, we write it as x3F0A. Sometimes there is ambiguity, such as in the case 1000. x1000 represents the decimal number 4096, b1000 represents the decimal number 8, and #1000 represents the decimal number 1000.
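The rule for literal values can be checked with a few lines of Python. This is a minimal sketch, assuming only the three prefixes named above; the function name parse_literal is hypothetical, and the sketch makes no attempt to check that the value fits in the field of the instruction that uses it.

```python
def parse_literal(text: str) -> int:
    """Convert an LC-3 style literal (#, x, or b prefix) to an integer."""
    bases = {"#": 10, "x": 16, "b": 2}
    prefix, digits = text[:1], text[1:]
    if prefix not in bases or not digits:
        raise ValueError(f"literal needs a #, x, or b prefix: {text!r}")
    return int(digits, bases[prefix])


for literal in ("x1000", "b1000", "#1000"):
    print(literal, "=", parse_literal(literal))
```

Because the base is taken from the prefix rather than guessed from the digits, the ambiguous string 1000 is rejected outright instead of being silently read in some default base.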
Labels
Labels are symbolic names that are used to identify memory locations that are referred to explicitly in the program. In LC-3 assembly language, a label consists of from one to 20 alphanumeric characters (i.e., a capital or lowercase letter of the alphabet, or a decimal digit), starting with a letter of the alphabet. NOW, Under21, R2D2, and C3PO are all examples of possible LC-3 assembly language labels.
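The rule for forming labels is precise enough to state as a pattern. The following Python sketch encodes only what is said above (a letter, followed by up to 19 more letters or digits); the name is_valid_label is ours, and a real assembler may impose further restrictions, such as rejecting names that collide with opcodes, which this sketch does not attempt.

```python
import re

# One letter, then 0 to 19 further letters or digits: 1 to 20 characters total.
LABEL_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]{0,19}")


def is_valid_label(name: str) -> bool:
    """Check a candidate label against the stated LC-3 naming rule."""
    return LABEL_PATTERN.fullmatch(name) is not None


for name in ("NOW", "Under21", "R2D2", "C3PO", "2FAST"):
    print(name, is_valid_label(name))
```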
There are two reasons for explicitly referring to a memory location.
1. The location contains the target of a branch instruction (for example, AGAIN in line 0C).
2. The location contains a value that is loaded or stored (for example, NUMBER, line 12, and SIX, line 13).
The location AGAIN is specifically referenced by the branch instruction in line 0E.
BRp AGAIN
If the result of ADD R1,R1,#-1 is positive (as evidenced by the P condition code being set), then the program branches to the location explicitly referenced as AGAIN to perform another iteration.
The location NUMBER is specifically referenced by the load instruction in line 07. The value stored in the memory location explicitly referenced as NUMBER is loaded into R2.
If a location in the program is not explicitly referenced, then there is no need to give it a label.
Comments
Comments are messages intended only for human consumption. They have no effect on the translation process and indeed are not acted on by the LC-3 assembler. They are identified in the program by semicolons. A semicolon signifies that the rest of the line is a comment and is to be ignored by the assembler. If the semicolon is the first nonblank character on the line, the entire line is ignored. If the semicolon follows the operands of an instruction, then only the comment is ignored by the assembler.
The purpose of comments is to make the program more comprehensible to the human reader. They help explain a nonintuitive aspect of an instruction or a set of instructions. In lines 08 and 09, the comment "Clear R3; it will contain the product" lets the reader know that the instruction on line 08 is initializing R3 prior to accumulating the product of the two numbers. While the purpose of line 08 may be obvious to the programmer today, it may not be the case two years from now, after the programmer has written an additional 30,000 lines of code and cannot remember why he/she wrote AND R3,R3,#0. It may also be the case that two years from now, the programmer no longer works for the company and the company needs to modify the program in response to a product update. If the task is assigned to someone who has never seen the code before, comments go a long way toward improving comprehension.
It is important to make comments that provide additional insight and not just restate the obvious. There are two reasons for this. First, comments that restate the obvious are a waste of everyone's time. Second, they tend to obscure the comments that say something important because they add clutter to the program. For example, in line 0D, the comment "Decrement R1" would be a bad idea. It would provide no additional insight to the instruction, and it would add clutter to the page.
Another purpose of comments, and also the judicious use of extra blank spaces to a line, is to make the visual presentation of a program easier to understand. So, for example, comments are used to separate pieces of the program from each other to make the program more readable. That is, lines of code that work together to compute a single result are placed on successive lines, while pieces of a program that produce separate results are separated from each other. For example, note that lines 0C through 0E are separated from the rest of the code by lines 0B and 0F. There is nothing on lines 0B and 0F other than the semicolons.
Extra spaces that are ignored by the assembler provide an opportunity to align elements of a program for easier readability. For example, all the opcodes start in the same column on the page.
7.2.2 Pseudo-ops (Assembler Directives)
The LC-3 assembler is a program that takes as input a string of characters representing a computer program written in LC-3 assembly language and translates it into a program in the ISA of the LC-3. Pseudo-ops are helpful to the assembler in performing that task.
Actually, a more formal name for a pseudo-op is assembler directive. They are called pseudo-ops because they do not refer to operations that will be performed by the program during execution. Rather, the pseudo-op is strictly a message to the assembler to help the assembler in the assembly process. Once the assembler handles the message, the pseudo-op is discarded. The LC-3 assembler contains five pseudo-ops: .ORIG, .FILL, .BLKW, .STRINGZ, and .END. All are easily recognizable by the dot as their first character.
.ORIG
.ORIG tells the assembler where in memory to place the LC-3 program. In line 05, .ORIG x3050 says, start with location x3050. As a result, the LD R1,SIX instruction will be put in location x3050.
.FILL
.FILL tells the assembler to set aside the next location in the program and initialize it with the value of the operand. In line 13, the ninth location in the resultant LC-3 program is initialized to the value x0006.
.BLKW
.BLKW tells the assembler to set aside some number of sequential memory locations (i.e., a BLocK of Words) in the program. The actual number is the operand of the .BLKW pseudo-op. In line 12, the pseudo-op instructs the assembler to set aside one location in memory (and also to label it NUMBER, incidentally).
The pseudo-op .BLKW is particularly useful when the actual value of the operand is not yet known. For example, one might want to set aside a location in memory for storing a character input from a keyboard. It will not be until the program is run that we will know the identity of that keystroke.
.STRINGZ
.STRINGZ tells the assembler to initialize a sequence of n + 1 memory locations. The argument is a sequence of n characters, inside double quotation marks. The first n words of memory are initialized with the zero-extended ASCII codes of the corresponding characters in the string. The final word of memory is initialized to 0. The last character, x0000, provides a convenient sentinel for processing the string of ASCII codes.
For example, the code fragment
.ORIG x3010
HELLO .STRINGZ "Hello, World!"
would result in the assembler initializing locations x3010 through x301D to the following values:
x3010: x0048
x3011: x0065
x3012: x006C
x3013: x006C
x3014: x006F
x3015: x002C
x3016: x0020
x3017: x0057
x3018: x006F
x3019: x0072
x301A: x006C
x301B: x0064
x301C: x0021
x301D: x0000
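The values that .STRINGZ places in memory can be reproduced mechanically. The Python sketch below assumes the string contains only ASCII characters and that the starting address is the one given by .ORIG; the function name stringz_words is hypothetical.

```python
def stringz_words(start: int, text: str) -> list:
    """Return (address, word) pairs for a .STRINGZ starting at address start."""
    words = [ord(ch) for ch in text]   # zero-extended ASCII code per character
    words.append(0)                    # the x0000 sentinel that ends the string
    return [(start + i, word) for i, word in enumerate(words)]


for address, word in stringz_words(0x3010, "Hello, World!"):
    print(f"x{address:04X}: x{word:04X}")
```

The 13 characters of the string plus the sentinel account for the 14 locations x3010 through x301D.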
.END
.END tells the assembler where the program ends. Any characters that come after .END will not be used by the assembler. Note: .END does not stop the program during execution. In fact, .END does not even exist at the time of execution. It is simply a delimiter; it marks the end of the source program.
7.2.3 Example: The Character Count Example of Section 5.5, Revisited
Now we are ready for a complete example. Let's consider again the problem of Section 5.5. We wish to write a program that will take a character that is input from the keyboard and a file and count the number of occurrences of that character in that file. As before, we first develop the algorithm by constructing the flowchart. Recall that in Section 6.1, we showed how to decompose the problem systematically so as to generate the flowchart of Figure 5.16. In fact, the final step of that process in Chapter 6 is the flowchart of Figure 6.3e, which is essentially identical to Figure 5.16. Next, we use the flowchart to write the actual program. This time, however, we enjoy the luxury of not worrying about 0s and 1s and instead write the program in LC-3 assembly language. The program is shown in Figure 7.2.
01  ;
02  ; Program to count occurrences of a character in a file.
03  ; Character to be input from the keyboard.
04  ; Result to be displayed on the monitor.
05  ; Program works only if no more than 9 occurrences are found.
06  ;
07  ;
08  ; Initialization
09  ;
0A          .ORIG   x3000
0B          AND     R2,R2,#0     ; R2 is counter, initialize to 0
0C          LD      R3,PTR       ; R3 is pointer to characters
0D          TRAP    x23          ; R0 gets character input
0E          LDR     R1,R3,#0     ; R1 gets the next character
0F  ;
10  ; Test character for end of file
11  ;
13  TEST    ADD     R4,R1,#-4    ; Test for EOT
14          BRz     OUTPUT       ; If done, prepare the output
15  ;
16  ; Test character for match. If a match, increment count.
17  ;
18          NOT     R1,R1
19          ADD     R1,R1,R0     ; If match, R1 = xFFFF
1A          NOT     R1,R1        ; If match, R1 = x0000
1B          BRnp    GETCHAR      ; If no match, do not increment
1C          ADD     R2,R2,#1
1D  ;
1E  ; Get next character from the file
1F  ;
20  GETCHAR ADD     R3,R3,#1     ; Increment the pointer
21          LDR     R1,R3,#0     ; R1 gets the next character to test
22          BRnzp   TEST
23  ;
24  ; Output the count.
25  ;
26  OUTPUT  LD      R0,ASCII     ; Load the ASCII template
27          ADD     R0,R0,R2     ; Convert binary to ASCII
28          TRAP    x21          ; ASCII code in R0 is displayed
29          TRAP    x25          ; Halt machine
2A  ;
2B  ; Storage for pointer and ASCII template
2C  ;
2D  ASCII   .FILL   x0030
2E  PTR     .FILL   x4000
2F          .END
Figure 7.2 The assembly language program to count occurrences of a character
A few notes regarding this program:
Three times during this program, assistance in the form of a service call is required of the operating system. In each case, a TRAP instruction is used. TRAP x23 causes a character to be input from the keyboard and placed in R0 (line 0D). TRAP x21 causes the ASCII code in R0 to be displayed on the monitor (line 28). TRAP x25 causes the machine to be halted (line 29). As we said before, we will leave the details of how the TRAP instruction is carried out until Chapter 9.
The ASCII codes for the decimal digits 0 to 9 (0000 to 1001) are x30 to x39. The conversion from binary to ASCII is done simply by adding x30 to the binary value of the decimal digit. Line 2D shows the label ASCII used to identify the memory location containing x0030.
The file that is to be examined starts at address x4000 (see line 2E). Usually, this starting address would not be known to the programmer who is writing this program since we would want the program to work on files that will become available in the future. That situation will be discussed in Section 7.4.
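Two steps of Figure 7.2 reward a closer look: the NOT, ADD, NOT sequence that tests for a match, and the addition of x30 that converts the count to ASCII. The Python sketch below mimics both on 16-bit values. It is an illustration under our own assumptions: the helper names are hypothetical, the file is modeled as a list of words ending with the EOT code x04, and the keyboard and monitor TRAPs are replaced by a function argument and a return value.

```python
MASK = 0xFFFF                      # registers hold 16 bits


def not16(value: int) -> int:
    return ~value & MASK           # bitwise complement, kept to 16 bits


def add16(a: int, b: int) -> int:
    return (a + b) & MASK          # 16-bit addition, carry out discarded


def is_match(r1: int, r0: int) -> bool:
    r1 = not16(r1)                 # NOT R1,R1
    r1 = add16(r1, r0)             # ADD R1,R1,R0 ; xFFFF if match
    r1 = not16(r1)                 # NOT R1,R1    ; x0000 if match
    return r1 == 0                 # BRnp GETCHAR is not taken when zero


def count_occurrences(target: str, file_words: list) -> int:
    count = 0                      # R2
    for word in file_words:
        if word == 0x04:           # ADD R4,R1,#-4 yields zero on EOT
            break
        if is_match(word, ord(target)):
            count += 1
    return count


def count_to_ascii(count: int) -> str:
    if not 0 <= count <= 9:        # the program works only for 0 to 9
        raise ValueError("count must be a single decimal digit")
    return chr(0x30 + count)       # ADD R0,R0,R2 with R0 = x0030


text = [ord(ch) for ch in "banana"] + [0x04]
print(count_to_ascii(count_occurrences("a", text)))  # 3
```

The three-instruction test works because NOT(NOT(R1) + R0) equals R1 - R0 in 2's complement arithmetic, which is zero exactly when the two characters are the same.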
7.3 The Assembly Process
7.3.1 Introduction
Before an LC-3 assembly language program can be executed, it must first be translated into a machine language program, that is, one in which each instruction is in the LC-3 ISA. It is the job of the LC-3 assembler to perform that translation.
If you have available an LC-3 assembler, you can cause it to translate your assembly language program into a machine language program by executing an appropriate command. In the LC-3 assembler that is generally available via the Web, that command is assemble and requires as an argument the filename of your assembly language program. For example, if the filename is solution1.asm, then
assemble solution1.asm outfile
produces the file outfile, which is in the ISA of the LC-3. It is necessary to check with your instructor for the correct command line to cause the LC-3 assembler to produce a file of 0s and 1s in the ISA of the LC-3.
7.3.2 A Two-Pass Process
In this section, we will see how the assembler goes through the process of translating an assembly language program into a machine language program. We will use as our input to the process the assembly language program of Figure 7.2.
You remember that there is in general a one-to-one correspondence between instructions in an assembly language program and instructions in the final machine language program. We could try to perform this translation in one pass through the assembly language program. Starting from the top of Figure 7.2, the assembler discards lines 01 to 09, since they contain only comments. Comments are strictly for human consumption; they have no bearing on the translation process. The assembler then moves on to line 0A. Line 0A is a pseudo-op; it tells the assembler that the machine language program is to start at location x3000. The assembler then moves on to line 0B, which it can easily translate into LC-3 machine code.
At this point, we have
x3000: 0101010010100000
The LC-3 assembler moves on to translate the next instruction (line 0C). Unfortunately, it is unable to do so since it does not know the meaning of the symbolic address PTR. At this point the assembler is stuck, and the assembly process fails.
To prevent this from occurring, the assembly process is done in two complete passes (from beginning to .END) through the entire assembly language program. The objective of the first pass is to identify the actual binary addresses corresponding to the symbolic names (or labels). This set of correspondences is known as the symbol table. In pass 1, we construct the symbol table. In pass 2, we translate the individual assembly language instructions into their corresponding machine language instructions.
Thus, when the assembler examines line 0C for the purpose of translating
LD R3,PTR
during the second pass, it already knows the correspondence between PTR and x3013 (from the first pass). Thus it can easily translate line 0C to
x3001: 0010011000010001
The problem of not knowing the 16-bit address corresponding to PTR no longer exists.
7.3.3 The First Pass: Creating the Symbol Table
For our purposes, the symbol table is simply a correspondence of symbolic names with their 16-bit memory addresses. We obtain these correspondences by passing through the assembly language program once, noting which instruction is assigned to which address, and identifying each label with the address of its assigned entry.
Recall that we provide labels in those cases where we have to refer to a location, either because it is the target of a branch instruction or because it contains data that must be loaded or stored. Consequently, if we have not made any programming mistakes, and if we identify all the labels, we will have identified all the symbolic addresses used in the program.
The preceding paragraph assumes that our entire program exists between our .ORIG and .END pseudo-ops. This is true for the assembly language program of Figure 7.2. In Section 7.4, we will consider programs that consist of multiple parts, each with its own .ORIG and .END, wherein each part is assembled separately.
The first pass starts, after discarding the comments on lines 01 to 09, by noting (line 0A) that the first instruction will be assigned to address x3000. We keep track of the location assigned to each instruction by means of a location counter (LC). The LC is initialized to the address specified in .ORIG, that is, x3000.
The assembler examines each instruction in sequence and increments the LC once for each assembly language instruction. If the instruction examined contains a label, a symbol table entry is made for that label, specifying the current contents of LC as its address. The first pass terminates when the .END instruction is encountered.
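The bookkeeping of the first pass can be sketched in a few lines of Python. This is an illustration, not the LC-3 assembler itself: we assume that comments have already been discarded and that each remaining line has been split into a (label, opcode, operand) triple, with None standing for a missing label. The names first_pass and FIGURE_7_2 are ours, and the handling of .BLKW and .STRINGZ (which occupy more than one location) follows their descriptions in Section 7.2.2.

```python
def first_pass(statements):
    """Build the symbol table by tracking the location counter (LC)."""
    symbol_table = {}
    lc = 0
    for label, opcode, operand in statements:
        if opcode == ".ORIG":
            lc = int(operand[1:], 16)      # operand such as "x3000"
            continue
        if opcode == ".END":
            break                          # the first pass terminates here
        if label is not None:
            symbol_table[label] = lc       # label names the current location
        if opcode == ".BLKW":
            lc += int(operand.lstrip("#")) # assumed decimal count of words
        elif opcode == ".STRINGZ":
            lc += len(operand) + 1         # characters plus the x0000 sentinel
        else:
            lc += 1                        # instructions and .FILL use one word
    return symbol_table


# The statements of Figure 7.2 with comments removed (assumed pre-parsed).
FIGURE_7_2 = [
    (None, ".ORIG", "x3000"),
    (None, "AND", "R2,R2,#0"),
    (None, "LD", "R3,PTR"),
    (None, "TRAP", "x23"),
    (None, "LDR", "R1,R3,#0"),
    ("TEST", "ADD", "R4,R1,#-4"),
    (None, "BRz", "OUTPUT"),
    (None, "NOT", "R1,R1"),
    (None, "ADD", "R1,R1,R0"),
    (None, "NOT", "R1,R1"),
    (None, "BRnp", "GETCHAR"),
    (None, "ADD", "R2,R2,#1"),
    ("GETCHAR", "ADD", "R3,R3,#1"),
    (None, "LDR", "R1,R3,#0"),
    (None, "BRnzp", "TEST"),
    ("OUTPUT", "LD", "R0,ASCII"),
    (None, "ADD", "R0,R0,R2"),
    (None, "TRAP", "x21"),
    (None, "TRAP", "x25"),
    ("ASCII", ".FILL", "x0030"),
    ("PTR", ".FILL", "x4000"),
    (None, ".END", ""),
]

for symbol, address in first_pass(FIGURE_7_2).items():
    print(f"{symbol:8s} x{address:04X}")
```

Notice that the first pass never looks at the operands of an instruction; it needs only to know how many locations each line occupies.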
The first instruction that has a label is at line 13. Since it is the fifth instruction in the program and since the LC at that point contains x3004, a symbol table entry is constructed thus:
Symbol    Address
TEST      x3004
The second instruction that has a label is at line 20. At this point, the LC has been incremented to x300B. Thus a symbol table entry is constructed, as follows:
Symbol    Address
GETCHAR   x300B
At the conclusion of the first pass, the symbol table has the following entries:
Symbol    Address
TEST      x3004
GETCHAR   x300B
OUTPUT    x300E
ASCII     x3012
PTR       x3013
7.3.4 The Second Pass: Generating the Machine Language Program
The second pass consists of going through the assembly language program a second time, line by line, this time with the help of the symbol table. At each line, the assembly language instruction is translated into an LC-3 machine language instruction.
Starting again at the top, the assembler again discards lines 01 through 09 because they contain only comments. Line 0A is the .ORIG pseudo-op, which the assembler uses to initialize LC to x3000. The assembler moves on to line 0B and produces the machine language instruction 0101010010100000. Then the assembler moves on to line 0C.
This time, when the assembler gets to line 0C, it can completely assemble the instruction since it knows that PTR corresponds to x3013. The instruction is LD, which has an opcode encoding of 0010. The destination register (DR) is R3, that is, 011.
PCoffset is computed as follows: We know that PTR is the label for address x3013, and that the incremented PC is LC+1, in this case x3002. Since PTR (x3013) must be the sum of the incremented PC (x3002) and the sign-extended PCoffset, PCoffset must be x0011. Putting this all together, x3001 is set to 0010011000010001, and the LC is incremented to x3002.
Note: In order to use the LD instruction, it is necessary that the source of the load, in this case the address whose label is PTR, is not more than +256 or -255 memory locations from the LD instruction itself. If the address of PTR had been greater than LC+1+255 or less than LC+1-256, then the offset would not fit in bits [8:0] of the instruction. In such a case, an assembly error would have occurred, preventing the assembly process from finishing successfully. Fortunately, PTR is close enough to the LD instruction, so the instruction assembled correctly.
The second pass continues. At each step, the LC is incremented and the location specified by LC is assigned the translated LC-3 instruction or, in the case of .FILL, the value specified. When the second pass encounters the .END instruction, assembly terminates.
The resulting translated program is shown in Figure 7.3.
Figure 7.3 The machine language program for the assembly language program of Figure 7.2
Address   Binary
          0011000000000000
x3000     0101010010100000
x3001     0010011000010001
x3002     1111000000100011
x3003     0110001011000000
x3004     0001100001111100
x3005     0000010000001000
x3006     1001001001111111
x3007     0001001001000000
x3008     1001001001111111
x3009     0000101000000001
x300A     0001010010100001
x300B     0001011011100001
x300C     0110001011000000
x300D     0000111111110110
x300E     0010000000000011
x300F     0001000000000010
x3010     1111000000100001
x3011     1111000000100101
x3012     0000000000110000
x3013     0100000000000000
That process was, on a good day, merely tedious. Fortunately, you do not have to do it for a living; the LC-3 assembler does that. And, since you now know LC-3 assembly language, there is no need to program in machine language. Now we can write our programs symbolically in LC-3 assembly language and invoke the LC-3 assembler to create the machine language versions that can execute on an LC-3 computer.
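The PCoffset calculation that the second pass performed for the LD instruction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LC-3 assembler itself: the function name `encode_ld` and its parameters are introduced here for the example, and the sketch handles the LD instruction only.

```python
def encode_ld(dr, label_addr, lc):
    """Return the 16-bit LD encoding as a bit string (hypothetical helper).

    dr         -- destination register number, 0..7
    label_addr -- address found in the symbol table for the operand label
    lc         -- location counter: the address of the LD instruction itself
    """
    offset = label_addr - (lc + 1)      # relative to the incremented PC
    if not -256 <= offset <= 255:       # must fit in bits [8:0]
        raise ValueError("label is too far away for a 9-bit PCoffset")
    # opcode 0010 in bits [15:12], DR in bits [11:9], PCoffset in bits [8:0]
    word = (0b0010 << 12) | (dr << 9) | (offset & 0x1FF)
    return format(word, "016b")


# LD R3, PTR assembled at x3001, with PTR at x3013 in the symbol table
print(encode_ld(3, 0x3013, 0x3001))
```

The range check corresponds to the assembly error described in the note above: a label outside the reach of the 9-bit offset cannot be encoded, so the sketch raises an exception instead of producing a word.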
7.4 Beyond the Assembly of a Single Assembly Language Program
Our purpose in this chapter has been to take you up one more notch from the ISA of the computer and introduce assembly language. Although it is still quite a large step from C or C++, assembly language does, in fact, save us a good deal of pain. We have also shown how a rudimentary two-pass assembler actually works to translate an assembly language program into the machine language of the LC-3 ISA.
There are many more aspects to sophisticated assembly language programming that go well beyond an introductory course. However, our reason for teaching assembly language is not to deal with its sophistication, but rather to show its innate simplicity. Before we leave this chapter, however, there are a few additional highlights we should explore.
7.4.1 The Executable Image
When a computer begins execution of a program, the entity being executed is called an executable image. The executable image is created from modules often created independently by several different programmers. Each module is translated separately into an object file. We have just gone through the process of performing that translation ourselves by mimicking the LC-3 assembler. Other modules, some written in C perhaps, are translated by the C compiler. Some modules are written by users, and some modules are supplied as library routines by the operating system. Each object file consists of instructions in the ISA of the computer being used, along with its associated data. The final step is to link all the object modules together into one executable image. During execution of the program, the FETCH, DECODE, ... instruction cycle is applied to instructions in the executable image.
7.4.2 More than One Object File
It is very common to form an executable image from more than one object file. In fact, in the real world, where most programs invoke libraries provided by the operating system as well as modules generated by other programmers, it is much more common to have multiple object files than a single one.
A case in point is our example character count program. The program counts the number of occurrences of a character in a file. A typical application could easily have the program as one module and the input data file as another. If this were the case, then the starting address of the file, shown as x4000 in line 2E of Figure 7.2, would not be known when the program was written. If we replace line 2E with
PTR .FILL STARTofFILE
then the program of Figure 7.2 will not assemble because there will be no symbol table entry for STARTofFILE. What can we do?
If the LC-3 assembly language, on the other hand, contained the pseudo-op .EXTERNAL, we could identify STARTofFILE as the symbolic name of an address that is not known at the time the program of Figure 7.2 is assembled. This would be done by the following line
.EXTERNAL STARTofFILE,
which would send a message to the LC-3 assembler that the absence of label STARTofFILE is not an error in the program. Rather, STARTofFILE is a label in some other module that will be translated independently. In fact, in our case, it will be the label of the location of the first character in the file to be examined by our character count program.
If the LC-3 assembly language had the pseudo-op .EXTERNAL, and if we had designated STARTofFILE as .EXTERNAL, the LC-3 assembler would be able to create a symbol table entry for STARTofFILE, and instead of assigning it an address, it would mark the symbol as belonging to another module. At link time, when all the modules are combined, the linker (the program that manages the "combining" process) would use the symbol table entry for STARTofFILE in another module to complete the translation of our revised line 2E.
In this way, the .EXTERNAL pseudo-op allows references by one module to symbolic locations in another module without a problem. The proper translations are resolved by the linker.
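What the linker does with such a symbol can be sketched as follows. This is a toy model, not an LC-3 tool: the object-module layout (dictionaries named `exports`, `image`, and `external`) and the function `link` are assumptions made up for this illustration, and the data word x0041 is an arbitrary placeholder for the first character of the file.

```python
def link(modules):
    """Combine modules and fill in words that refer to external symbols.

    Each module is a dict in a made-up format:
      "exports":  {label: address} that the module makes visible to others
      "image":    {address: 16-bit word} produced when it was translated
      "external": {address: label} words left for the linker to complete
    """
    # Gather every exported label into one table.
    table = {}
    for module in modules:
        table.update(module["exports"])

    # Copy each module's words, then patch the external references.
    image = {}
    for module in modules:
        image.update(module["image"])
        for address, label in module["external"].items():
            if label not in table:
                raise KeyError(f"unresolved external symbol {label}")
            image[address] = table[label]
    return image


# The character count program: PTR (at x3013) refers to STARTofFILE.
counter = {"exports": {}, "image": {0x3013: 0x0000},
           "external": {0x3013: "STARTofFILE"}}
# The data module: the file begins at x4000 (x0041 is a placeholder value).
data = {"exports": {"STARTofFILE": 0x4000}, "image": {0x4000: 0x0041},
        "external": {}}

image = link([counter, data])
print(hex(image[0x3013]))
```

Only labels a module chooses to export enter the shared table in this sketch, so labels that are private to a module never meet the labels of another module.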
Exercises
7.1 An assembly language program contains the following two instructions. The assembler puts the translated version of the LDI instruction that follows into location x3025 of the object module. After assembly is complete, what is in location x3025?
    PLACE   .FILL x45A7
            LDI   R3, PLACE
7.2 An LC-3 assembly language program contains the instruction:
    ASCII   LD    R1, ASCII
The symbol table entry for ASCII is x4F08. If this instruction is executed during the running of the program, what will be contained in R1 immediately after the instruction is executed?
7.3 What is the problem with using the string AND as a label?
7.4 Create the symbol table entries generated by the assembler when translating the following routine into machine code:
            .ORIG x301C
            ST    R3, SAVE3
            ST    R2, SAVE2
            AND   R2, R2, #0
    TEST    IN
            BRz   TEST
            ADD   R1, R0, #-10
            BRn   FINISH
            ADD   R1, R0, #-15
            NOT   R1, R1
            BRn   FINISH
            HALT
    FINISH  ADD   R2, R2, #1
            HALT
    SAVE3   .FILL x0000
    SAVE2   .FILL x0000
            .END
7.5 a. What does the following program do?
            .ORIG x3000
            LD    R2, ZERO
            LD    R0, M0
            LD    R1, M1
    LOOP    BRz   DONE
            ADD   R2, R2, R0
            ADD   R1, R1, -1
            BR    LOOP
    DONE    ST    R2, RESULT
            HALT
    RESULT  .FILL x0000
    ZERO    .FILL x0000
    M0      .FILL x0004
    M1      .FILL x0803
            .END
    b. What value will be contained in RESULT after the program runs to completion?
7.6 Our assembler has crashed and we need your help! Create a symbol table and assemble the instructions at labels D, E, and F for the program below. You may assume another module deposits a positive value into A before this module executes.
            .ORIG x3000
            AND   R0, R0, #0
    D       LD    R1, A
            AND   R2, R1, #1
            BRp   B
    E       ADD   R1, R1, #-1
    B       ADD   R0, R0, R1
            ADD   R1, R1, #-2
    F       BRp   B
            ST    R0, C
            TRAP  x25
    A       .BLKW 1
    C       .BLKW 1
            .END
In no more than 15 words, what does the above program do?
7.7 Write an LC-3 assembly language program that counts the number of 1s in the value stored in R0 and stores the result into R1. For example, if R0 contains 0001001101110000, then after the program executes, the result stored in R1 would be 0000 0000 0000 0110.
7.8 An engineer is in the process of debugging a program she has written. She is looking at the following segment of the program, and decides to place a breakpoint in memory at location 0xA404. Starting with the PC = 0xA400, she initializes all the registers to zero and runs the program until the breakpoint is encountered.
    Code Segment:
    0xA400  THIS1   LEA   R0, THIS1
    0xA401  THIS2   LD    R1, THIS2
    0xA402  THIS3   LDI   R2, THIS5
    0xA403  THIS4   LDR   R3, R0, #2
    0xA404  THIS5   .FILL xA400
Show the contents of the register file (in hexadecimal) when the breakpoint is encountered.
7.9 What is the purpose of the .END pseudo-op? How does it differ from the HALT instruction?
7.10 The following program fragment has an error in it. Identify the error and explain how to fix it.
            ADD   R3, R3, #30
            ST    R3, A
            HALT
    A       .FILL #0
Will this error be detected when this code is assembled or when this code is run on the LC-3?
7.11 The LC-3 assembler must be able to convert constants represented in ASCII into their appropriate binary values. For instance, x2A translates into 00101010 and #12 translates into 00001100. Write an LC-3 assembly language program that reads a decimal or hexadecimal constant from the keyboard (i.e., it is preceded by a # character signifying it is a decimal, or x signifying it is hex) and prints out the binary representation. Assume the constants can be expressed with no more than two decimal or hex digits.
7.12 What does the following LC-3 program do?
            .ORIG x3000
            AND   R5, R5, #0
            AND   R3, R3, #0
            ADD   R3, R3, #8
            LDI   R1, A
            ADD   R2, R1, #0
    AG      ADD   R2, R2, R2
            ADD   R3, R3, #-1
            BRnp  AG
            LD    R4, B
            AND   R1, R1, R4
            NOT   R1, R1
            ADD   R1, R1, #1
            ADD   R2, R2, R1
            BRnp  NO
            ADD   R5, R5, #1
    NO      HALT
    B       .FILL xFF00
    A       .FILL x4000
            .END
7.13 The following program adds the values stored in memory locations A, B, and C, and stores the result into memory. There are two errors in the code. For each, describe the error and indicate whether it will be detected at assembly time or at run time.
    Line No.
    1               .ORIG x3000
    2       ONE     LD    R0, A
    3               ADD   R1, R1, R0
    4       TWO     LD    R0, B
    5               ADD   R1, R1, R0
    6       THREE   LD    R0, C
    7               ADD   R1, R1, R0
    8               ST    R1, SUM
    9               TRAP  x25
    10      A       .FILL x0001
    11      B       .FILL x0002
    12      C       .FILL x0003
    13      D       .FILL x0004
    14              .END
7.14 a. Assemble the following program:
            .ORIG x3000
            STI   R0, LABEL
            OUT
            HALT
    LABEL   .STRINGZ "%"
            .END
    b. The programmer intended the program to output a % to the monitor, and then halt. Unfortunately, the programmer got confused about the semantics of each of the opcodes (that is, exactly what function is carried out by the LC-3 in response to each opcode). Replace exactly one opcode in this program with the correct opcode to make the program work as intended.
    c. The original program from part a was executed. However, execution exhibited some very strange behavior. The strange behavior was in part due to the programming error, and in part due to the fact that the value in R0 when the program started executing was x3000. Explain what the strange behavior was and why the program behaved that way.
7.15 The following is an LC-3 program that performs a function. Assume a sequence of integers is stored in consecutive memory locations, one integer per memory location, starting at the location x4000. The sequence terminates with the value x0000. What does the following program do?
            .ORIG x3000
            LD    R0, NUMBERS
            LD    R2, MASK
    LOOP    LDR   R1, R0, #0
            BRz   DONE
            AND   R5, R1, R2
            BRz   L1
            BRnzp NEXT
    L1      ADD   R1, R1, R1
            STR   R1, R0, #0
    NEXT    ADD   R0, R0, #1
            BRnzp LOOP
    DONE    HALT
    NUMBERS .FILL x4000
    MASK    .FILL x8000
            .END
7.16 Assume a sequence of nonnegative integers is stored in consecutive memory locations, one integer per memory location, starting at location x4000. Each integer has a value between 0 and 30,000 (decimal). The sequence terminates with the value -1 (i.e., xFFFF).
What does the following program do?
            .ORIG x3000
            AND   R4, R4, #0
            AND   R3, R3, #0
            LD    R0, NUMBERS
    LOOP    LDR   R1, R0, #0
            NOT   R2, R1
            BRz   DONE
            AND   R2, R1, #1
            BRz   L1
            ADD   R4, R4, #1
            BRnzp NEXT
    L1      ADD   R3, R3, #1
    NEXT    ADD   R0, R0, #1
            BRnzp LOOP
    DONE    TRAP  x25
    NUMBERS .FILL x4000
            .END
7.17 Suppose you write two separate assembly language modules that you expect to be combined by the linker. Each module uses the label AGAIN, and neither module contains the pseudo-op .EXTERNAL AGAIN. Is there a problem using the label AGAIN in both modules? Why or why not?
7.18 The following LC-3 program compares two character strings of the same length. The source strings are in the .STRINGZ form. The first string starts at memory location x4000, and the second string starts at memory location x4100. If the strings are the same, the program terminates with the value 0 in R5. Insert instructions at (a), (b), and (c) that will complete the program.
            .ORIG x3000
            LD    R1, FIRST
            LD    R2, SECOND
            AND   R0, R0, #0
    LOOP    -------------- (a)
            LDR   R4, R2, #0
            BRz   NEXT
            ADD   R1, R1, #1
            ADD   R2, R2, #1
            -------------- (b)
            -------------- (c)
            ADD   R3, R3, R4
            BRz   LOOP
            AND   R5, R5, #0
            BRnzp DONE
    NEXT    AND   R5, R5, #0
            ADD   R5, R5, #1
    DONE    TRAP  x25
    FIRST   .FILL x4000
    SECOND  .FILL x4100
            .END
7.19 When the following LC-3 program is executed, how many times will the instruction at the memory address labeled LOOP execute?
            .ORIG x3005
            LEA   R2, DATA
            LDR   R4, R2, #0
    LOOP    ADD   R4, R4, #-3
            BRzp  LOOP
            TRAP  x25
    DATA    .FILL x000B
            .END
7.20 LC-3 assembly language modules (a) and (b) have been written by different programmers to store x0015 into memory location x4000. What is fundamentally different about their approaches?
    a.      .ORIG x5000
            AND   R0, R0, #0
            ADD   R0, R0, #15
            ADD   R0, R0, #6
            STI   R0, PTR
            HALT
    PTR     .FILL x4000
            .END
    b.      .ORIG x4000
            .FILL x0015
            .END
7.21 Assemble the following LC-3 assembly language program.
            .ORIG x3000
            AND   R0, R0, #0
            ADD   R2, R0, #10
            LD    R1, MASK
            LD    R3, PTR1
    LOOP    LDR   R4, R3, #0
            AND   R4, R4, R1
            BRz   NEXT
            ADD   R0, R0, #1
    NEXT    ADD   R3, R3, #1
            ADD   R2, R2, #-1
            BRp   LOOP
            STI   R0, PTR2
            HALT
    MASK    .FILL x8000
    PTR1    .FILL x4000
    PTR2    .FILL x5000
            .END
What does the program do (in no more than 20 words)?
7.22 The LC-3 assembler must be able to map an instruction's mnemonic opcode into its binary opcode. For instance, given an ADD, it must generate the binary pattern 0001. Write an LC-3 assembly language program that prompts the user to type in an LC-3 assembly language opcode and then displays its binary opcode. If the assembly language opcode is invalid, it displays an error message.
7.23 The following LC-3 program determines whether a character string is a palindrome or not. A palindrome is a string that reads the same backwards as forwards. For example, the string "racecar" is a palindrome. Suppose a string starts at memory location x4000, and is in the .STRINGZ format. If the string is a palindrome, the program terminates with the value 1 in R5. If not, the program terminates with the value 0 in R5. Insert instructions at (a) through (e) that will complete the program.
            .ORIG x3000
            LD    R0, PTR
            ADD   R1, R0, #0
    AGAIN   LDR   R2, R1, #0
            BRz   CONT
            ADD   R1, R1, #1
            BRnzp AGAIN
    CONT    -------------- (a)
    LOOP    LDR   R3, R0, #0
            -------------- (b)
            NOT   R4, R4
            ADD   R4, R4, #1
            ADD   R3, R3, R4
            BRnp  NO
            -------------- (c)
            -------------- (d)
            NOT   R2, R0
            ADD   R2, R2, #1
            ADD   R2, R1, R2
            BRnz  YES
            -------------- (e)
    YES     AND   R5, R5, #0
            ADD   R5, R5, #1
            BRnzp DONE
    NO      AND   R5, R5, #0
    DONE    HALT
    PTR     .FILL x4000
            .END
7.24 We want the following program fragment to shift R3 to the left by four bits, but it has an error in it. Identify the error and explain how to fix it.
            .ORIG x3000
            AND   R2, R2, #0
            ADD   R2, R2, #4
    LOOP    BRz   DONE
            ADD   R2, R2, #-1
            ADD   R3, R3, R3
            BR    LOOP
    DONE    HALT
            .END
7.25 What does the pseudo-op .FILL xFF004 do? Why?
chapter 8
I/O
Up to now, we have paid little attention to input/output (I/O). We did note (in Chapter 4) that input/output is an important component of the von Neumann model. There must be a way to get information into the computer in order to process it, and there must be a way to get the result of that processing out of the computer so humans can use it. Figure 4.1 depicts a number of different input and output devices.
We suggested (in Chapter 5) that input and output can be accomplished by executing the TRAP instruction, which asks the operating system to do it for us. Figure 5.17 illustrates this for input (at address x3002) and for output (at address x3010).
In this chapter, we are ready to do I/O by ourselves. We have chosen to study the keyboard as our input device and the monitor display as our output device. Not only are they the simplest I/O devices and the ones most familiar to us, but they have characteristics that allow us to study important concepts about I/O without getting bogged down in unnecessary detail.
8.1 I/O Basics
8.1.1 Device Registers
Although we often think of an I/O device as a single entity, interaction with a single I/O device usually means interacting with more than one device register. The simplest I/O devices usually have at least two device registers: one to hold the data being transferred between the device and the computer, and one to indicate status information about the device. An example of status information is whether the device is available or is still busy processing the most recent I/O task.
8.1.2 Memory-Mapped I/O versus Special Input/Output Instructions
An instruction that interacts with an input or output device register must identify the particular input or output device register with which it is interacting. Two schemes have been used in the past. Some computers use special input and output instructions. Most computers prefer to use the same data movement instructions that are used to move data in and out of memory.
The very old PDP-8 (from Digital Equipment Corporation, light years ago, in 1965) is an example of a computer that used special input and output instructions. The 12-bit PDP-8 instruction contained a 3-bit opcode. If the opcode was 110, an I/O instruction was indicated. The remaining nine bits of the PDP-8 instruction identified which I/O device register and what operation was to be performed.
Most computer designers prefer not to specify an additional set of instructions for dealing with input and output. They use the same data movement instructions that are used for loading and storing data between memory and the general purpose registers. For example, a load instruction, in which the source address is that of an input device register, is an input instruction. Similarly, a store instruction in which the destination address is that of an output device register is an output instruction.
Since programmers use the same data movement instructions that are used for memory, every input device register and every output device register must be uniquely identified in the same way that memory locations are uniquely identified. Therefore, each device register is assigned an address from the memory address space of the ISA. That is, the I/O device registers are mapped to a set of addresses that are allocated to I/O device registers rather than to memory locations. Hence the name memory-mapped I/O.
The original PDP-11 ISA had a 16-bit address space. All addresses wherein bits [15:13] = 111 were allocated to I/O device registers. That is, of the 2^16 addresses, only 57,344 corresponded to memory locations. The remaining 2^13 were memory-mapped I/O addresses.
The LC-3 uses memory-mapped I/O. Addresses x0000 to xFDFF are allocated to memory locations. Addresses xFE00 to xFFFF are reserved for input/output device registers. Table A.3 lists the memory-mapped addresses of the LC-3 device registers that have been assigned so far. Future uses and sales of LC-3 microprocessors may require the expansion of device register address assignments as new and exciting applications emerge!
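The address split just described can be sketched as a small decoding function. This is an illustration only: the names `IO_BASE` and `decode` are introduced here and are not part of the LC-3 definition; the boundary xFE00 and the address xFE02 for KBDR are the ones given in the text.

```python
IO_BASE = 0xFE00   # first address reserved for device registers on the LC-3


def decode(address):
    """Say whether a 16-bit LC-3 address selects memory or a device register."""
    if not 0x0000 <= address <= 0xFFFF:
        raise ValueError("LC-3 addresses are 16 bits wide")
    return "device register" if address >= IO_BASE else "memory"


print(decode(0x3000))        # an ordinary memory location
print(decode(0xFE02))        # the address assigned to KBDR
print(0x10000 - IO_BASE)     # how many addresses are reserved for device registers
```

Because the decision depends only on the address, a load or store needs no special opcode to reach a device register; the same instruction that reaches x3000 reaches xFE02.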
8.1.3 Asynchronous versus Synchronous
Most I/O is carried out at speeds very much slower than the speed of the processor. A typist, typing on a keyboard, loads an input device register with one ASCII code every time he/she types a character. A computer can read the contents of that device register every time it executes a load instruction, where the operand address is the memory-mapped address of that input device register.
Many of today's microprocessors execute instructions under the control of a clock that operates well in excess of 300 MHz. Even for a microprocessor operating at only 300 MHz, a clock cycle lasts only 3.3 nanoseconds. Suppose a processor executed one instruction at a time, and it took the processor 10 clock cycles to execute the instruction that reads the input device register and stores its contents. At that rate, the processor could read the contents of the input device register once every 33 nanoseconds. Unfortunately, people do not type fast enough to keep this processor busy full-time reading characters. Question: How fast would a person have to type to supply input characters to the processor at the maximum rate the processor can receive them? Assume the average word length is six characters. See Exercise 8.3.
We could mitigate this speed disparity by designing hardware that would accept typed characters at some slower fixed rate. For example, we could design a piece of hardware that accepts one character every 30 million cycles. This would require a typing speed of 100 words/minute, which is certainly doable. Unfortunately, it would also require that the typist work in lockstep with the computer's clock. That is not acceptable since the typing speed (even of the same typist) varies from moment to moment.
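The arithmetic behind the 100 words/minute figure can be checked with a short calculation. The sketch below uses the 300 MHz clock and the six-character average word from the text; the function name `words_per_minute` is introduced here for illustration.

```python
CLOCK_HZ = 300e6        # 300 MHz clock from the text
CHARS_PER_WORD = 6      # average word length assumed in the text


def words_per_minute(cycles_per_char):
    """Typing speed needed to supply one character every cycles_per_char cycles."""
    chars_per_second = CLOCK_HZ / cycles_per_char
    return chars_per_second * 60 / CHARS_PER_WORD


print(1e9 / CLOCK_HZ)                  # clock period in nanoseconds
print(words_per_minute(30_000_000))    # one character every 30 million cycles
```

At 300 MHz, 30 million cycles take one tenth of a second, so the hardware accepts 10 characters per second, which is 600 characters or 100 six-character words per minute.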
What's the point? The point is that I/O devices usually operate at speeds very different from that of a microprocessor, and not in lockstep. This latter characteristic we call asynchronous. Most interaction between a processor and I/O is asynchronous. To control processing in an asynchronous world requires some protocol or handshaking mechanism. So it is with our keyboard and monitor display. In the case of the keyboard, we will need a 1-bit status register, called a flag, to indicate if someone has or has not typed a character. In the case of the monitor, we will need a 1-bit status register to indicate whether or not the most recent character sent to the monitor has been displayed.
These flags are the simplest form of synchronization. A single flag, called the Ready bit, is enough to synchronize the output of the typist who can type characters at the rate of 100 words/minute with the input to a processor that can accept these characters at the rate of 30 million characters/second. Each time the typist types a character, the Ready bit is set. Each time the computer reads a character, it clears the Ready bit. By examining the Ready bit before reading a character, the computer can tell whether it has already read the last character typed. If the Ready bit is clear, no characters have been typed since the last time the computer read a character, and so no additional read would take place. When the computer detects that the Ready bit is set, it could only have been caused by a new character being typed, so the computer would know to again read a character.
The single Ready bit provides enough handshaking to ensure that the asynchronous transfer of information between the typist and the microprocessor can be carried out accurately.
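The handshake can be acted out with a toy model. Everything in the sketch below (the class `Keyboard`, its methods, and the function `poll_once`) is hypothetical and exists only to show the set-on-type, clear-on-read behavior of the Ready bit; it is not how the LC-3 hardware is built.

```python
class Keyboard:
    """Toy model of a keyboard with a data register and a 1-bit Ready flag."""

    def __init__(self):
        self.data = 0
        self.ready = False

    def key_struck(self, ch):
        # The typist's side: deposit the ASCII code and set Ready.
        self.data = ord(ch)
        self.ready = True

    def read(self):
        # The processor's side: reading the data clears Ready.
        self.ready = False
        return chr(self.data)


def poll_once(kbd, received):
    """Read a character only if one has arrived since the last read."""
    if kbd.ready:
        received.append(kbd.read())


kbd, received = Keyboard(), []
poll_once(kbd, received)     # nothing typed yet: no read takes place
kbd.key_struck("a")
poll_once(kbd, received)     # a new character: it is read once
poll_once(kbd, received)     # Ready is clear again: "a" is not read twice
kbd.key_struck("b")
poll_once(kbd, received)
print(received)
```

However many times the processor polls between keystrokes, each character ends up in `received` exactly once, which is the accuracy the single Ready bit is meant to guarantee.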
If the typist could type at a constant speed, and we did have a piece of hardware that would accept typed characters at precise intervals (for example, one character every 30 million cycles), then we would not need the Ready bit. The computer would simply know, after 30 million cycles of doing other stuff, that the typist had typed exactly one more character, and the computer would read that character. In this hypothetical situation, the typist would be typing in lockstep with the processor, and no additional synchronization would be needed. We would say the computer and typist were operating synchronously, or the input activity was synchronous.
8.1.4 Interrupt-Driven versus Polling
The processor, which is computing, and the typist, who is typing, are two separate entities. Each is doing its own thing. Still, they need to interact, that is, the data that is typed has to get into the computer. The issue of interrupt-driven versus polling is the issue of who controls the interaction. Does the processor do its own thing until being interrupted by an announcement from the keyboard, "Hey, a key has been struck. The ASCII code is in the input device register. You need to read it"? This is called interrupt-driven I/O, where the keyboard controls the interaction. Or does the processor control the interaction, specifically by interrogating (usually, again and again) the Ready bit until it (the processor) detects that the Ready bit is set? At that point, the processor knows it is time to read the device register. This second type of interaction is called polling, since the Ready bit is polled by the processor, asking if any key has been struck.
Section 8.2.2 describes how the polling method works. Section 8.5 explains interrupt-driven I/O.
8.2 Input from the Keyboard
8.2.1 Basic Input Registers (the KBDR and the KBSR)
We have already noted that in order to handle character input from the keyboard, we need two things: a data register that contains the character to be input, and a synchronization mechanism to let the processor know that input has occurred. The synchronization mechanism is contained in the status register associated with the keyboard.
These two registers are called the keyboard data register (KBDR) and the keyboard status register (KBSR). They are assigned addresses from the memory address space. As shown in Table A.3, KBDR is assigned to xFE02; KBSR is assigned to xFE00.
Even though a character needs only eight bits and the synchronization mechanism needs only one bit, it is easier to assign 16 bits (like all memory addresses in the LC-3) to each. In the case of KBDR, bits [7:0] are used for the data, and bits [15:8] contain x00. In the case of KBSR, bit [15] contains the synchronization mechanism, that is, the Ready bit. Figure 8.1 shows the two device registers needed by the keyboard.
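The bit assignments can be made concrete with a few helper functions. The names `ready_bit`, `character`, and `as_signed16` are introduced here for illustration. The last helper shows why placing the Ready bit in bit [15] is convenient: a 16-bit value with bit [15] set is negative in 2's complement, so a branch on the condition codes can test the Ready bit directly after a load.

```python
def ready_bit(kbsr):
    """Return KBSR[15], the Ready bit, from a 16-bit register value."""
    return (kbsr >> 15) & 1


def character(kbdr):
    """Return the character held in KBDR[7:0]."""
    return chr(kbdr & 0xFF)


def as_signed16(value):
    """Interpret a 16-bit pattern as a 2's complement integer."""
    return value - 0x10000 if value & 0x8000 else value


print(ready_bit(0x8000), ready_bit(0x0000))   # Ready set, Ready clear
print(character(0x0041))                      # bits [15:8] are x00, bits [7:0] hold the code
print(as_signed16(0x8000) < 0)                # bit [15] set means a negative value
```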
8.2.2 The Basic Input Service Routine
KBSR[15] controls the synchronization of the slow keyboard and the fast processor. When a key on the keyboard is struck, the ASCII code for that key is loaded into KBDR[7:0] and the electronic circuits associated with the keyboard
Figure 8.1 [KBDR, with bit positions 15, 8, 7, and 0 marked; KBSR, with bit positions 15, 14, and 0 marked]
[Figure: data path for memory-mapped output, showing MAR, MDR, MEMORY, the address control logic, DSR, DDR, and OUTPUT, with control signals MEM.EN, R.W/WRITE, LD.MDR, LD.DDR, and GateMDR]
bit had to be in state 1, indicating that the previous character had already been written to the screen. The LDI and BRzp instructions on lines 01 and 02 perform that test. To do this, the LDI reads the output device register DSR, and BRzp tests bit [15]. If the MAR is loaded with xFE04 (the memory-mapped address of the DSR), the address control logic selects DSR as the input to the MDR, where it is subsequently loaded into R1 and the condition codes are set.
8.3.4 Example: Keyboard Echo
When we type at the keyboard, it is helpful to know exactly what characters we have typed. We can get this echo capability easily (without any sophisticated electronics) by simply combining the two routines we have discussed. The key typed at the keyboard is displayed on the monitor.
    START   LDI   R1, KBSR     ; Test for character input
            BRzp  START
            LDI   R0, KBDR
    ECHO    LDI   R1, DSR      ; Test output register ready
            BRzp  ECHO
            STI   R0, DDR
            BRnzp NEXT_TASK
    KBSR    .FILL xFE00        ; Address of KBSR
    KBDR    .FILL xFE02        ; Address of KBDR
    DSR     .FILL xFE04        ; Address of DSR
    DDR     .FILL xFE06        ; Address of DDR
8.4 A More Sophisticated Input Routine
In the example of Section 8.2.2, the input routine would be a part of a program being executed by the computer. Presumably, the program requires character input from the keyboard. But how does the person sitting at the keyboard know when to type a character? Sitting there, the person may wonder whether or not the program is actually running, or if perhaps the computer is busy doing something else.
To let the person sitting at the keyboard know that the program is waiting for input from the keyboard, the computer typically prints a message on the monitor. Such a message is often referred to as a prompt. The symbols that are displayed by your operating system (for example, % or C:) or by your editor (for example, :) are examples of prompts.
The program fragment shown in Figure 8.5 obtains keyboard input via polling as we have shown in Section 8.2.2 already. It also includes a prompt to let the person sitting at the keyboard know when it is time to type a key. Let’s examine this program fragment in parts.
01  START    ST     R1,SaveR1    ; Save registers needed
02           ST     R2,SaveR2    ; by this routine
03           ST     R3,SaveR3
04  ;
05           LD     R2,Newline
06  L1       LDI    R3,DSR
07           BRzp   L1           ; Loop until monitor is ready
08           STI    R2,DDR       ; Move cursor to new clean line
09  ;
0A           LEA    R1,Prompt    ; Starting address of prompt string
0B  Loop     LDR    R0,R1,#0     ; Write the input prompt
0C           BRz    Input        ; End of prompt string
0D  L2       LDI    R3,DSR
0E           BRzp   L2           ; Loop until monitor is ready
0F           STI    R0,DDR       ; Write next prompt character
10           ADD    R1,R1,#1     ; Increment prompt pointer
11           BRnzp  Loop         ; Get next prompt character
12  ;
13  Input    LDI    R3,KBSR
14           BRzp   Input        ; Poll until a character is typed
15           LDI    R0,KBDR      ; Load input character into R0
16  L3       LDI    R3,DSR
17           BRzp   L3           ; Loop until monitor is ready
18           STI    R0,DDR       ; Echo input character
19  ;
1A  L4       LDI    R3,DSR
1B           BRzp   L4           ; Loop until monitor is ready
1C           STI    R2,DDR       ; Move cursor to new clean line
1D           LD     R1,SaveR1    ; Restore registers
1E           LD     R2,SaveR2    ; to original values
1F           LD     R3,SaveR3
20           BRnzp  NEXT_TASK    ; Do the program's next task
21  ;
22  SaveR1   .BLKW  1            ; Memory for registers saved
23  SaveR2   .BLKW  1
24  SaveR3   .BLKW  1
25  DSR      .FILL  xFE04
26  DDR      .FILL  xFE06
27  KBSR     .FILL  xFE00
28  KBDR     .FILL  xFE02
29  Newline  .FILL  x000A        ; ASCII code for newline
2A  Prompt   .STRINGZ "Input a character>"
Figure 8.5 The input routine for the LC-3 keyboard
You are already familiar with lines 13 through 19 and lines 25 through 28, which correspond to the code in Section 8.3.4 for inputting a character via the keyboard and echoing it on the monitor. Lines 01 through 03, lines 1D through 1F, and lines 22 through 24 recognize that this input routine needs to use general purpose registers R1, R2, and R3. Unfortunately, they most likely contain values that will still be needed after this routine has finished. To prevent the loss of those values, the ST instructions in lines 01 through 03 save them in memory locations SaveR1, SaveR2, and SaveR3, before the input routine starts its business. These three memory locations have been allocated by the .BLKW pseudo-ops in lines 22 through 24. After the input routine is finished and before the program branches unconditionally to its NEXT_TASK (line 20), the LD instructions in lines 1D through 1F restore the original values saved to their rightful locations in R1, R2, and R3.
This leaves lines 05 through 08, 0A through 11, 1A through 1C, 29 and 2A. These lines serve to alert the person sitting at the keyboard that it is time to type a character.
Lines 05 through 08 write the ASCII code x0A to the monitor. This is the ASCII code for a new line. Most ASCII codes correspond to characters that are visible on the screen. A few, like x0A, are control characters. They cause an action to occur. Specifically, the ASCII code x0A causes the cursor to move to the far left of the next line on the screen. Thus the name Newline. Before attempting to write x0A, however, as is always the case, DSR[15] is tested (line 06) to see if DDR can accept a character. If DSR[15] is clear, the monitor is busy, and the loop (lines 06 and 07) is repeated. When DSR[15] is 1, the conditional branch (line 07) is not taken, and x0A is written to DDR for outputting (line 08).
Lines 0A through 11 cause the prompt Input a character> to be written to the screen. The prompt is specified by the .STRINGZ pseudo-op on line 2A and is stored in 19 memory locations: 18 ASCII codes, one per memory location, corresponding to the 18 characters in the prompt, and the terminating sentinel x0000.
Line 0C iteratively tests to see if the end of the string has been reached (by detecting x0000), and if not, once DDR is free, line 0F writes the next character in the input prompt into DDR. When x0000 is detected, the program knows that the entire input prompt has been written to the screen and branches to the code that handles the actual keyboard input (starting at line 13).
After the person at the keyboard has typed a character and it has been echoed (lines 13 to 19), the program writes one more new line (lines 1A through 1C) before branching to its NEXT_TASK.
8.5 Interrupt-Driven I/O
In Section 8.1.4, we noted that interaction between the processor and an I/O device can be controlled by the processor (i.e., polling) or it can be controlled by the I/O device (i.e., interrupt driven). In Sections 8.2, 8.3, and 8.4, we have studied several examples of polling. In each case, the processor tested the Ready bit of the status register, again and again, and when it was finally 1, the processor branched to the instruction that did the input or output operation.
We are now ready to study the case where the interaction is controlled by the I/O device.
8.5.1 What Is Interrupt-Driven I/O?
The essence of interrupt-driven I/O is the notion that an I/O device that may or may not have anything to do with the program that is running can (1) force that program to stop, (2) have the processor carry out the needs of the I/O device, and then (3) have the stopped program resume execution as if nothing had happened. These three stages of the instruction execution flow are shown in Figure 8.6.
As far as Program A is concerned, the work carried out and the results computed are no different from what would have been the case if the interrupt had never happened; that is, as if the instruction execution flow had been the following:
Program A is executing instruction n
Program A is executing instruction n+1
Program A is executing instruction n+2
Program A is executing instruction n+3
Program A is executing instruction n+4
8.5.2 Why Have Interrupt-Driven I/O?
As is undoubtedly clear, polling requires the processor to waste a lot of time spinning its wheels, re-executing again and again the LDI and BR instructions until the Ready bit is set. With interrupt-driven I/O, none of that testing and branching has to go on. Interrupt-driven I/O allows the processor to spend its time doing what is hopefully useful work, executing some other program perhaps, until it is notified that some I/O device needs attention.
Program A is executing instruction n
Program A is executing instruction n+1
Program A is executing instruction n+2
1: Interrupt signal is detected
1: Program A is put into suspended animation
2: The needs of the I/O device start being carried out
2: The needs of the I/O device are being carried out
2: The needs of the I/O device are being carried out
2: The needs of the I/O device are being carried out
2: The needs of the I/O device have been carried out
3: Program A is brought back to life
Program A is executing instruction n+3
Program A is executing instruction n+4
Figure 8.6 Instruction execution flow for interrupt-driven I/O
8.5.3 Generation of the Interrupt Signal
There are two parts to interrupt-driven I/O: (1) the enabling mechanism that allows an I/O device to interrupt the processor when it has input to deliver or is ready to accept output, and (2) the mechanism that manages the transfer of the I/O data. The two parts can be briefly described as:
1. generating the interrupt signal, which stops the currently executing process, and
2. handling the request demanded by this signal.
The first part we will study momentarily. We will examine the various things that must come together to force the processor to stop what it is doing and pay attention to the interrupt request.
The second part, unfortunately, we will have to put off until Section 10.2. To handle interrupt requests, the LC-3 uses a stack, and we will not get to stacks until Chapter 10.
Now, then, part 1. Several things must be true for an I/O device to actually interrupt the processor:
1. The I/O device must want service.
2. The device must have the right to request the service.
3. The device request must be more urgent than what the processor is currently doing.
If all three elements are present, the processor stops executing the program
and takes care of the interrupt.
The Interrupt Signal from the Device
For an I/O device to generate an interrupt request, the first two elements in the previous list must be true: The device must want service, and it must have the right to request that service.
The first element we have discussed at length in the study of polling. It is the Ready bit of the KBSR or the DSR. That is, if the I/O device is the keyboard, it wants service if someone has typed a character. If the I/O device is the monitor, it wants service (i.e., the next character to output) if the associated electronic circuits have successfully completed the display of the last character. In both cases, the I/O device wants service when the corresponding Ready bit is set.
The second element is an interrupt enable bit, which can be set or cleared by the processor, depending on whether or not the processor wants to give the I/O device the right to request service. In most I/O devices, this interrupt enable (IE) bit is part of the device status register. In the KBSR and DSR shown in Figure 8.7, the IE bit is bit [14]. The interrupt request from the I/O device is the logical AND of the IE bit and the Ready bit, as is also shown in Figure 8.7.
If the interrupt enable bit (bit [14]) is clear, it does not matter whether the Ready bit is set; the I/O device will not be able to interrupt the processor. In that case, the program will have to poll the I/O device to determine if it is ready.
If bit [14] is set, then interrupt-driven I/O is enabled. In that case, as soon as someone types a key (or as soon as the monitor has finished processing the last character), bit [15] is set. This, in turn, asserts the output of the AND gate, causing an interrupt request to be generated from the I/O device.
Figure 8.7 Interrupt enable bits and their use
The Importance of Priority
The third element in the list of things that must be true for an I/O device to actually interrupt the processor is whether the request is sufficiently urgent. Every instruction that the processor executes, it does with a stated level of urgency. The term we give for the urgency of execution is priority.
We say that a program is being executed at a specified priority level. Almost all computers have a set of priority levels that programs can run at. The LC-3 has eight priority levels, PL0, ..., PL7. The higher the number, the more urgent the program. The PL of a program is usually the same as the PL (i.e., urgency) of the request to run that program. If a program is running at one PL, and a higher-level PL request seeks access to the computer, the lower-priority program suspends processing until the higher-PL program executes and satisfies that more urgent request. For example, a computer’s payroll program may run overnight, and at PL0. It has all night to finish, so it is not terribly urgent. A program that corrects for a nuclear plant current surge may run at PL6. We are perfectly happy to let the payroll wait while the nuclear power correction keeps us from being blown to bits.
For our I/O device to successfully stop the processor and start an interrupt-driven I/O request, the priority of the request must be higher than the priority of the program it wishes to interrupt. For example, we would not normally want to allow a keyboard interrupt from a professor checking e-mail to interrupt the nuclear power correction program.
We will see momentarily that the processor will stop executing its current program and service an interrupt request if the INT signal is asserted. Figure 8.8 shows what is required to assert the INT signal and where the notion of priority level comes into play. Figure 8.8 shows the status registers of several devices operating at various priority levels. Any device that has bits [14] and [15] both set asserts its interrupt request signal. The interrupt request signals are input to a priority encoder, a combinational logic structure that selects the highest priority request from all those asserted. If the PL of that request is higher than the PL of the currently executing program, the INT signal is asserted and the executing program is stopped.
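The generation of INT can be sketched in a few lines of Python (not LC-3 code). This is a minimal model, assuming for simplicity one device per priority level; the function names and the sample status register values are hypothetical and chosen only for illustration.

```python
def interrupt_request(status_register: int) -> bool:
    """A device requests service when Ready (bit 15) AND IE (bit 14) are both set."""
    ready = (status_register >> 15) & 1
    interrupt_enable = (status_register >> 14) & 1
    return bool(ready & interrupt_enable)


def int_signal(device_status: dict, current_pl: int) -> bool:
    """Model of the priority encoder followed by the A > B comparison.

    device_status maps a priority level (0..7) to that device's status register.
    """
    requesting = [pl for pl, sr in device_status.items() if interrupt_request(sr)]
    if not requesting:
        return False
    highest = max(requesting)      # output of the priority encoder
    return highest > current_pl    # INT only if the request is strictly more urgent


# Hypothetical snapshot: a PL4 device that is ready and enabled,
# and a PL6 device that is ready but has its IE bit clear.
devices = {4: 0xC000, 6: 0x8000}
print(int_signal(devices, current_pl=0))   # True
print(int_signal(devices, current_pl=4))   # False
```

Note that the PL6 device never reaches the priority encoder in this snapshot: with its IE bit clear, its Ready bit alone cannot raise a request.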
The Test for INT
The final step in the first part of interrupt-driven I/O is the test to see if the processor should stop and handle an interrupt. Recall from Chapter 4 that the instruction cycle sequences through the six phases of FETCH, DECODE, EVALUATE ADDRESS, FETCH OPERAND, EXECUTE, and STORE RESULT. Recall further that after the sixth phase, the control unit returns to the first phase, that is, the FETCH of the next instruction.
The additional logic to test for the interrupt signal is to replace that last sequential step of always going from STORE RESULT back to FETCH, as follows: The STORE RESULT phase is instead accompanied by a test for the interrupt signal INT. If INT is not asserted, then it is business as usual, with the control unit returning to the FETCH phase to start processing the next instruction. If INT is asserted, then the control unit does two things before returning to the FETCH phase. First it saves enough state information to be able to return to the interrupted program where it left off. Second it loads the PC with the starting address of the program that is to carry out the requirements of the I/O device. How it does that is the topic of Section 10.2, which we will study after we learn how stacks work.
Figure 8.8 Generation of the INT signal
8.6 Implementation of Memory-Mapped I/O, Revisited
We showed in Figures 8.2 and 8.4 partial implementations of the data path to handle (separately) memory-mapped input and memory-mapped output. We have also learned that in order to support interrupt-driven I/O, the two status registers must be writeable as well as readable.
Figure 8.9 (reproduced from Figure C.3 of Appendix C) shows the data path necessary to support the full range of features we have discussed for the I/O device registers. The Address Control Logic block controls the input or output operation. Note that there are three inputs to this block. MIO.EN indicates whether a data movement from/to memory or I/O is to take place this clock cycle. MAR contains the address of the memory location or the memory-mapped address of an I/O device register. R.W indicates whether a load or a store is to take place. Depending on the values of these three inputs, the Address Control Logic does nothing (MIO.EN = 0), or provides the control signals to direct the transfer of data between the MDR and the memory or I/O registers.
Figure 8.9 Partial data path implementation of memory-mapped I/O
If R.W indicates a load, the transfer is from memory or I/O device to the MDR. The Address Control Logic block provides the select lines to INMUX to source the appropriate I/O device register or memory (depending on MAR) and also enables the memory if MAR contains the address of a memory location.
If R.W indicates a store, the contents of the MDR are written either to memory or to one of the device registers. The Address Control Logic either enables a write to memory or it asserts the load enable line of the device register specified by the contents of the MAR.
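The decision made by the Address Control Logic can be sketched as a small Python function (not LC-3 code). This is an illustrative model only: the function name, the string encoding of R.W, and the returned descriptions are assumptions of the sketch, and it does not distinguish which device registers are readable or writable. The four device register addresses are the ones used in this chapter.

```python
# Memory-mapped device register addresses used in this chapter.
DEVICE_REGISTERS = {0xFE00: "KBSR", 0xFE02: "KBDR", 0xFE04: "DSR", 0xFE06: "DDR"}


def address_control(mio_en: int, rw: str, mar: int) -> str:
    """Describe the transfer directed during this clock cycle.

    rw is "load" or "store" (an encoding assumed for this sketch).
    """
    if not mio_en:
        return "nothing"                      # MIO.EN = 0: no data movement
    target = DEVICE_REGISTERS.get(mar, "memory")
    if rw == "load":
        return f"MDR <- {target}"             # INMUX selects the source
    return f"{target} <- MDR"                 # write enable or load enable asserted


print(address_control(1, "load", 0xFE02))     # MDR <- KBDR
print(address_control(1, "store", 0xFE06))    # DDR <- MDR
print(address_control(1, "load", 0x3000))     # MDR <- memory
```

The point of the sketch is that the same load or store is steered to memory or to a device register purely by the address in the MAR.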
Exercises
8.1 a. What is a device register?
b. What is a device data register?
c. What is a device status register?
8.2 Why is a Ready bit not needed if synchronous I/O is used?
8.3 In Section 8.1.3, the statement is made that a typist would have trouble supplying keyboard input to a 300-MHz processor at the maximum rate (one character every 33 nanoseconds) that the processor can accept it. Assume an average word (including spaces between words) consists of six characters. How many words/minute would the typist have to type in order to exceed the processor’s ability to handle the input?
8.4 Are the following interactions usually synchronous or asynchronous?
a. Between a remote control and a television set
b. Between the mail carrier and you, via a mailbox
c. Between a mouse and your PC
Under what conditions would each of them be synchronous? Under what conditions would each of them be asynchronous?
8.5 What is the purpose of bit [15] in the KBSR?
8.6 What problem could occur if a program does not check the Ready bit of the KBSR before reading the KBDR?
8.7 Which of the following combinations describe the system described in Section 8.2.2?
a. Memory mapped and interrupt driven
b. Memory mapped and polling
c. Special opcode for I/O and interrupt driven
d. Special opcode for I/O and polling
8.8 Write a program that checks the initial value in memory location x4000 to see if it is a valid ASCII code and, if it is a valid ASCII code, prints the character. If the value in x4000 is not a valid ASCII code, the program prints nothing.
8.9 What problem is likely to occur if the keyboard hardware does not check the KBSR before writing to the KBDR?
8.10 What problem could occur if the display hardware does not check the DSR before writing to the DDR?
8.11 Which is more efficient, interrupt-driven I/O or polling? Explain.
8.12 Adam H. decided to design a variant of the LC-3 that did not need a keyboard status register. Instead, he created a readable/writable keyboard data and status register (KBDSR), which contains the same data as the KBDR. With the KBDSR, a program requiring keyboard input would wait until a nonzero value appeared in the KBDSR. The nonzero value would be the ASCII value of the last key press. Then the program would write a zero into the KBDSR indicating that it had read the key press. Modify the basic input service of Section 8.2.2 to implement Adam’s scheme.
8.13 Some computer engineering students decided to revise the LC-3 for their senior project. In designing the LC-4, they decided to conserve on device registers by combining the KBSR and the DSR into one status register: the IOSR (the input/output status register). IOSR[15] is the keyboard device Ready bit and IOSR[14] is the display device Ready bit. What are the implications for programs wishing to do I/O? Is this a poor design decision?
8.14 An LC-3 Load instruction specifies the address xFE02. How do we know whether to load from the KBDR or from memory location xFE02?
8.15 Interrupt-driven I/O:
a. What does the following LC-3 program do?
        .ORIG x3000
        LD    R3, A
        STI   R3, KBSR
AGAIN   LD    R0, B
        TRAP  x21
        BRnzp AGAIN
A       .FILL x4000
B       .FILL x0032
KBSR    .FILL xFE00
        .END
b. If someone strikes a key, the program will be interrupted and the keyboard interrupt service routine will be executed as shown below. What does the keyboard interrupt service routine do?
        .ORIG x1000
        LDI   R0, KBDR
        TRAP  x21
        TRAP  x21
        RTI
KBDR    .FILL xFE02
        .END
NOTE: RTI will be studied in Chapter 10.
c. Finally, suppose the program of part a started executing, and someone sitting at the keyboard struck a key. What would you see on the screen?
8.16 What does the following LC-3 program do?
        .ORIG x3000
        LD    R0, ASCII
        LD    R1, NEG
AGAIN   LDI   R2, DSR
        BRzp  AGAIN
        STI   R0, DDR
        ADD   R0, R0, #1
        ADD   R2, R0, R1
        BRnp  AGAIN
        HALT
ASCII   .FILL x0041
NEG     .FILL xFFB6    ; -x004A
DSR     .FILL xFE04
DDR     .FILL xFE06
        .END
chapter 9 TRAP Routines and Subroutines
9.1 LC-3 TRAP Routines
9.1.1 Introduction
Recall Figure 8.5 of the previous chapter. In order to have the program successfully obtain input from the keyboard, it was necessary for the programmer (in Chapter 8) to know several things:
1. The hardware data registers for both the keyboard and the monitor: the monitor so a prompt could be displayed, and the keyboard so the program would know where to look for the input character.
2. The hardware status registers for both the keyboard and the monitor: the monitor so the program would know when it was OK to display the next character in the input prompt, and the keyboard so the program would know when someone had struck a key.
3. The asynchronous nature of keyboard input relative to the executing program.
This is beyond the knowledge of most application programmers. In fact, in the real world, if application programmers (or user programmers, as they are sometimes called) had to understand I/O at this level, there would be much less I/O and far fewer programmers in the business.
There is another problem with allowing user programs to perform I/O activity by directly accessing KBDR and KBSR. I/O activity involves the use of device registers that are shared by many programs. This means that if a user programmer were allowed to access the hardware registers, and he/she messed up, it could create havoc for other user programs. Thus, it is ill-advised to give user programmers access to these registers. We say the hardware registers are privileged and accessible only to programs that have the proper degree of privilege.
Figure 9.1 Invoking an OS service routine by means of the TRAP instruction
The notion of privilege introduces a pretty big can of worms. Unfortunately, we cannot do much more than mention it here and leave serious treatment for later. For now, we simply note that there are resources that are not accessible to the user program, and access to those resources is controlled by endowing some programs with sufficient privilege and other programs without. Having said that, we move on to our problem at hand, a “better” solution for user programs that require input and/or output.
The simpler solution as well as the safer solution to the problem of user programs requiring I/O involves the TRAP instruction and the operating system. The operating system does have the proper degree of privilege.
We were introduced to the TRAP instruction in Chapter 5. We saw that for certain tasks, a user program could get the operating system to do the job for it by invoking the TRAP instruction. That way, the user programmer does not have to know the gory details previously mentioned, and other user programs are protected from the consequences of inept user programmers.
Figure 9.1 shows a user program that, upon reaching location x4000, needs an I/O task performed. The user program requests the operating system to perform the task on behalf of the user program. The operating system takes control of the computer, handles the request specified by the TRAP instruction, and then returns control to the user program, at location x4001. We often refer to the request made by the user program as a service call or a system call.
9.1.2 The TRAP Mechanism
The TRAP mechanism involves several elements, as follows:
1. A set of service routines executed on behalf of user programs by the operating system. These are part of the operating system and start at arbitrary addresses in memory. The LC-3 was designed so that up to 256 service routines can be specified. Table A.2 in Appendix A contains the LC-3’s current complete list of operating system service routines.
2. A table of the starting addresses of these 256 service routines. This table is stored in memory locations x0000 to x00FF. The table is referred to by various names by various companies. One company calls this table the System Control Block. Another company calls it the Trap Vector Table. Figure 9.2 provides a snapshot of the Trap Vector Table of the LC-3, with specific starting addresses highlighted. Among the starting addresses are the one for the character output service routine (location x0430), which is contained in location x0021, the one for the keyboard input service routine (location x04A0), contained in location x0023, and the one for the machine halt service routine (location xFD70), contained in location x0025.
3. The TRAP instruction. When a user program wishes to have the operating system execute a specific service routine on behalf of the user program, and then return control to the user program, the user program uses the TRAP instruction.
4. A linkage back to the user program. The service routine must have a mechanism for returning control to the user program.
Figure 9.2 The Trap Vector Table
Location   Contents
x0020      x0400
x0021      x0430
x0022      x0450
x0023      x04A0
x0024      x04E0
x0025      xFD70
9.1.3 The TRAP Instruction
The TRAP instruction causes the service routine to execute by doing two things:
• It changes the PC to the starting address of the relevant service routine on the basis of its trap vector.
• It provides a way to get back to the program that initiated the TRAP instruction. The “way back” is referred to as a linkage.
The TRAP instruction is specified as follows. The TRAP instruction is made up of two parts: the TRAP opcode 1111 and the trap vector (bits [7:0]). Bits [11:8] must be zero. The trap vector identifies the service routine the user program wants the operating system to perform. In the following example, the trap vector is x23.
 15 14 13 12 | 11 10  9  8 |  7  6  5  4  3  2  1  0
  1  1  1  1 |  0  0  0  0 |  0  0  1  0  0  0  1  1
     TRAP                        trap vector
The EXECUTE phase of the TRAP instruction’s instruction cycle does four things:
1. The 8-bit trap vector is zero-extended to 16 bits to form an address, which is loaded into the MAR. For the trap vector x23, that address is x0023, which is the address of an entry in the Trap Vector Table.
2. The Trap Vector Table is in memory locations x0000 to x00FF. The entry at x0023 is read and its contents, in this case x04A0 (see Figure 9.2), are loaded into the MDR.
3. The general purpose register R7 is loaded with the current contents of the PC. This will provide a way back to the user program, as will become clear momentarily.
4. The contents of the MDR are loaded into the PC, completing the instruction cycle.
Since the PC now contains x04A0, processing continues at memory address x04A0.
Location x04A0 is the starting address of the operating system service routine to input a character from the keyboard. We say the trap vector “points” to the starting address of the TRAP routine. Thus, TRAP x23 causes the operating system to start executing the keyboard input service routine.
In order to return to the instruction following the TRAP instruction in the user program (after the service routine has ended), there must be some mechanism for saving the address of the user program’s next instruction. Step 3 of the EXECUTE phase listed above provides this linkage. By storing the PC in R7 before loading the PC with the starting address of the service routine, the TRAP instruction provides the service routine with all the information it needs to return control to the user program at the proper location. You know that the PC was already updated (in the FETCH phase of the TRAP instruction) to point to the next instruction. Thus, at the start of execution of the trap service routine, R7 contains the address of the instruction in the user program that follows the TRAP instruction.
9.1.4 The Complete Mechanism
We have shown in detail how the TRAP instruction invokes the service routine to do the user program’s bidding. We have also shown how the TRAP instruction provides the information that the service routine needs to return control to the correct place in the user program. The only thing left is to show the actual instruction in the service routine that returns control to the correct place in the user program. Recall the JMP instruction from Chapter 5. Assume that during the execution of the trap service routine, the contents of R7 were not changed. If that is the case, control can return to the correct location in the user program by executing JMP R7 as the last instruction in the trap service routine.
Figure 9.3 Flow of control from a user program to an OS service routine and back
Figure 9.3 shows the LC-3 using the TRAP instruction and the JMP instruction to implement the example of Figure 9.1. The flow of control goes from (A) within a user program that needs a character input from the keyboard, to (B) the operating system service routine that performs that task on behalf of the user program, back to the user program (C) that presumably uses the information contained in the input character.
Recall that the computer continually executes its instruction cycle (FETCH, DECODE, etc.). As you know, the way to change the flow of control is to change the contents of the PC during the EXECUTE phase of the current instruction. In that way, the next FETCH will be at a redirected address.
Thus, to request the character input service routine, we use the TRAP instruction with trap vector x23 in our user program. Execution of that instruction causes the contents of memory location x0023 (which, in this case, contains x04A0) to be loaded into the PC and the address of the instruction following the TRAP instruction to be loaded into R7. The dashed lines on Figure 9.3 show the use of the trap vector to obtain the starting address of the trap service routine from the Trap Vector Table.
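The effect of TRAP x23 on the PC and R7, and of the later return through R7, can be traced with a short Python model (not LC-3 code). This is a sketch under simple assumptions: memory is a dictionary holding only the one Trap Vector Table entry we need, the registers are a list of eight integers, and the function names are hypothetical. The TRAP is assumed to sit at x4000, as in Figure 9.1.

```python
def execute_trap(memory: dict, regs: list, pc: int, instruction: int) -> int:
    """Carry out the EXECUTE phase of TRAP and return the new PC.

    pc is the already-incremented PC (FETCH has pointed it at the next instruction).
    """
    assert instruction >> 12 == 0b1111, "not a TRAP instruction"
    trap_vector = instruction & 0x00FF   # bits [7:0]
    mar = trap_vector                    # zero-extended to 16 bits
    mdr = memory[mar]                    # entry read from the Trap Vector Table
    regs[7] = pc                         # linkage back to the user program
    return mdr                           # PC <- starting address of the routine


def execute_jmp_r7(regs: list) -> int:
    """JMP R7 loads the PC with the contents of R7."""
    return regs[7]


memory = {0x0023: 0x04A0}                # Trap Vector Table entry for x23
regs = [0] * 8
pc = 0x4001                              # TRAP was fetched from x4000
pc = execute_trap(memory, regs, pc, 0xF023)   # 1111 0000 0010 0011 = TRAP x23
print(hex(pc), hex(regs[7]))             # 0x4a0 0x4001
pc = execute_jmp_r7(regs)
print(hex(pc))                           # 0x4001
```

If the service routine overwrote R7 without saving it, the final step would send the PC somewhere other than x4001, which is why the assumption that R7 is unchanged matters.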
The next instruction cycle starts with the FETCH of the contents of x04A0, which is the first instruction of the operating system service routine that requests (and accepts) keyboard input. That service routine, as we will see momentarily, is patterned after the keyboard input routine we studied in Section 8.4. Recall that upon completion of that input routine (see Figure 8.5), R0 contains the ASCII code of the key that was typed.
The trap service routine executes to completion, ending with the JMP R7 instruction. Execution of JMP R7 loads the PC with the contents of R7. If R7 was not changed during execution of the service routine, it still contains the address of the instruction following the TRAP instruction in the initiating user program. Thus, the user program resumes execution, with R0 containing the ASCII code of the keyboard character that was typed.
The JMP R7 instruction is so convenient for providing a return to the user program that the LC-3 assembly language provides the mnemonic RET for this instruction, as follows:
 15 14 13 12 | 11 10  9 |  8  7  6 |  5  4  3  2  1  0
  1  1  0  0 |  0  0  0 |  1  1  1 |  0  0  0  0  0  0
     RET
The following program is provided to illustrate the use of the TRAP instruction. It can also be used to amuse the average four-year-old!
The correct operation of the program in this example assumes that the person sitting at the keyboard only types capital letters and the value 7. What if the person types a $? A better solution to Example 9.1 would be a program that tests the character typed to be sure it really is a capital letter from among the 26 capital letters in the alphabet, and if it is not, takes corrective action.
Question: Augment this program to add the test for bad data. That is, write a program that will type the lowercase representation of any capital letter typed and will terminate if anything other than a capital letter is typed. See Exercise 9.6.
9.1.5 TRAP Routines for Handling I/O
With the constructs just provided, the input routine described in Figure 8.5 can be slightly modified to be the input service routine shown in Figure 9.4. Two changes are needed: (1) We add the appropriate .ORIG and .END pseudo-ops. .ORIG specifies the starting address of the input service routine, the address found at location x0023 in the Trap Vector Table. And (2) we terminate the input service routine with the JMP R7 instruction (mnemonically, RET) rather than the BR NEXT_TASK, as is done on line 20 in Figure 8.5. We use JMP R7 because the service routine is invoked by TRAP x23. It is not part of the user program, as was the case in Figure 8.5.
The output routine of Section 8.3.2 can be modified in a similar way, as shown in Figure 9.5. The results are input (Figure 9.4) and output (Figure 9.5) service routines that can be invoked simply and safely by the TRAP instruction with the appropriate trap vector. In the case of input, upon completion of TRAP x23, R0 contains the ASCII code of the keyboard character typed. In the case of output, the initiating program must load R0 with the ASCII code of the character it wishes displayed on the monitor and then invoke TRAP x21.
; Service Routine for Keyboard Input
;
         .ORIG x04A0
START    ST    R1,SaveR1      ; Save the values in the registers
         ST    R2,SaveR2      ; that are used so that they
         ST    R3,SaveR3      ; can be restored before RET
;
         LD    R2,Newline
L1       LDI   R3,DSR         ; Check DDR -- is it free?
         BRzp  L1
         STI   R2,DDR         ; Move cursor to new clean line
;
         LEA   R1,Prompt      ; Prompt is starting address
                              ; of prompt string
Loop     LDR   R0,R1,#0       ; Get next prompt character
         BRz   Input          ; Check for end of prompt string
L2       LDI   R3,DSR
         BRzp  L2
         STI   R0,DDR         ; Write next character of
                              ; prompt string
         ADD   R1,R1,#1       ; Increment prompt pointer
         BRnzp Loop
;
Input    LDI   R3,KBSR        ; Has a character been typed?
         BRzp  Input
         LDI   R0,KBDR        ; Load it into R0
L3       LDI   R3,DSR
         BRzp  L3
         STI   R0,DDR         ; Echo input character
                              ; to the monitor
;
L4       LDI   R3,DSR
         BRzp  L4
         STI   R2,DDR         ; Move cursor to new clean line
         LD    R1,SaveR1      ; Service routine done, restore
         LD    R2,SaveR2      ; original values in registers.
         LD    R3,SaveR3
         RET                  ; Return from trap (i.e., JMP R7)
;
SaveR1   .BLKW 1
SaveR2   .BLKW 1
SaveR3   .BLKW 1
DSR      .FILL xFE04
DDR      .FILL xFE06
KBSR     .FILL xFE00
KBDR     .FILL xFE02
Newline  .FILL x000A          ; ASCII code for newline
Prompt   .STRINGZ "Input a character>"
         .END
Figure 9.4 Character input service routine

         .ORIG x0430          ; System call starting address
         ST    R1, SaveR1     ; R1 will be used to poll the DSR hardware
; Write the character
TryWrite LDI   R1, DSR        ; Get status
         BRzp  TryWrite       ; Bit 15 on says display is ready
WriteIt  STI   R0, DDR        ; Write character
; return from trap
Return   LD    R1, SaveR1     ; Restore registers
         RET                  ; Return from trap (JMP R7, actually)
DSR      .FILL xFE04          ; Address of display status register
DDR      .FILL xFE06          ; Address of display data register
SaveR1   .BLKW 1
         .END
Figure 9.5 Character output service routine

9.1.6 TRAP Routine for Halting the Computer
Recall from Section 4.5 that the RUN latch is ANDed with the crystal oscillator to produce the clock that controls the operation of the computer. We noted that if that 1-bit latch was cleared, the output of the AND gate would be 0, stopping the clock.
Years ago, most ISAs had a HALT instruction for stopping the clock. Given how infrequently that instruction is executed, it seems wasteful to devote an opcode to it. In many modern computers, the RUN latch is cleared by a TRAP routine. In the LC-3, the RUN latch is bit [15] of the Machine Control Register, which is memory-mapped to location xFFFE. Figure 9.6 shows the trap service routine for halting the processor, that is, for stopping the clock.
First (lines 02, 03, and 04), registers R7, R1, and R0 are saved. R1 and R0 are saved because they are needed by the service routine. R7 is saved because its contents will be overwritten after TRAP x21 executes (line 09). Then (lines 08 through 0D), the banner Halting the machine is displayed on the monitor. Finally (lines 11 through 14), the RUN latch (MCR[15]) is cleared by ANDing the MCR with 0111 1111 1111 1111. That is, MCR[14:0] remains unchanged, but MCR[15] is cleared. Question: What instruction (or trap service routine) can be used to start the clock?
01               .ORIG xFD70         ; Where this routine resides
02               ST    R7, SaveR7
03               ST    R1, SaveR1    ; R1: a temp for MC register
04               ST    R0, SaveR0    ; R0 is used as working space
05
06 ; print message that machine is halting
07
08               LD    R0, ASCIINewLine
09               TRAP  x21
0A               LEA   R0, Message
0B               TRAP  x22
0C               LD    R0, ASCIINewLine
0D               TRAP  x21
0E ;
0F ; clear bit 15 at xFFFE to stop the machine
10 ;
11               LDI   R1, MCR       ; Load MC register into R1
12               LD    R0, MASK      ; R0 = x7FFF
13               AND   R0, R1, R0    ; Mask to clear the top bit
14               STI   R0, MCR       ; Store R0 into MC register
15 ;
16 ; return from HALT routine.
17 ; (how can this routine return if the machine is halted above?)
18 ;
19               LD    R1, SaveR1    ; Restore registers
1A               LD    R0, SaveR0
1B               LD    R7, SaveR7
1C               RET                 ; JMP R7, actually
1D ;
1E ; Some constants
1F ;
20 ASCIINewLine  .FILL x000A
21 SaveR0        .BLKW 1
22 SaveR1        .BLKW 1
23 SaveR7        .BLKW 1
24 Message       .STRINGZ "Halting the machine."
25 MCR           .FILL xFFFE         ; Address of MCR
26 MASK          .FILL x7FFF         ; Mask to clear the top bit
27               .END
Figure 9.6 HALT service routine for the LC-3
01
02
03
04
05
06
07
08
09
QA
OB ASCII .FILL xFFD0 0C COUNT .FILL #10 OD Binary .BLKW #10
Initialize to first location Template for line 05 Initialize to 10
Get keyboard input
St.rip ASCII template Store binary digit Increment pointer Decrement COUNT.
More characters? Negative of x0030.
AGAIN
LEA R3,Binary LD R6,ASCII LD R7,COUNT TRAP x23
.STRINGZ
.FILL xFFFE Address of MCR
.FILL x7FFF ; Mask to clear the top bit .END
9.1.7 Saving and Restoring Registers
One item we have mentioned in pa~sing that we should emphasize more explicitly is the need to save the value in a register
• if the value will be destroyed by some subsequent action, and
• if we will need to use it after that subsequent action.
Suppose we want to input from the keyboard 10 decimal digits, convert their ASCII codes into their binary representations, and store the binary values in 10 successive memory locations, starting at the address Binary. The following
program fragment does the job.
R0,R0,R6 R0,R3,#D R3,R3,#1 R7,R7,#-1 AGAIN
ADD
STR
ADD
ADD
BRp
BRnzp NEXT TASK
The first step in the program fragment is initialization. We load R3 with the starting address of the memory space set aside to store the 10 decimal digits. We load R6 with the negative of the ASCII template. This is used to subtract x0030 from each ASCII code. We load R7 with 10, the initial value of the count. Then we execute the loop 10 times, each time getting a character from the keyboard, stripping away the ASCII template, storing the binary result, and testing to see if we are done. But the program does not work! Why? Answer: The TRAP instruction in line 04 replaces the value 10 that was loaded into R7 in line 03 with the address of the ADD R0,R0,R6 instruction. Therefore, the instructions in lines 08 and 09 do not perform the loop control function they were programmed to do.
The message is this: If a value in a register will be needed after something else is stored in that register, we must save it before the something else happens and restore it before we can subsequently use it. We save a register value by storing it in memory; we restore it by loading it back into the register. In Figure 9.6, line 03 contains the ST instruction that saves R1, line 11 contains the LDI instruction that loads R1 with a value to do the work of the trap service routine, line 19 contains the LD instruction that restores R1 to its original value before the service routine was called, and line 22 sets aside a location in memory for storing R1.
The save/restore problem can be handled either by the initiating program before the TRAP occurs or by the called program (for example, the service routine) after the TRAP instruction executes. We will see in Section 9.2 that the same problem exists for another class of calling/called programs, the subroutine mechanism.
We use the term caller-save if the calling program handles the problem. We use the term callee-save if the called program handles the problem. The appropriate one to handle the problem is the one that knows which registers will be destroyed by subsequent actions.
The callee knows which registers it needs to do the job of the called program. Therefore, before it starts, it saves those registers with a sequence of stores. After it finishes, it restores those registers with a sequence of loads. And it sets aside memory locations to save those register values. In Figure 9.6, the HALT routine needs R0 and R1. So it saves their values with ST instructions in lines 03 and 04, restores their values with LD instructions in lines 19 and 1A, and sets aside memory locations for these values in lines 21 and 22.
The caller knows what damage will be done by instructions under its control. Again, in Figure 9.6, the caller knows that each instance of the TRAP instruction will destroy what is in R7. So, before the first TRAP instruction in the HALT service routine is executed, R7 is saved. After the last TRAP instruction in the HALT service routine is executed, R7 is restored.
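As a sketch of the caller-save approach applied to the 10-digit input fragment, the version below saves R7 before each TRAP x23 and restores it afterward, so the count survives the trap. The `.ORIG x3000` address, the label SaveR7, and the use of HALT as a stand-in for NEXT_TASK are assumptions made here so that the fragment is a complete program; they are not part of the original fragment.

```asm
; Hypothetical corrected version of the 10-digit input fragment (caller-save of R7).
        .ORIG   x3000
        LEA     R3,Binary       ; R3 points to the first storage location
        LD      R6,ASCII        ; R6 = -x0030, used to strip the ASCII template
        LD      R7,COUNT        ; R7 = 10, the loop count
AGAIN   ST      R7,SaveR7       ; save the count: TRAP will overwrite R7
        TRAP    x23             ; get keyboard input in R0
        LD      R7,SaveR7       ; restore the count
        ADD     R0,R0,R6        ; strip ASCII template
        STR     R0,R3,#0        ; store binary digit
        ADD     R3,R3,#1        ; increment pointer
        ADD     R7,R7,#-1       ; decrement the count (sets condition codes)
        BRp     AGAIN           ; more characters?
NEXT_TASK HALT                  ; stand-in for whatever follows
ASCII   .FILL   xFFD0           ; negative of x0030
COUNT   .FILL   #10
SaveR7  .BLKW   1               ; assumed save location for R7
Binary  .BLKW   #10
        .END
```

Another option would be to keep the count in a register that the service routine preserves rather than in R7, which avoids the extra store and load on every pass through the loop.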
9.2 Subroutines
We have just seen how programmers' productivity can be enhanced if they do not have to learn details of the I/O hardware, but can rely instead on the operating system to supply the program fragments needed to perform those tasks. We also mentioned in passing that it is kind of nice to have the operating system access these device registers so we do not have to be at the mercy of some other user programmer.
We have seen that a request for a service routine is invoked in the user program by the TRAP instruction and handled by the operating system. Return to the initiating program is obtained via the JMP R7 instruction.
In a similar vein, it is often useful to be able to invoke a program fragment multiple times within the same program without having to specify its details all over again in the source program each time it is needed. In addition, it is sometimes the case that one person writes a program that requires such fragments and another person writes the fragments.
Also, one might require a fragment that has been supplied by the manufacturer or by some independent software supplier. It is almost always the case that collections of such fragments are available to user programmers to free them from having to write their own. These collections are referred to as libraries. An example is the Math Library, which consists of fragments that execute such functions as square root, sine, and arctangent.
For all of these reasons, it is good to have a way to use program fragments efficiently. Such program fragments are called subroutines, or alternatively, procedures, or in C terminology, functions. The mechanism for using them is referred to as a Call/Return mechanism.
9.2.1 The Call/Return Mechanism
Figure 9.4 provides a simple illustration of a fragment that must be executed multiple times within the same program. Note the three instructions starting at symbolic address L1. Note also the three instructions starting at addresses L2, L3, and L4. Each of these four 3-instruction sequences does the following:
LABEL   LDI     R3,DSR
        BRzp    LABEL
        STI     Reg,DDR
Two of the four program fragments store the contents of R0 and the other two store the contents of R2, but that is easy to take care of, as we will see. The main point is that, aside from the small nuisance of which register is being used for the source for the STI instruction, the four program fragments do exactly the same thing. The Call/Return mechanism allows us to execute this one 3-instruction sequence multiple times while requiring us to include it as a subroutine in our program only once.
Figure 9.7 Instruction execution flow with/without subroutines: (a) without subroutines, where fragment A is repeated between fragments W, X, Y, and Z; (b) with subroutines, where W, X, Y, and Z each call a single copy of A, which returns to the caller
The call mechanism computes the starting address of the subroutine, loads it into the PC, and saves the return address for getting back to the next instruction in the calling program. The return mechanism loads the PC with the return address. Figure 9.7 shows the instruction execution flow for a program with and without subroutines.
The Call/Return mechanism acts very much like the TRAP instruction in that it redirects control to a program fragment while saving the linkage back to the calling program. In both cases, the PC is loaded with the starting address of the program fragment, while R7 is loaded with the address that is needed to get back to the calling program. The last instruction in the program fragment, whether the fragment is a trap service routine or a subroutine, is the JMP R7 instruction, which loads the PC with the contents of R7, thereby returning control to the instruction following the calling instruction.
There is an important difference between subroutines and the service routines that are called by the TRAP instruction. Although it is somewhat beyond the scope of this course, we will mention it briefly. It has to do with the nature of the work that the program fragment is being asked to do. In the case of the TRAP instruction (as we saw), the service routines involve operating system resources, and they generally require privileged access to the underlying hardware of the computer. They are written by systems programmers charged with managing the resources of the computer. In the case of subroutines, they are either written by the same programmer who wrote the program containing the calling instruction, or they are written by a colleague, or they are provided as part of a library. In all cases, they involve resources that cannot mess up other people’s programs, and so we are not concerned that they are part of a user program.
9.2.2 The JSR(R) Instruction
The LC-3 specifies one opcode for calling subroutines, 0100. The instruction uses one of two addressing modes for computing the starting address of the subroutine, PC-relative addressing or Base addressing. The LC-3 assembly language provides two different mnemonic names for the opcode, JSR and JSRR, depending on which addressing mode is used.
The instruction does two things. It saves the return address in R7 and it computes the starting address of the subroutine and loads it into the PC. The return address is the incremented PC, which points to the instruction following the JSR or JSRR instruction in the calling program.
The JSR(R) instruction consists of three parts.
15 14 13 12 | 11 | 10  9  8  7  6  5  4  3  2  1  0
   Opcode   |  A |     Address evaluation bits
Bits [15:12] contain the opcode, 0100. Bit [11] specifies the addressing mode, the value 1 if the addressing mode is PC-relative, and the value 0 if the addressing mode is Base addressing. Bits [10:0] contain information that is used to evaluate the starting address of the subroutine. The only difference between JSR and JSRR is the addressing mode that is used for evaluating the starting address of the subroutine.
JSR
The JSR instruction computes the target address of the subroutine by sign-extending the 11-bit offset (bits [10:0]) of the instruction to 16 bits and adding that to the incremented PC. This addressing mode is almost identical to the addressing mode of the LD and ST instructions, except 11 bits of PCoffset are used, rather than nine bits as is the case for LD and ST.
If the following JSR instruction is stored in location x4200, its execution will cause the PC to be loaded with x3E05 and R7 to be loaded with x4201.
15 14 13 12 | 11 | 10  9  8  7  6  5  4  3  2  1  0
 0  1  0  0 |  1 |  1  0  0  0  0  0  0  0  1  0  0
    JSR     |  A |            PCoffset11
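The numbers in this example can be checked by hand. The following is a worked sketch of the arithmetic, where SEXT denotes sign extension to 16 bits (a notation assumed here for the check):

```latex
\begin{align*}
\text{PCoffset11} &= 100\,0000\,0100_2 \\
\mathrm{SEXT}(\text{PCoffset11}) &= 1111\,1100\,0000\,0100_2 = \mathrm{xFC04} \\
\mathrm{R7} &\leftarrow \mathrm{x4200} + 1 = \mathrm{x4201} \\
\mathrm{PC} &\leftarrow \mathrm{x4201} + \mathrm{xFC04} = \mathrm{x3E05} \pmod{2^{16}}
\end{align*}
```

Because bit [10] of the offset is 1, the offset is negative (xFC04 is -1020 in decimal), so the subroutine lies before the JSR instruction in memory.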
JSRR
The JSRR instruction is exactly like the JSR instruction except for the addressing mode. JSRR obtains the starting address of the subroutine in exactly the same way the JMP instruction does, that is, it uses the contents of the register specified by bits [8:6] of the instruction.
If the following JSRR instruction is stored in location x420A, and if R5 contains x3002, the execution of the JSRR will cause R7 to be loaded with x420B, and the PC to be loaded with x3002.
15 14 13 12 | 11 | 10  9 |  8  7  6 |  5  4  3  2  1  0
 0  1  0  0 |  0 |  0  0 |  1  0  1 |  0  0  0  0  0  0
    JSRR    |  A |       |   BaseR  |
Question: What important feature does the JSRR instruction provide that the JSR instruction does not provide?
9.2.3 The TRAP Routine for Character Input, Revisited
Let's look again at the keyboard input service routine of Figure 9.4. In particular, let's look at the three-line sequence that occurs at symbolic addresses L1, L2, L3, and L4:
LABEL   LDI     R3,DSR
        BRzp    LABEL
        STI     Reg,DDR
Can the JSR/RET mechanism enable us to replace these four occurrences of the same sequence with a single subroutine? Answer: Yes, almost.
Figure 9.8, our "improved" keyboard input service routine, contains
        JSR     WriteChar
at lines 05, 0B, 11, and 14, and the four-instruction subroutine
WriteChar LDI   R3,DSR
        BRzp    WriteChar
        STI     R2,DDR
        RET
at lines 1D through 20. Note the RET instruction (actually, JMP R7) that is needed to terminate the subroutine.
Note the hedging: almost. In the original sequences starting at L2 and L3, the STI instruction forwards the contents of R0 (not R2) to the DDR. We can fix that easily enough, as follows: In line 09 of Figure 9.8, we use
        LDR     R2,R1,#0
instead of
        LDR     R0,R1,#0
This causes each character in the prompt to be loaded into R2. The subroutine WriteChar forwards each character from R2 to the DDR.
In line 10 of Figure 9.8, we insert the instruction
        ADD     R2,R0,#0
in order to move the keyboard input (which is in R0) into R2. The subroutine WriteChar forwards it from R2 to the DDR. Note that R0 still contains the keyboard input. Furthermore, since no subsequent instruction in the service routine loads R0, R0 still contains the keyboard input after control returns to the user program.
In line 13 of Figure 9.8, we insert the instruction
        LD      R2,Newline
in order to move the "newline" character into R2. The subroutine WriteChar forwards it from R2 to the DDR.
Finally, we note that unlike Figure 9.4, this trap service routine contains several instances of the JSR instruction. Thus any linkage back to the calling program that was contained in R7 when the service routine started execution was long ago overwritten (by the first JSR instruction, actually, in line 03). Therefore, we save R7 in line 02 before we execute our first JSR instruction, and we restore R7 in line 16 after we execute our last JSR instruction.
01         .ORIG   x04A0
02 START   ST      R7,SaveR7
03         JSR     SaveReg
04         LD      R2,Newline
05         JSR     WriteChar
06         LEA     R1,Prompt
07 ;
08 ;
09 Loop    LDR     R2,R1,#0        ; Get next prompt char
0A         BRz     Input
0B         JSR     WriteChar
0C         ADD     R1,R1,#1
0D         BRnzp   Loop
0E ;
0F Input   JSR     ReadChar
10         ADD     R2,R0,#0        ; Move char to R2 for writing
11         JSR     WriteChar       ; Echo to monitor
12 ;
13         LD      R2,Newline
14         JSR     WriteChar
15         JSR     RestoreReg
16         LD      R7,SaveR7
17         RET                     ; JMP R7 terminates the TRAP routine
18 ;
19 SaveR7  .FILL   x0000
1A Newline .FILL   x000A
1B Prompt  .STRINGZ "Input a character>"
1C ;
1D WriteChar LDI   R3,DSR
1E         BRzp    WriteChar
1F         STI     R2,DDR
20         RET                     ; JMP R7 terminates subroutine
21 DSR     .FILL   xFE04
22 DDR     .FILL   xFE06
23 ;
24 ReadChar LDI    R3,KBSR
25         BRzp    ReadChar
26         LDI     R0,KBDR
27         RET
28 KBSR    .FILL   xFE00
29 KBDR    .FILL   xFE02
2A ;
2B SaveReg ST      R1,SaveR1
2C         ST      R2,SaveR2
2D         ST      R3,SaveR3
2E         ST      R4,SaveR4
2F         ST      R5,SaveR5
30         ST      R6,SaveR6
31         RET
32 ;
33 RestoreReg LD   R1,SaveR1
34         LD      R2,SaveR2
35         LD      R3,SaveR3
36         LD      R4,SaveR4
37         LD      R5,SaveR5
38         LD      R6,SaveR6
39         RET
3A SaveR1  .FILL   x0000
3B SaveR2  .FILL   x0000
3C SaveR3  .FILL   x0000
3D SaveR4  .FILL   x0000
3E SaveR5  .FILL   x0000
3F SaveR6  .FILL   x0000
40         .END
Figure 9.8 The LC-3 trap service routine for character input
Figure 9.8 is the actual LC-3 trap service routine provided for keyboard input.
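The same linkage problem arises in any user subroutine that itself executes a TRAP or a JSR. A minimal sketch, assuming a hypothetical subroutine PrintTwice that writes the character in R0 to the monitor twice (the labels PrintTwice, Letter, and SaveR7, and the starting address x3000, are invented for this illustration):

```asm
; Hypothetical example: a subroutine that uses TRAP must protect its own return linkage.
        .ORIG   x3000
        LD      R0,Letter       ; character to print
        JSR     PrintTwice      ; R7 <- address of the HALT that follows
        HALT
;
PrintTwice ST   R7,SaveR7       ; save the linkage back to the caller
        OUT                     ; TRAP x21 overwrites R7
        OUT                     ; and overwrites it again
        LD      R7,SaveR7       ; restore the linkage
        RET                     ; JMP R7, back to the caller
;
Letter  .FILL   x0041           ; ASCII code for 'A'
SaveR7  .BLKW   1
        .END
```

Without the ST and LD of R7, the RET in PrintTwice would jump to the instruction following the second OUT, that is, to the RET itself, and the program would never get back to its caller.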
9.2.4 PUTS: Writing a Character String to the Monitor
Before we leave the example of Figure 9.8, note the code on lines 09 through 0D. This fragment of the service routine is used to write the sequence of characters Input a character to the monitor. A sequence of characters is often referred to as a string of characters or a character string. This fragment is also present in Figure 9.6, with the result that Halting the machine is written to the monitor. In fact, it is so often the case that a user program needs to write a string of characters to the monitor that this function is given its own trap vector in the LC-3 operating system. Thus, if a user program requires a character string to be written to the monitor, it need only provide (in R0) the starting address of the character string, and then invoke TRAP x22. In LC-3 assembly language this TRAP is called PUTS.
Thus, PUTS (or TRAP x22) causes control to be passed to the operating system, and the procedure shown in Figure 9.9 is executed. Note that PUTS is the code of lines 09 through OD of Figure 9.8, with a few minor adjustments.
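From the user program's side, invoking PUTS takes only two instructions. A short sketch, in which the label Greeting, the string itself, and the starting address x3000 are made up for illustration:

```asm
; Hypothetical user program that writes one string with PUTS.
        .ORIG   x3000
        LEA     R0,Greeting     ; R0 <- starting address of the string
        PUTS                    ; same as TRAP x22
        HALT                    ; same as TRAP x25
Greeting .STRINGZ "Hello, LC-3" ; .STRINGZ appends the terminating x0000
        .END
```

The service routine stops at the x0000 word that .STRINGZ places after the last character, which is why the caller passes only a starting address and no length.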
9.2.5 Library Routines
We noted early in this section that there are many uses for the Call/Return mechanism, among them the ability of a user program to call library subroutines that are usually delivered as part of the computer system. Libraries are provided as a convenience to the user programmer. They are legitimately advertised as "productivity enhancers" since they allow the user programmer to use them without having to know or learn much of their inner details. For example, a user programmer knows what a square root is (we abbreviate SQRT), and may need to use sqrt(x) for some value x but does not have a clue as to how to write a program to do it, and probably would rather not have to learn how.
A simple example illustrates the point. We have lost our key and need to get into our apartment. We can lean a ladder up against the wall so that the ladder touches the bottom of our open window, 24 feet above the ground. There is a 10-foot flower bed on the ground along the edge of the wall, so we need to keep the base of the ladder outside the flower bed. How big a ladder do we need so that we can lean it against the wall and climb through the window? Or, stated less colorfully: If the sides of a right triangle are 24 feet and 10 feet, how big is the hypotenuse (see Figure 9.10)?
We remember from high school that Pythagoras answered that one for us:
$c^2 = a^2 + b^2$
; This service routine writes a NULL-terminated string to the console.
; It services the PUTS service call (TRAP x22).
; Inputs: R0 is a pointer to the string to print.
;
        .ORIG   x0450           ; Where this ISR resides
        ST      R7, SaveR7      ; Save R7 for later return
        ST      R0, SaveR0      ; Save other registers that
        ST      R1, SaveR1      ; are needed by this routine
        ST      R3, SaveR3
;
; Loop through each character in the array
;
Loop    LDR     R1, R0, #0      ; Retrieve the character(s)
        BRz     Return          ; If it is 0, done
L2      LDI     R3, DSR
        BRzp    L2
        STI     R1, DDR         ; Write the character
        ADD     R0, R0, #1      ; Increment pointer
        BRnzp   Loop            ; Do it all over again
;
; Return from the request for service call
Return  LD      R3, SaveR3
        LD      R1, SaveR1
        LD      R0, SaveR0
        LD      R7, SaveR7
        RET
;
; Register locations
DSR     .FILL   xFE04
DDR     .FILL   xFE06
SaveR0  .FILL   x0000
SaveR1  .FILL   x0000
SaveR3  .FILL   x0000
SaveR7  .FILL   x0000
        .END
Figure 9.9 The LC-3 PUTS service routine
Figure 9.10 Solving for the length of the hypotenuse (a ladder leaning against a wall, reaching 24 feet up, with its base 10 feet from the wall)
Knowing a and b, we can easily solve for c by taking the square root of the sum of $a^2$ and $b^2$. Taking the sum is not hard; the LC-3 ADD instruction will do the job. The square is also not hard; we can multiply two numbers by a sequence of additions. But how does one get the square root? The structure of our solution is shown in Figure 9.11.
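For the ladder example, the arithmetic the program must carry out is small enough to check by hand. A worked sketch with a = 24 and b = 10:

```latex
\begin{align*}
c &= \sqrt{a^2 + b^2} = \sqrt{24^2 + 10^2} \\
  &= \sqrt{576 + 100} = \sqrt{676} = 26
\end{align*}
```

So a 26-foot ladder just reaches the window; the two squares and the sum are easy for the LC-3, and only the final square root needs a routine we have not yet written.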
The subroutine SQRT has yet to be written. If it were not for the Math Library, the programmer would have to pick up a math book (or get someone to do it for him/her), check out the Newton-Raphson method, and produce the missing subroutine.
However, with the Math Library, the problem pretty much goes away. Since the Math Library supplies a number of subroutines (including SQRT), the user programmer can continue to be ignorant of the likes of Newton-Raphson. The user still needs to know the label of the target address of the library routine that performs the square root function, where to put the argument x, and where to expect the result SQRT(x). But these are easy conventions that can be obtained from the documentation associated with the Math Library.
        ...
        ...
        LD      R0,SIDE1
        BRz     S1
        JSR     SQUARE
S1      ADD     R1,R0,#0
        LD      R0,SIDE2
        BRz     S2
        JSR     SQUARE
S2      ADD     R0,R0,R1
        JSR     SQRT
        ST      R0,HYPOT
        BRnzp   NEXT_TASK
SQUARE  ADD     R2,R0,#0
        ADD     R3,R0,#0
AGAIN   ADD     R2,R2,#-1
        BRz     DONE
        ADD     R0,R0,R3
        BRnzp   AGAIN
DONE    RET
SQRT    ...                     ; R0 <-- SQRT(R0)
        ...                     ;
        ...                     ; How do we write this subroutine?
        ...                     ;
        ...
        RET
SIDE1   .BLKW   1
SIDE2   .BLKW   1
HYPOT   .BLKW   1
        ...
        ...
Figure 9.11 A program fragment to compute the hypotenuse of a right triangle
If the library routine starts at address SQRT, and the argument is provided to the library routine at R0, and the result is obtained from the library routine at R0, Figure 9.11 reduces to Figure 9.12.
Two things are worth noting:
• Thing 1: The programmer no longer has to worry about how to compute the square root function. The library routine does that for us.
• Thing 2: The pseudo-op .EXTERNAL. We already saw in Section 7.4.2 that this pseudo-op tells the assembler that the label (SQRT), which is needed to assemble the .FILL pseudo-op in line 19, will be supplied by some other program fragment (i.e., module) and will be combined with this program fragment (i.e., module) when the executable image is produced. The executable image is the binary module that actually executes. The executable image is produced at link time.
This notion of combining multiple modules at link time to produce an executable image is the normal case. Figure 9.13 illustrates the process. You will see concrete examples of this when we work with the programming language C in the second half of this course.
01         ...
02         ...
03         .EXTERNAL SQRT
04         ...
05         ...
06         LD      R0,SIDE1
07         BRz     1$
08         JSR     SQUARE
09 1$      ADD     R1,R0,#0
0A         LD      R0,SIDE2
0B         BRz     2$
0C         JSR     SQUARE
0D 2$      ADD     R0,R0,R1
0E         LD      R4,BASE
0F         JSRR    R4              ; R0 contains argument x
10         ST      R0,HYPOT
11         BRnzp   NEXT_TASK
12 SQUARE  ADD     R2,R0,#0
13         ADD     R3,R0,#0
14 AGAIN   ADD     R2,R2,#-1
15         BRz     DONE
16         ADD     R0,R0,R3
17         BRnzp   AGAIN
18 DONE    RET
19 BASE    .FILL   SQRT
1A SIDE1   .BLKW   1
1B SIDE2   .BLKW   1
1C HYPOT   .BLKW   1
1D         ...
1E         ...
Figure 9.12 The program fragment of Figure 9.11, using a library routine
Figure 9.13 An executable image constructed from multiple files. Source module A (which contains .EXTERNAL SQRT and a JSRR) is assembled into an object module with its symbol table; at link time it is combined with the object module and symbol table for the math library, and with those of some other separately assembled module, to produce the executable image.
Most application software requires library routines from various libraries. It would be very inefficient for the typical programmer to produce all of them, assuming the typical programmer could produce such routines in the first place. We have mentioned routines from the Math Library. There are also a number of preprocessing routines for producing "pretty" graphic images. There are other routines for a number of other tasks where it would make no sense at all to have the programmer write them from scratch. It is much easier to require only (1) appropriate documentation so that the interface between the library routine and the program that calls that routine is clear, and (2) the use of the proper pseudo-ops such as .EXTERNAL in the source program. The linker can then produce an executable image at link time from the separately assembled modules.
Exercises
9.1 Name some of the advantages of doing I/O through a TRAP routine instead of writing the routine yourself each time you would like your program to perform I/O.
9.2 a. How many trap service routines can be implemented in the LC-3? Why?
b. Why must a RET instruction be used to return from a TRAP routine? Why won't a BR (Unconditional Branch) instruction work instead?
c. How many accesses to memory are made during the processing of a TRAP instruction? Assume the TRAP is already in the IR.
9.3 Refer to Figure 9.6, the HALT service routine.
a. What starts the clock after the machine is HALTed? Hint: How can the HALT service routine return after bit [15] of the machine control register is cleared?
b. Which instruction actually halts the machine?
c. What is the first instruction executed when the machine is started again?
d. Where will the RET of the HALT routine return to?
9.4 Consider the following LC-3 assembly language program:
        .ORIG   x3000
L1      LEA     R1, L1
        AND     R2, R2, x0
        ADD     R2, R2, x2
        LD      R3, P1
L2      LDR     R0, R1, xC
        OUT
        ADD     R3, R3, #-1
        BRz     GLUE
        ADD     R1, R1, R2
        BR      L2
GLUE    HALT
P1      .FILL   xB
        .STRINGZ "HBoeoakteSmtHaotren!s"
        .END
a. After this program is assembled and loaded, what binary pattern is stored in memory location x3005?
b. Which instruction (provide a memory address) is executed after instruction x3005 is executed?
c. Which instruction (provide a memory address) is executed prior to instruction x3006?
d. What is the output of this program?
9.5 The following LC-3 program is assembled and then executed. There are no assemble time or run-time errors. What is the output of this program? Assume all registers are initialized to 0 before the program executes.
        .ORIG   x3000
        ST      R0, x3007
        LEA     R0, LABEL
        TRAP    x22
        TRAP    x25
LABEL   .STRINGZ "FUNKY"
LABEL2  .STRINGZ "HELLO WORLD"
        .END
9.6 The correct operation of the program in Example 9.1 assumes that the person sitting at the keyboard only types capital letters and the value 7. What if the person types a $? A better program would be one that tests the character typed to be sure it really is a capital letter from among the 26 capital letters in the alphabet, and if it is not, takes corrective action. Your job: Augment the program of Example 9.1 to add a test for bad data. That is, write a program that will type the lowercase representation of any capital letter typed and will terminate if anything other than a capital letter is typed.
9.7 Two students wrote interrupt service routines for an assignment. Both service routines did exactly the same work, but the first student accidentally used RET at the end of his routine, while the second student correctly used RTI. There are three errors that arose in the first student's program due to his mistake. Describe any two of them.
9.8 Assume that an integer greater than 2 and less than 32,768 is deposited in memory location A by another module before the program below is executed.
        .ORIG   x3000
        AND     R4, R4, #0
        LD      R0, A
        NOT     R5, R0
        ADD     R5, R5, #2
        ADD     R1, R4, #2
REMOD   JSR     MOD
        BRz     STORE0
        ADD     R7, R1, R5
        BRz     STORE1
        ADD     R1, R1, #1
        BR      REMOD
STORE1  ADD     R4, R4, #1
STORE0  ST      R4, RESULT
        TRAP    x25
MOD     ADD     R2, R0, #0
        NOT     R3, R1
        ADD     R3, R3, #1
DEC     ADD     R2, R2, R3
        BRp     DEC
        RET
A       .BLKW   1
RESULT  .BLKW   1
        .END
In 20 words or fewer, what does the above program do?
9.9 Recall the machine busy example. Suppose the bit pattern indicating which machines are busy and which are free is stored in memory location x4001. Write subroutines that do the following.
a. Check if no machines are busy, and return 1 if none are busy.
b. Check if all machines are busy, and return 1 if all are busy.
c. Check how many machines are busy, and return the number of busy machines.
d. Check how many machines are free, and return the number of free machines.
e. Check if a certain machine number, passed as an argument in R5, is busy, and return 1 if that machine is busy.
f. Return the number of a machine that is not busy.
9.10 The starting address of the trap routine is stored at the address specified in the TRAP instruction. Why isn't the first instruction of the trap routine stored at that address instead? Assume each trap service routine requires at most 16 instructions. Modify the semantics of the LC-3 TRAP instruction so that the trap vector provides the starting address of the service routine.
9.11 Following is part of a program that was fed to the LC-3 assembler. The program is supposed to read a series of input lines from the console into a buffer, search for a particular character, and output the number of times that character occurs in the text. The input text is terminated by an EOT and is guaranteed to be no more than 1,000 characters in length. After the text has been input, the program reads the character to count.
The subroutine labeled COUNT that actually does the counting was written by another person and is located at address x3500. When called, the subroutine expects the address of the buffer to be in R5 and the address of the character to count to be in R6. The buffer should have a NULL to mark the end of the text. It returns the count in R6.
The OUTPUT subroutine that converts the binary count to ASCII digits and displays them was also written by another person and is at address x3600. It expects the number to print to be in R6.
Here is the code that reads the input and calls COUNT:
        .ORIG   x3000
        LEA     R1, BUFFER
G_TEXT  TRAP    x20             ; Get input text
        ADD     R2, R0, x-4
        BRz     G_CHAR
        STR     R0, R1, #0
        ADD     R1, R1, #1
        BRz     G_TEXT
G_CHAR  STR     R2, R1, #0      ; x0000 terminates buffer
        TRAP    x20             ; Get character to count
        ST      R0, S_CHAR
        LEA     R5, BUFFER
        LEA     R6, S_CHAR
        LD      R4, CADDR
        JSRR    R4              ; Count character
        LD      R4, OADDR
        JSRR    R4              ; Convert R6 and display
        TRAP    x25
CADDR   .FILL   x3500           ; Address of COUNT
OADDR   .FILL   x3600           ; Address of OUTPUT
BUFFER  .BLKW   1001
S_CHAR  .FILL   x0000
        .END
There is a problem with this code. What is it, and how might it be fixed? (The problem is not that the code for COUNT and OUTPUT is missing.)
9.12 Consider the following LC-3 assembly language program:
        .ORIG   x3000
        LEA     R0,DATA
        AND     R1,R1,#0
        ADD     R1,R1,#9
LOOP1   ADD     R2,R0,#0
        ADD     R3,R1,#0
LOOP2   JSR     SUB1
        ADD     R4,R4,#0
        BRzp    LABEL
        JSR     SUB2
LABEL   ADD     R2,R2,#1
        ADD     R3,R3,#-1
        BRp     LOOP2
        ADD     R1,R1,#-1
        BRp     LOOP1
        HALT
DATA    .BLKW   10 x0000
SUB1    LDR     R5,R2,#0
        NOT     R5,R5
        ADD     R5,R5,#1
        LDR     R6,R2,#1
        ADD     R4,R5,R6
        RET
SUB2    LDR     R4,R2,#0
        LDR     R5,R2,#1
        STR     R4,R2,#1
        STR     R5,R2,#0
        RET
        .END
Assuming that the memory locations at DATA get filled in before the program executes, what is the relationship between the final values at DATA and the initial values at DATA?
9.13 The following program is supposed to print the number 5 on the screen. It does not work. Why? Answer in no more than ten words, please.
        .ORIG   x3000
        JSR     A
        OUT
        BRnzp   DONE
A       AND     R0,R0,#0
        ADD     R0,R0,#5
        JSR     B
        RET
DONE    HALT
ASCII   .FILL   x0030
B       LD      R1,ASCII
        ADD     R0,R0,R1
        RET
        .END
9.14 Figure 9.6 shows a service routine to stop the computer by clearing the RUN latch, bit lI5Jof the Machine Control Register. The latch is cleared by the instruction in line 14, and the computer stops. What purpose is served by the instructions on lines 19 through IC?
9.15 Suppose we define a new service routine starting at memory location x4000. This routine reads in a character and echoes it to the screen. Suppose memory location x0072 contains the value x4000. The service routine is shown below.
        .ORIG x4000
        ST   R7, SaveR7
        GETC
        OUT
        LD   R7, SaveR7
        RET
SaveR7  .FILL x0000
a. Identify the instruction that will invoke this routine.
b. Will this service routine work? Explain.
9.16 The two code sequences a and b are assembled separately. There is one error that will be caught at assemble time or at link time. Identify and describe why the bug will cause an error, and whether it will be detected at assemble time or link time.
a.
        .ORIG x3200
SQRT    ADD  R0, R0, #0
        ; code to perform square
        ; root function and
        ; return the result in R0
        RET
        .END
b.
        .EXTERNAL SQRT
        .ORIG x3000
        LD   R0,VALUE
        JSR  SQRT
        ST   R0,DEST
        HALT
VALUE   .FILL x30000
DEST    .FILL x0025
        .END
9.17 Shown below is a partially constructed program. The program asks the user his/her name and stores the sentence "Hello, name" as a string starting from the memory location indicated by the symbol HELLO. The program then outputs that sentence to the screen. The program assumes that the user has finished entering his/her name when he/she presses the Enter key, whose ASCII code is x0A. The name is restricted to be not more than 25 characters.
Assuming that the user enters Onur followed by a carriage return when prompted to enter his/her name, the output of the program looks exactly like:
Please enter your name: Onur
Hello, Onur
Insert instructions at (a)-(d) that will complete the program.
         .ORIG x3000
         LEA  R1,HELLO
AGAIN    LDR  R2,R1,#0
         BRz  NEXT
         ADD  R1,R1,#1
         BR   AGAIN
NEXT     LEA  R0,PROMPT
         TRAP x22          ; PUTS
         ------------ (a)
AGAIN2   TRAP x20          ; GETC
         TRAP x21          ; OUT
         ADD  R2,R0,R3
         BRz  CONT
         ------------ (b)
         ------------ (c)
         BR   AGAIN2
CONT     AND  R2,R2,#0
         ------------ (d)
         LEA  R0,HELLO
         TRAP x22          ; PUTS
         TRAP x25          ; HALT
NEGENTER .FILL xFFF6       ; -x0A
PROMPT   .STRINGZ "Please enter your name: "
HELLO    .STRINGZ "Hello, "
         .BLKW #25
         .END
9.18 The program below, when complete, should print the following to the monitor:
ABCFGH
Insert instructions at (a)-(d) that will complete the program.
         .ORIG x3000
         LEA  R1, TESTOUT
BACK_1   LDR  R0, R1, #0
         BRz  NEXT_1
         TRAP x21
         ------------ (a)
         BRnzp BACK_1
NEXT_1   LEA  R1, TESTOUT
BACK_2   LDR  R0, R1, #0
         BRz  NEXT_2
         JSR  SUB_1
         ADD  R1, R1, #1
         BRnzp BACK_2
NEXT_2   ------------ (b)
SUB_1    ------------ (c)
K        LDI  R2, DSR
         ------------ (d)
         STI  R0, DDR
         RET
DSR      .FILL xFE04
DDR      .FILL xFE06
TESTOUT  .STRINGZ "ABC"
         .END
9.19 A local company has decided to build a real LC-3 computer. In order to make the computer work in a network, four interrupt-driven I/O devices are connected. To request service, a device asserts its interrupt request signal (IRQ). This causes a bit to get set in a special LC-3 memory-mapped interrupt control register called INTCTL, which is mapped to address xFF00. The INTCTL register is shown below. When a device requests service, the INT signal in the LC-3 data path is asserted. The LC-3 interrupt service routine determines which device has requested service and calls the appropriate subroutine for that device. If more than one device asserts its IRQ signal at the same time, only the subroutine for the highest priority device is executed. During execution of the subroutine, the corresponding bit in INTCTL is cleared.
[Figure: the INTCTL register, with bits set by the signals IRQH (hard disk), IRQE (Ethernet card), IRQP (printer), and IRQC (CD-ROM), and the resulting INT signal.]
The following labels are used to identify the first instruction of each device subroutine:
HARDDISK ETHERNET PRINTER CDROM
For example, if the highest priority device requesting service is the printer, the interrupt service routine will call the printer subroutine with the following instruction:
        JSR PRINTER
Finish the code in the LC-3 interrupt service routine for the following priority scheme by filling in the spaces labeled (a)-(k). The lower the number, the higher the priority of the device.
1. Hard disk
2. Ethernet card
3. Printer
4. CD-ROM
        LDI  R1, INTCTL
DEV0    LD   R2, ------ (a)
        AND  R2, R2, R1
        BRnz DEV1
        JSR  ---------- (b)
        ---------------- (c)
DEV1    LD   R2, ------ (d)
        AND  R2, R2, R1
        BRnz DEV2
        JSR  ---------- (e)
        ---------------- (f)
DEV2    LD   R2, ------ (g)
        AND  R2, R2, R1
        BRnz DEV3
        JSR  ---------- (h)
        ---------------- (i)
DEV3    JSR  ---------- (j)
END     ---------------- (k)
INTCTL  .FILL xFF00
MASK8   .FILL x0008
MASK4   .FILL x0004
MASK2   .FILL x0002
MASK1   .FILL x0001
chapter 10
And, Finally ... The Stack
We have finished our treatment of the LC-3 ISA. Before moving up another level of abstraction in Chapter 11 to programming in C, there is a particularly important fundamental topic that we should spend some time on: the stack. First we will explain in detail its basic structure. Then, we will describe three uses of the stack: (1) interrupt-driven I/O (the rest of the mechanism that we promised in Section 8.5), (2) a mechanism for performing arithmetic where the temporary storage for intermediate results is a stack instead of general purpose registers, and (3) algorithms for converting integers between 2's complement binary and ASCII character strings. These three examples are just the tip of the iceberg. You will find that the stack has enormous use in much of what you do in computer science and engineering. We suspect you will be discovering new uses for stacks long after this book is just a pleasant memory.
We will close our introduction to the ISA level with the design of a calculator, a comprehensive application that makes use of many of the topics studied in this chapter.
10.1 The Stack: Its Basic Structure
10.1.1 The Stack: An Abstract Data Type
Throughout your future usage (or design) of computers, you will encounter the storage mechanism known as a stack. Stacks can be implemented in many different ways, and we will get to that momentarily. But first, it is important to know that the concept of a stack has nothing to do with how it is implemented. The concept of a stack is the specification of how it is to be accessed. That is, the defining ingredient of a stack is that the last thing you stored in it is the first thing you remove from it. That is what makes a stack different from everything else in the world. Simply put: Last In, First Out, or LIFO.
(a) Initial state: empty
(b) After one push: 1995 quarter
(c) After three pushes (top to bottom): 1996 quarter, 1998 quarter, 1982 quarter, 1995 quarter
(d) After two pops (top to bottom): 1982 quarter, 1995 quarter
Figure 10.1 A coin holder in an auto armrest: example of a stack
In the terminology of computer programming languages, we say the stack is an example of an abstract data type. That is, an abstract data type is a storage mechanism that is defined by the operations performed on it and not at all by the specific manner in which it is implemented. In Chapter 19, we will write programs in C that use linked lists, another example of an abstract data type.
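To make the idea of an abstract data type concrete, here is a minimal sketch in Python (not a language this book uses; the class name Stack and its method names are our own choices for illustration). The caller sees only push, pop, and is_empty. The list that holds the elements is an implementation detail that could be swapped for something else without changing how the stack is used.

```python
class Stack:
    """A stack defined only by its operations: push, pop, is_empty."""

    def __init__(self):
        self._items = []          # hidden implementation detail

    def push(self, value):
        self._items.append(value)

    def pop(self):
        if not self._items:
            raise IndexError("pop from an empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


s = Stack()
for value in (1, 2, 3):
    s.push(value)
popped = [s.pop(), s.pop(), s.pop()]   # last in, first out
```

The values come back in the reverse of the order in which they were pushed, which is all that LIFO requires.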
10.1.2 Two Example Implementations
A coin holder in the armrest of an automobile is an example of a stack. The first quarter you take to pay the highway toll is the last quarter you added to the stack of quarters. As you add quarters, you push the earlier quarters down into the coin holder.
Figure 10.1 shows the behavior of a coin holder. Initially, as shown in Figure 10.1a, the coin holder is empty. The first highway toll is 75 cents, and you give the toll collector a dollar. She gives you 25 cents change, a 1995 quarter, which you insert into the coin holder. The coin holder appears as shown in Figure 10.1b.
There are special terms for the insertion and removal of elements from a stack. We say we push an element onto the stack when we insert it. We say we pop an element from the stack when we remove it.
The second highway toll is $4.25, and you give the toll collector $5.00. She gives you 75 cents change, which you insert into the coin holder: first a 1982 quarter, then a 1998 quarter, and finally, a 1996 quarter. Now the coin holder is as shown in Figure 10.1c. The third toll is 50 cents, and you remove (pop) the top two quarters from the coin holder: the 1996 quarter first and then the 1998 quarter. The coin holder is then as shown in Figure 10.1d.
The coin holder is an example of a stack, precisely because it obeys the LIFO requirement. Each time you insert a quarter, you do so at the top. Each time you remove a quarter, you do so from the top. The last coin you inserted is the first coin you remove; therefore, it is a stack.
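The sequence of tolls can be traced in a few lines of Python. This is only a sketch: we assume a plain list stands in for the coin holder, with the end of the list playing the role of the top, and the variable names are ours.

```python
holder = []                       # top of the coin holder is the end of the list

holder.append(1995)               # change from the first toll
after_one_push = list(holder)

for year in (1982, 1998, 1996):   # change from the second toll
    holder.append(year)
after_three_pushes = list(holder)

paid = [holder.pop(), holder.pop()]   # third toll: pop two quarters
after_two_pops = list(holder)
```

The two quarters used for the third toll are the last two that were inserted, and the 1995 quarter, inserted first, is still at the bottom.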
Another implementation of a stack, sometimes referred to as a hardware stack, is shown in Figure 10.2. Its behavior resembles that of the coin holder we just described. It consists of some number of registers, each of which can store an element. The example of Figure 10.2 contains five registers. As each element is added to the stack or removed from the stack, the elements already on the stack move.
(a) Initial state: Empty = Yes; all five registers unused
(b) After one push: Empty = No; TOP = 18
(c) After three pushes: Empty = No; TOP = 12, followed by 5, 31, 18
(d) After two pops: Empty = No; TOP = 31, followed by 18
Figure 10.2 A stack, implemented in hardware: data entries move
In Figure 10.2a, the stack is initially shown as empty. Access is always via the first element, which is labeled TOP. If the value 18 is pushed onto the stack, we have Figure 10.2b. If the three values, 31, 5, and 12, are pushed (in that order), the result is Figure 10.2c. Finally, if two elements are popped from the stack, we have Figure 10.2d. The distinguishing feature of the stack of Figure 10.2 is that, like the quarters in the coin holder, as each value is added or removed, all the values already on the stack move.
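The movement of data in such a hardware stack can be sketched in Python. The class below is a hypothetical model, not a description of any real circuit: we assume regs[0] plays the role of TOP and that None marks an unused register.

```python
class HardwareStack:
    """Five registers; regs[0] is TOP. Every push or pop shifts all entries."""

    SIZE = 5

    def __init__(self):
        self.regs = [None] * self.SIZE   # None marks an unused register
        self.count = 0

    def push(self, value):
        if self.count == self.SIZE:
            raise OverflowError("stack is full")
        # Shift every entry one register away from TOP, then load TOP.
        self.regs = [value] + self.regs[:-1]
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("stack is empty")
        value = self.regs[0]
        # Shift every entry one register toward TOP.
        self.regs = self.regs[1:] + [None]
        self.count -= 1
        return value


hw = HardwareStack()
hw.push(18)
for v in (31, 5, 12):
    hw.push(v)
state_c = list(hw.regs)          # corresponds to Figure 10.2c
first, second = hw.pop(), hw.pop()
state_d = list(hw.regs)          # corresponds to Figure 10.2d
```

Note that every push and every pop rewrites all five registers. That cost is what the memory implementation of the next section avoids.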
10.1.3 Implementation in Memory
By far the most common implementation of a stack in a computer is as shown in Figure 10.3. The stack consists of a sequence of memory locations along with a mechanism, called the stack pointer, that keeps track of the top of the stack, that is, the location containing the most recent element pushed. Each value pushed is stored in one of the memory locations. In this case, the data already stored on the stack does not physically move.
Address   (a) Initial state   (b) After one push   (c) After three pushes   (d) After two pops
x3FFB     /////               /////                /////                    /////
x3FFC     /////               /////                12 (TOP)                 12
x3FFD     /////               /////                5                        5
x3FFE     /////               /////                31                       31 (TOP)
x3FFF     /////               18 (TOP)             18                       18
R6        x4000               x3FFF                x3FFC                    x3FFE
Figure 10.3 A stack, implemented in memory: data entries do not move
In the example shown in Figure 10.3, the stack consists of five locations, x3FFF through x3FFB. R6 is the stack pointer.
Figure 10.3a shows an initially empty stack. Figure 10.3b shows the stack after pushing the value 18. Figure 10.3c shows the stack after pushing the values 31, 5, and 12, in that order. Figure 10.3d shows the stack after popping the top two elements off the stack. Note that those top two elements (the values 5 and 12) are still present in memory locations x3FFD and x3FFC. However, as we will see momentarily, those values 5 and 12 cannot be accessed from memory, as long as the access to memory is controlled by the stack mechanism.
Push
In Figure 10.3a, R6 contains x4000, the address just ahead of the first (BASE) location in the stack. This indicates that the stack is initially empty. The BASE address of the stack of Figure 10.3 is x3FFF.
We first push the value 18 onto the stack, resulting in Figure 10.3b. The stack pointer provides the address of the last value pushed, in this case, x3FFF, where 18 is stored. Note that the contents of locations x3FFE, x3FFD, x3FFC, and x3FFB are not shown. As will be seen momentarily, the contents of these locations are irrelevant since they can never be accessed provided that locations x3FFF through x3FFB are accessed only as a stack.
When we push a value onto the stack, the stack pointer is decremented and the value stored. The two-instruction sequence
PUSH    ADD  R6,R6,#-1
        STR  R0,R6,#0
pushes the value contained in R0 onto the stack. Thus, for the stack to be as shown in Figure 10.3b, R0 must have contained the value 18 before the two-instruction sequence was executed.
The three values 31, 5, and 12 are pushed onto the stack by loading each in turn into R0, and then executing the two-instruction sequence. In Figure 10.3c, R6 (the stack pointer) contains x3FFC, indicating that 12 was the last element pushed.
Pop
To pop a value from the stack, the value is read and the stack pointer is incremented. The following two-instruction sequence
POP     LDR  R0,R6,#0
        ADD  R6,R6,#1
pops the value contained in the top of the stack and loads it into R0.
If the stack were as shown in Figure 10.3c and we executed the sequence twice, we would pop two values from the stack. In this case, we would first remove the 12, and then the 5. Assuming the purpose of popping two values is to use those two values, we would, of course, have to move the 12 from R0 to some other location before calling POP a second time.
Figure 10.3d shows the stack after that sequence of operations. R6 contains x3FFE, indicating that 31 is now at the top of the stack. Note that the values 12 and 5 are still stored in memory locations x3FFC and x3FFD, respectively. However, since the stack requires that we push by executing the PUSH sequence and pop by executing the POP sequence, we cannot access these two values if we obey the rules. The fancy name for "the rules" is the stack protocol.
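The behavior of R6 and of the memory locations in Figure 10.3 can be checked with a small Python model. This is a sketch under our own assumptions: a dictionary stands in for LC-3 memory, a global variable stands in for R6, and the function names push and pop are ours. Each function mirrors the two-instruction sequence named in its comment.

```python
memory = {}          # sparse model of LC-3 memory: address -> value
R6 = 0x4000          # stack pointer; x4000 means the stack is empty


def push(value):
    """Models ADD R6,R6,#-1 followed by STR R0,R6,#0."""
    global R6
    R6 -= 1
    memory[R6] = value


def pop():
    """Models LDR R0,R6,#0 followed by ADD R6,R6,#1."""
    global R6
    value = memory[R6]
    R6 += 1
    return value


push(18)
for v in (31, 5, 12):
    push(v)
sp_after_pushes = R6         # Figure 10.3c
popped = [pop(), pop()]
sp_after_pops = R6           # Figure 10.3d
```

After the two pops, memory[0x3FFC] and memory[0x3FFD] still hold 12 and 5. Nothing was erased; only the stack pointer moved.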
Underflow
What happens if we now attempt to pop three values from the stack? Since only two values remain on the stack, we would have a problem. Attempting to pop items that have not been previously pushed results in an underflow situation. In our example, we can test for underflow by comparing the stack pointer with x4000, which would be the contents of R6 if there were nothing left on the stack to pop. If UNDERFLOW is the label of a routine that handles the underflow condition, our resulting POP sequence would be
POP     LD   R1,EMPTY
        ADD  R2,R6,R1     ; Compare stack
        BRz  UNDERFLOW    ; pointer with x4000.
        LDR  R0,R6,#0
        ADD  R6,R6,#1
        RET
EMPTY   .FILL xC000       ; EMPTY <-- -x4000
Rather than have the POP routine immediately jump to the UNDERFLOW routine if the POP is unsuccessful, it is often useful to have the POP routine return to the calling program, with the underflow information contained in a register.
A common convention for doing this is to use a register to provide success/failure information. Figure 10.4 is a flowchart showing how the POP routine could be augmented, using R5 to report this success/failure information.
[Flowchart: Underflow? If yes, R5 <-- 1. If no, R0 <-- value popped and R5 <-- 0.]
Figure 10.4 POP routine, including test for underflow
Upon return from the POP routine, the calling program would examine R5 to determine whether the POP completed successfully (R5 = 0), or not (R5 = 1).
Note that since the POP routine reports success or failure in R5, whatever was stored in R5 before the POP routine was called is lost. Thus, it is the job of the calling program to save the contents of R5 before the JSR instruction is executed. Recall from Section 9.1.7 that this is an example of a caller-save situation.
The resulting POP routine is shown in the following instruction sequence. Note that since the instruction immediately preceding the RET instruction sets/clears the condition codes, the calling program can simply test Z to determine whether the POP was completed successfully.
POP     LD   R1,EMPTY
        ADD  R2,R6,R1
        BRz  Failure
        LDR  R0,R6,#0
        ADD  R6,R6,#1
        AND  R5,R5,#0
        RET
Failure AND  R5,R5,#0
        ADD  R5,R5,#1
        RET
EMPTY   .FILL xC000       ; EMPTY <-- -x4000
Overflow
What happens when we run out of available space and we try to push a value onto the stack? Since we cannot store values where there is no room, we have an overflow situation. We can test for overflow by comparing the stack pointer with (in the example of Figure 10.3) x3FFB. If they are equal, we have no room to push another value onto the stack. If OVERFLOW is the label of a routine that handles the overflow condition, our resulting PUSH sequence would be
PUSH    LD   R1,MAX
        ADD  R2,R6,R1
        BRz  OVERFLOW
        ADD  R6,R6,#-1
        STR  R0,R6,#0
        RET
MAX     .FILL xC005       ; MAX <-- -x3FFB
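The constants xC000 and xC005 are the 16-bit 2's complement negatives of x4000 and x3FFB. Adding one of them to R6 yields zero exactly when R6 equals the address being tested, which is why BRz detects the condition. The following Python sketch checks that arithmetic; the helper names negate16 and add16 are ours, introduced only for illustration.

```python
MASK16 = 0xFFFF


def negate16(x):
    """16-bit 2's complement negation: invert the bits and add 1."""
    return (~x + 1) & MASK16


def add16(a, b):
    """16-bit addition, discarding any carry out of bit 15."""
    return (a + b) & MASK16


EMPTY = negate16(0x4000)   # the value stored at label EMPTY
MAX = negate16(0x3FFB)     # the value stored at label MAX

underflow = add16(0x4000, EMPTY) == 0   # R6 = x4000: nothing to pop
overflow = add16(0x3FFB, MAX) == 0      # R6 = x3FFB: no room to push
room_left = add16(0x3FFC, MAX) != 0     # R6 = x3FFC: one slot remains
```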
In the same way that it is useful to have the POP routine return to the calling program with success/failure information, rather than immediately jumping to the UNDERFLOW routine, it is useful to have the PUSH routine act similarly.
We augment the PUSH routine with instructions to store 0 (success) or 1 (failure) in R5, depending on whether or not the push completed successfully. Upon return from the PUSH routine, the calling program would examine R5 to determine whether the PUSH completed successfully (R5 = 0) or not (R5 = 1).
Note again that since the PUSH routine reports success or failure in R5, we have another example of a caller-save situation. That is, since whatever was stored in R5 before the PUSH routine was called is lost, it is the job of the calling program to save the contents of R5 before the JSR instruction is executed.
Also, note again that since the instruction immediately preceding the RET instruction sets/clears the condition codes, the calling program can simply test Z or P to determine whether the PUSH completed successfully (see the following PUSH routine).
PUSH    LD   R1,MAX
        ADD  R2,R6,R1
        BRz  Failure
        ADD  R6,R6,#-1
        STR  R0,R6,#0
        AND  R5,R5,#0
        RET
Failure AND  R5,R5,#0
        ADD  R5,R5,#1
        RET
MAX     .FILL xC005       ; MAX <-- -x3FFB
10.1.4 The Complete Picture
The POP and PUSH routines allow us to use memory locations x3FFF through x3FFB as a five-entry stack. If we wish to push a value onto the stack, we simply load that value into R0 and execute JSR PUSH. To pop a value from the stack into R0, we simply execute JSR POP. If we wish to change the location or the size of the stack, we adjust BASE and MAX accordingly.
Before leaving this topic, we should be careful to clean up one detail. The subroutines PUSH and POP make use of R1, R2, and R5. If we wish to use the values stored in those registers after returning from the PUSH or POP routine, we had best save them before using them. In the case of R1 and R2, it is easiest to save them in the PUSH and POP routines before using them and then to restore them before returning to the calling program. That way, the calling program does not even have to know that these registers are used in the PUSH and POP routines. This is an example of the callee-save situation described in Section 9.1.7. In the case of R5, the situation is different since the calling program does have to know the success or failure that is reported in R5. Thus, it is the job of the calling program to save the contents of R5 before the JSR instruction is executed if the calling program wishes to use the value stored there again. This is an example of the caller-save situation.
The final code for our PUSH and POP operations is shown in Figure 10.5.
;
; Subroutines for carrying out the PUSH and POP functions. This
; program works with a stack consisting of memory locations x3FFF
; (BASE) through x3FFB (MAX). R6 is the stack pointer.
;
POP          ST    R2,Save2      ; Save registers that
             ST    R1,Save1      ; are needed by POP.
             LD    R1,BASE       ; BASE contains -x3FFF.
             ADD   R1,R1,#-1     ; R1 contains -x4000.
             ADD   R2,R6,R1      ; Compare stack pointer to x4000.
             BRz   fail_exit     ; Branch if stack is empty.
             LDR   R0,R6,#0      ; The actual "pop"
             ADD   R6,R6,#1      ; Adjust stack pointer.
             BRnzp success_exit
PUSH         ST    R2,Save2      ; Save registers that
             ST    R1,Save1      ; are needed by PUSH.
             LD    R1,MAX        ; MAX contains -x3FFB.
             ADD   R2,R6,R1      ; Compare stack pointer to x3FFB.
             BRz   fail_exit     ; Branch if stack is full.
             ADD   R6,R6,#-1     ; Adjust stack pointer.
             STR   R0,R6,#0      ; The actual "push"
success_exit LD    R1,Save1      ; Restore original
             LD    R2,Save2      ; register values.
             AND   R5,R5,#0      ; R5 <-- success.
             RET
fail_exit    LD    R1,Save1      ; Restore original
             LD    R2,Save2      ; register values.
             AND   R5,R5,#0
             ADD   R5,R5,#1      ; R5 <-- failure.
             RET
BASE         .FILL xC001         ; BASE contains -x3FFF.
MAX          .FILL xC005
Save1        .FILL x0000
Save2        .FILL x0000
Figure 10.5 The stack protocol
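The protocol of Figure 10.5 can also be modeled at a higher level. The Python sketch below is our own illustration, not a translation of the LC-3 code: the class MemoryStack and its attribute names are assumptions, and the status each method returns plays the role of R5 (0 for success, 1 for failure).

```python
BASE = 0x3FFF        # first location of the stack
MAX = 0x3FFB         # last location of the stack
EMPTY_SP = BASE + 1  # stack pointer value when nothing has been pushed


class MemoryStack:
    def __init__(self):
        self.memory = {}
        self.sp = EMPTY_SP            # plays the role of R6

    def push(self, value):
        """Return 0 on success, 1 on overflow (the value R5 would hold)."""
        if self.sp == MAX:
            return 1
        self.sp -= 1
        self.memory[self.sp] = value
        return 0

    def pop(self):
        """Return (value, status); status is 0 on success, 1 on underflow."""
        if self.sp == EMPTY_SP:
            return None, 1
        value = self.memory[self.sp]
        self.sp += 1
        return value, 0


stack = MemoryStack()
push_status = [stack.push(v) for v in (18, 31, 5, 12, 7, 99)]
pop_results = [stack.pop() for _ in range(6)]
```

The sixth push fails because only five locations are available, and the sixth pop fails because only five values were ever stored. Changing BASE and MAX moves or resizes the stack, just as adjusting the two .FILL values does in the assembly version.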
10.2 Interrupt-Driven I/O (Part 2)
Recall our discussion in Section 8.1.4 about interrupt-driven I/O as an alternative to polling. As you know, in polling, the processor wastes its time spinning its wheels, re-executing again and again the LDI and BR instructions until the Ready bit is set. With interrupt-driven I/O, none of that testing and branching has to go on. Instead, the processor spends its time doing what is hopefully useful work, executing some program, until it is notified that some I/O device needs attention.
You remember that there are two parts to interrupt-driven I/O:
1. the enabling mechanism that allows an I/O device to interrupt the processor when it has input to deliver or is ready to accept output, and
2. the process that manages the transfer of the I/O data.
In Section 8.5, we showed the enabling mechanism for interrupting the processor, that is, asserting the INT signal. We showed how the Ready bit, combined with the Interrupt Enable bit, provided an interrupt request signal. We showed that if the interrupt request signal is at a higher priority level (PL) than the PL of the currently executing process, the INT signal is asserted. We saw (Figure 8.8) that with this mechanism, the processor did not have to waste a lot of time polling. In Section 8.5, we could not study the process that manages the transfer of the I/O data because it involves the use of a stack, and you were not yet familiar with the stack. Now you know about stacks, so we can finish the explanation.
The actual management of the I/O data transfer goes through three stages, as shown in Figure 8.6:
1. Initiate the interrupt.
2. Service the interrupt.
3. Return from the interrupt.
We will discuss these in turn.
10.2.1 Initiate and Service the Interrupt
Recall from Section 8.5 (and Figure 8.8) that an interrupt is initiated because an I/O device with higher priority than the currently running program has caused the INT signal to be asserted. The processor, for its part, tests for the presence of INT each time it completes an instruction cycle. If the test is negative, business continues as usual and the next instruction of the currently running program is fetched. If the test is positive, that next instruction is not fetched.
Instead, preparation is made to interrupt the program that is running and execute the interrupt service routine that deals with the needs of the I/O device that has requested this higher priority service. Two steps must be carried out: (1) Enough of the state of the program that is running must be saved so we can later continue where we left off, and (2) enough of the state of the interrupt service routine must be loaded so we can begin to service the interrupt request.
The State of a Program
The state of a program is a snapshot of the contents of all the resources that the program affects. It includes the contents of the memory locations that are part of the program and the contents of all the general purpose registers. It also includes two very important registers, the PC and the PSR. The PC you are very familiar with; it contains the address of the next instruction to be executed. The PSR, shown here, is the Processor Status Register. It contains several important pieces of information about the status of the running program.
[Figure: PSR layout. Bit [15]: Priv (privilege mode). Bits [10:8]: PL (priority level). Bits [2:0]: N, Z, P (condition codes).]
PSR[15] indicates whether the program is running in privileged (supervisor) or unprivileged (user) mode. In privileged mode, the program has access to important resources not available to user programs. We will see momentarily why that is important in dealing with interrupts. PSR[10:8] specifies the priority level (PL) or sense of urgency of the execution of the program. As has been mentioned previously, there are eight priority levels, PL0 (lowest) to PL7 (highest). Finally, PSR[2:0] is used to store the condition codes. PSR[2] is the N bit, PSR[1] is the Z bit, and PSR[0] is the P bit.
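As a check on the bit positions just described, the following Python sketch splits a 16-bit PSR value into its fields. The function name decode_psr and the sample value x8402 are our own, chosen only for illustration.

```python
def decode_psr(psr):
    """Split a 16-bit PSR value into the fields described in the text."""
    return {
        "privilege": (psr >> 15) & 0x1,   # PSR[15]
        "priority": (psr >> 8) & 0x7,     # PSR[10:8]
        "N": (psr >> 2) & 0x1,            # PSR[2]
        "Z": (psr >> 1) & 0x1,            # PSR[1]
        "P": psr & 0x1,                   # PSR[0]
    }


fields = decode_psr(0x8402)   # hypothetical PSR value chosen for illustration
```

For x8402, bit [15] is 1, bits [10:8] are 100 (PL4), and only the Z condition code is set.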
Saving the State of the Interrupted Program
The first step in initiating the interrupt is to save enough of the state of the program that is running so it can continue where it left off after the I/O device request has been satisfied. That means, in the case of the LC-3, saving the PC and the PSR. The PC must be saved since it knows which instruction should be executed next when the interrupted program resumes execution. The condition codes (the N, Z, and P flags) must be saved since they may be needed by a subsequent conditional branch instruction after the program resumes execution. The priority level of the interrupted program must be saved because it specifies the urgency of the interrupted program with respect to all other programs. When the interrupted program resumes execution, it is important to know what priority level programs can interrupt it again and which ones cannot. Finally, the privilege level of the program must be saved since it contains information about what processor resources the interrupted program can and cannot access.
It is not necessary to save the contents of the general purpose registers since we assume that the service routine will save the contents of any general purpose register it needs before using it, and will restore it before returning to the interrupted program.
The LC-3 saves this state information on a special stack, called the Supervisor Stack, that is used only by programs that execute in privileged mode. A section of memory is dedicated for this purpose. This stack is separate from the User Stack, which is accessed by user programs. Programs access both stacks using R6 as the stack pointer. When accessing the Supervisor Stack, R6 is the Supervisor Stack Pointer. When accessing the User Stack, R6 is the User Stack Pointer. Two internal registers, Saved.SSP and Saved.USP, are used to save the stack pointer not in use. When the privilege mode changes from user to supervisor, the contents of R6 are saved in Saved.USP, and R6 is loaded with the contents of Saved.SSP before processing begins.
That is, before the interrupt service routine starts, R6 is loaded with the contents of the Supervisor Stack Pointer. Then PC and PSR of the interrupted program are pushed onto the Supervisor Stack, where they remain unmolested while the service routine executes.
Loading the State of the Interrupt Service Routine
Once the state of the interrupted program has been safely saved on the Supervisor Stack, the second step is to load the PC and PSR of the interrupt service routine. Interrupt service routines are similar to the trap service routines discussed in Chapter 9. They are program fragments stored in some prearranged set of locations in memory. They service interrupt requests.
Most processors use the mechanism of vectored interrupts. You are familiar with this notion from your study of the trap vector contained in the TRAP instruction. In the case of interrupts, the 8-bit vector is provided by the device that is requesting the processor be interrupted. That is, the I/O device transmits to the processor an 8-bit interrupt vector along with its interrupt request signal and its priority level. The interrupt vector corresponding to the highest priority interrupt request is the one supplied to the processor. It is designated INTV. If the interrupt is taken, the processor expands the 8-bit interrupt vector (INTV) to form a 16-bit address, which is an entry into the Interrupt Vector Table. Recall from Chapter 9 that the Trap Vector Table consists of memory locations x0000 to x00FF, each containing the starting address of a trap service routine. The Interrupt Vector Table consists of memory locations x0100 to x01FF, each containing the starting address of an interrupt service routine. The processor loads the PC with the contents of the address formed by expanding the interrupt vector INTV.
The PSR is loaded as follows: Since no instructions in the service routine have yet executed, PSR[2:0] is initially loaded with zeros. Since the interrupt service routine runs in privileged mode, PSR[15] is set to 0. PSR[10:8] is set to the priority level associated with the interrupt request.
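The two loads just described can be sketched in Python. The function names are ours. The table entry (vector xF1 leading to a service routine at x6200) is taken from the example in Section 10.2.3, while the priority level PL4 is a hypothetical value, since the text does not give one.

```python
INTERRUPT_TABLE_BASE = 0x0100   # Interrupt Vector Table: x0100 to x01FF


def vector_address(intv):
    """Expand an 8-bit interrupt vector to a 16-bit table address."""
    return INTERRUPT_TABLE_BASE | (intv & 0xFF)


def service_routine_psr(priority):
    """PSR for the service routine: PSR[15] = 0, PSR[2:0] = 000."""
    return (0 << 15) | ((priority & 0x7) << 8) | 0b000


table = {0x01F1: 0x6200}          # one table entry, from Section 10.2.3
entry = vector_address(0xF1)
new_pc = table[entry]             # starting address of the service routine
new_psr = service_routine_psr(4)  # PL4 is a hypothetical priority level
```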
This completes the initiation phase and the interrupt service routine is ready to go.
Service the Interrupt
Since the PC contains the starting address of the interrupt service routine, the service routine will execute, and the requirements of the I/O device will be serviced.
For example, the LC-3 keyboard could interrupt the processor every time a key is pressed by someone sitting at the keyboard. The keyboard interrupt vector would indicate the handler to invoke. The handler would then copy the contents of the data register into some preestablished location in memory.
10.2.2 Return from the Interrupt
The last instruction in every interrupt service routine is RTI, return from interrupt. When the processor finally accesses the RTI instruction, all the requirements of the I/O device have been taken care of.
Execution of the RTI instruction (opcode = 1000) consists simply of popping the PSR and the PC from the Supervisor Stack (where they have been resting peacefully) and restoring them to their rightful places in the processor. The condition codes are now restored to what they were when the program was interrupted, in case they are needed by a subsequent BR instruction in the program. PSR[15] and PSR[10:8] now reflect the privilege level and priority level of the about-to-be-resumed program. Similarly, the PC is restored to the address of the instruction that would have been executed next if the program had not been interrupted.
With all these things as they were before the interrupt occurred, the program can resume as if nothing had happened.
[Figure: Program A occupies x3000 to x3010 and is interrupted at the ADD instruction at x3006. The service routine for device B occupies x6200 to x6210 (RTI at x6210) and is interrupted at the AND instruction at x6202. The service routine for device C occupies x6300 to x6315 (RTI at x6315).]
Figure 10.6 Execution flow for interrupt-driven I/O
10.2.3 An Example
We complete the discussion of interrupt-driven I/O with an example.
Suppose program A is executing when I/O device B, having a PL higher than that of A, requests service. During the execution of the service routine for I/O device B, a still more urgent device C requests service.
Figure 10.6 shows the execution flow that must take place.
Program A consists of instructions in locations x3000 to x3010 and was in the middle of executing the ADD instruction at x3006, when device B sent its interrupt request signal and accompanying interrupt vector xF1, causing INT to be asserted.
Note that the interrupt service routine for device B is stored in locations x6200 to x6210; x6210 contains the RTI instruction. Note that the service routine for B was in the middle of executing the AND instruction at x6202, when device C sent its interrupt request signal and accompanying interrupt vector xF2. Since the request associated with device C is of a higher priority than that of device B, INT is again asserted.
Note that the interrupt service routine for device C is stored in locations x6300 to x6315; x6315 contains the RTI instruction.
Let us examine the order of execution by the processor. Figure 10.7 shows several snapshots of the contents of the Supervisor Stack and the PC during the execution of this example.
The processor executes as follows: Figure 10.7a shows the Supervisor Stack and the PC before program A fetches the instruction at x3006. Note that the stack pointer is shown as Saved.SSP, not R6. Since the interrupt has not yet occurred, R6 is pointing to the current contents of the User Stack. The INT signal (caused by an interrupt from device B) is detected at the end of execution of the instruction in x3006. Since the state of program A must be saved on the Supervisor Stack, the first step is to start using the Supervisor Stack. This is done by saving R6 in the Saved.USP register, and loading R6 with the contents of the Saved.SSP register. The address x3007, the PC for the next instruction to be executed in program A, is pushed onto the stack. The PSR of program A, which includes the condition codes produced by the ADD instruction, is pushed onto the stack. The interrupt vector associated with device B is expanded to 16 bits (x01F1), and the contents of x01F1 (x6200) are loaded into the PC. Figure 10.7b shows the stack and PC at this point.
(a) PC = x3006. Nothing from this example is on the Supervisor Stack; Saved.SSP marks its top.
(b) PC = x6200. From the top: PSR of program A, x3007. R6 points to the top.
(c) PC = x6300. From the top: PSR for device B, x6203, PSR of program A, x3007. R6 points to the top.
(d) PC = x6203. From the top: PSR of program A, x3007. R6 points to the top; the popped entries for device B remain in memory above it.
(e) PC = x3007. Saved.SSP marks the top, as in (a); the popped entries remain in memory above it.
Figure 10.7 Snapshots of the contents of the Supervisor Stack and the PC during interrupt-driven I/O
The service routine for device B executes until a higher priority interrupt is detected at the end of execution of the instruction at x6202. The address x6203 is pushed onto the stack, along with the PSR of the service routine for B, which includes the condition codes produced by the AND instruction. The interrupt vector associated with device C is expanded to 16 bits (x01F2), and the contents of x01F2 (x6300) are loaded into the PC. Figure 10.7c shows the Supervisor Stack and PC at this point.
The interrupt service routine for device C executes to completion, finishing with the RTI instruction in x6315. The Supervisor Stack is popped twice, restoring the PSR of the service routine for device B, including the condition codes produced by the AND instruction in x6202, and restoring the PC to x6203. Figure 10.7d shows the stack and PC at this point.
The interrupt service routine for device B resumes execution at x6203 and runs to completion, finishing with the RTI instruction in x6210. The Supervisor Stack is popped twice, restoring the PSR of program A, including the condition codes produced by the ADD instruction in x3006, and restoring the PC to x3007. Finally, since program A is in User Mode, the contents of R6 are stored in Saved.SSP and R6 is loaded with the contents of Saved.USP. Figure 10.7e shows the Supervisor Stack and PC at this point.
Program A resumes execution with the instruction at x3007.
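The nesting described above can be traced with a few lines of Python. This is only a sketch, not the LC-3 microarchitecture: the names take_interrupt and rti are hypothetical, the PSR is represented by a descriptive string rather than real bits, and each saved state (PC plus PSR) is modeled as a single tuple even though it occupies two Supervisor Stack locations in the hardware.

```python
# Sketch of the Supervisor Stack during the nested interrupts of this example.
# Assumption: one tuple stands for the two stack entries (PC and PSR).
supervisor_stack = []


def take_interrupt(return_pc, psr_label, handler_pc):
    """Save the interrupted state and return the handler's starting address."""
    supervisor_stack.append((return_pc, psr_label))
    return handler_pc


def rti():
    """Restore the most recently saved state (last in, first out)."""
    return_pc, psr_label = supervisor_stack.pop()
    return return_pc, psr_label


trace = []
pc = take_interrupt(0x3007, "PSR of program A", 0x6200)  # device B interrupts A
trace.append(pc)
pc = take_interrupt(0x6203, "PSR for device B", 0x6300)  # device C interrupts B
trace.append(pc)
pc, restored = rti()                                     # C's routine ends
trace.append(pc)
pc, restored = rti()                                     # B's routine ends
trace.append(pc)

print([f"x{value:04X}" for value in trace])
```

The last-in, first-out order of the stack is what guarantees that the routine for device B is resumed before program A.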
10.3 Arithmetic Using a Stack
10.3.1 The Stack as Temporary Storage
There are computers that use a stack instead of general purpose registers to store temporary values during a computation. Recall that our ADD instruction
ADD R0,R1,R2
takes source operands from R1 and R2 and writes the result of the addition into R0. We call the LC-3 a three-address machine because all three locations (the two sources and the destination) are explicitly identified. Some computers use a stack for source and destination operands and explicitly identify none of them. The instruction would simply be
ADD
We call such a computer a stack machine, or a zero-address machine. The hardware would know that the source operands are the top two elements on the stack, which would be popped and then supplied to the ALU, and that the result of the addition would be pushed onto the stack.
To perform an ADD on a stack machine, the hardware would execute two pops, an add, and a push. The two pops would remove the two source operands from the stack, the add would compute their sum, and the push would place the result back on the stack. Note that the pop, push, and add are not part of the ISA of that computer, and therefore not available to the programmer. They are control signals that the hardware uses to make the actual pop, push, and add occur. The control signals are part of the microarchitecture, similar to the load enable signals and mux select signals we discussed in Chapters 4 and 5. As is the case with LC-3 instructions LD and ST, and control signals PCMUX and LD.MDR, the programmer simply instructs the computer to ADD, and the microarchitecture does the rest.
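As a rough illustration of the two pops, the add, and the push, the following Python sketch models the stack as a list whose last element is the top. The function name stack_add is hypothetical; on a real stack machine these steps are carried out by control signals, not by instructions the programmer writes.

```python
def stack_add(stack):
    """Zero-address ADD: pop two operands, add them, push the sum."""
    right = stack.pop()        # first pop: the top of the stack
    left = stack.pop()         # second pop: the element below it
    stack.append(left + right)  # push the result
    return stack


print(stack_add([25, 17]))
```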
Sometimes (as we will see in our final example of this chapter), it is useful to process arithmetic using a stack. Intermediate values are maintained on the stack rather than in general purpose registers, such as the LC-3's R0 through R7. Most general purpose microprocessors, including the LC-3, use general purpose registers. Most calculators use a stack.
10.3.2 An Example
For example, suppose we wanted to evaluate (A + B) · (C + D), where A contains 25, B contains 17, C contains 3, and D contains 2, and store the result in E. If the LC-3 had a multiply instruction (we would probably call it MUL), we could use the following program:
LD R0,A
LD R1,B
ADD R0,R0,R1
LD R2,C
LD R3,D
ADD R2,R2,R3
MUL R0,R0,R2
ST R0,E
With a calculator, we could execute the following eight operations:
(1) push 25
(2) push 17
(3) add
(4) push 3
(5) push 2
(6) add
(7) multiply
(8) pop E
with the final result popped being the result of the computation, that is, 210. Figure 10.8 shows a snapshot of the stack after each of the eight operations.
In Section 10.5, we write a program to cause the LC-3 (with keyboard and monitor) to act like such a calculator. We say the LC-3 simulates the calculator when it executes that program.
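The eight calculator operations can be traced with a short Python sketch. It is not the LC-3 program of Section 10.5; the function run and the tuple encoding of the operations are assumptions made only for this illustration.

```python
def run(program):
    """Execute a list of stack operations and return the values popped to memory."""
    stack = []    # the last element is the top of the stack
    memory = {}   # destination of "pop" operations, keyed by name
    for op, *arg in program:
        if op == "push":
            stack.append(arg[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "multiply":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "pop":
            memory[arg[0]] = stack.pop()
        print(f"{op:8} stack = {stack}")
    return memory


program = [
    ("push", 25), ("push", 17), ("add",),
    ("push", 3), ("push", 2), ("add",),
    ("multiply",), ("pop", "E"),
]
result = run(program)
```

After the last operation, result maps E to 210 and the stack is empty again, as in the final snapshot of Figure 10.8.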
But first, let's examine the subroutines we need to conduct the various arithmetic operations.
10.3.3 OpAdd, OpMult, and OpNeg
The calculator we simulate in Section 10.5 has the ability to enter values, add, subtract, multiply, and display results. To add, subtract, and multiply, we need three subroutines:
1. OpAdd, which will pop two values from the stack, add them, and push the result onto the stack.
2. OpMult, which will pop two values from the stack, multiply them, and push the result onto the stack.
3. OpNeg, which will pop the top value, form its 2's complement negative value, and push the result onto the stack.
Figure 10.8 Stack usage during the computation of (25 + 17) · (3 + 2). Panels: (a) Before, (b) After first push, (c) After second push, (d) After first add, (e) After third push, (f) After fourth push, (g) After second add, (h) After multiply, (i) After pop
The OpAdd Algorithm
Figure 10.9 shows the flowchart of the OpAdd algorithm. Basically, the algorithm attempts to pop two values off the stack and, if successful, add them. If the result is within the range of acceptable values (that is, an integer between -999 and +999), then the result is pushed onto the stack.
There are two things that could prevent the OpAdd algorithm from completing successfully: Fewer than two values are available on the stack for source operands, or the result is out of range. In both cases, the stack is put back to the way it was at the start of the OpAdd algorithm, a 1 is stored in R5 to indicate failure, and control is returned to the calling program. If the first pop is unsuccessful, the stack is not changed since the POP routine leaves the stack as it was. If the second of the two pops reports back unsuccessfully, the stack pointer is decremented, which effectively returns the first value popped to the top of the stack. If the result is outside the range of acceptable values, then the stack pointer is decremented twice, returning both values to the top of the stack.
Figure 10.9 Flowchart for OpAdd algorithm
The OpAdd algorithm is shown in Figure 10.10.
;
; Routine to pop the top two elements from the stack,
; add them, and push the sum onto the stack. R6 is
; the stack pointer.
;
OpAdd          JSR   POP           ; Get first source operand.
               ADD   R5,R5,#0      ; Test if POP was successful.
               BRp   Exit          ; Branch if not successful.
               ADD   R1,R0,#0      ; Make room for second operand.
               JSR   POP           ; Get second source operand.
               ADD   R5,R5,#0      ; Test if POP was successful.
               BRp   Restore1      ; Not successful, put back first.
               ADD   R0,R0,R1      ; THE Add.
               JSR   RangeCheck    ; Check size of result.
               BRp   Restore2      ; Out of range, restore both.
               JSR   PUSH          ; Push sum on the stack.
               RET                 ; On to the next task ...
Restore2       ADD   R6,R6,#-1     ; Decrement stack pointer.
Restore1       ADD   R6,R6,#-1     ; Decrement stack pointer.
Exit           RET
Figure 10.10 The OpAdd algorithm
Note that the OpAdd algorithm calls the RangeCheck algorithm. This is a simple test to be sure the result of the computation is within what can successfully be stored in a single stack location. For our purposes, suppose we restrict values to integers in the range -999 to +999. This will come in handy in Section 10.5 when we design our home-brew calculator. The flowchart for the RangeCheck algorithm is shown in Figure 10.11. The LC-3 program that implements this algorithm is shown in Figure 10.12.
Figure 10.11 The RangeCheck algorithm flowchart
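The behavior of OpAdd and RangeCheck described above can be summarized in a Python sketch. It is not a translation of the LC-3 code: the names range_check and op_add are hypothetical, the return value plays the role of R5 (0 for success, 1 for failure), and the sketch inspects the operands instead of popping them and then restoring the stack pointer. The net effect on the stack is the same.

```python
LIMIT = 999  # values are restricted to the range -999 to +999


def range_check(value):
    """Return 0 (success) if value is within range, otherwise 1 (failure)."""
    return 0 if -LIMIT <= value <= LIMIT else 1


def op_add(stack):
    """Replace the top two values with their sum.

    Returns 0 on success and 1 on failure. On failure the stack is left
    exactly as it was found.
    """
    if len(stack) < 2:            # one of the two POPs would fail
        return 1
    total = stack[-1] + stack[-2]
    if range_check(total):        # out of range: keep both operands
        return 1
    del stack[-2:]                # the two POPs
    stack.append(total)           # the PUSH
    return 0


stack = [25, 17]
print(op_add(stack), stack)
```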
The OpMult Algorithm
Figure 10.13 shows the flowchart of the OpMult algorithm, and Figure 10.14 shows the LC-3 program that implements that algorithm. Similar to the OpAdd algorithm, the OpMult algorithm attempts to pop two values off the stack and, if successful, multiplies them. Since the LC-3 does not have a multiply instruction, multiplication is performed, as we have done in the past, as a sequence of adds. Lines 17 to 19 of Figure 10.14 contain the crux of the actual multiply. If the result is within the range of acceptable values, then the result is pushed onto the stack.
If the second of the two pops reports back unsuccessfully, the stack pointer is decremented, which effectively returns the first value popped to the top of the stack. If the result is outside the range of acceptable values, which as before will be indicated by a 1 in R5, then the stack pointer is decremented twice, returning both values to the top of the stack.
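The multiply-by-repeated-addition idea, including the handling of a negative multiplier, can be sketched in Python as follows. The function name multiply_by_adding is hypothetical; the comments relate each variable to the register that plays the same role in the OpMult algorithm, and the stack handling and range check are left out.

```python
def multiply_by_adding(multiplicand, multiplier):
    """Form a product using only additions, as a machine without MUL must."""
    negative = multiplier < 0                         # FLAG kept in R3
    count = -multiplier if negative else multiplier   # R2: loop counter
    product = 0                                       # R0: running sum
    while count > 0:
        product += multiplicand                       # the actual "multiply"
        count -= 1                                    # iteration control
    return -product if negative else product          # adjust sign of result


print(multiply_by_adding(42, 5))
```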
;
; Routine to check that the magnitude of a value is
; between -999 and +999.
;
RangeCheck     LD    R5,Neg999
               ADD   R4,R0,R5      ; Recall that R0 contains the
               BRp   BadRange      ; result being checked.
               LD    R5,Pos999
               ADD   R4,R0,R5
               BRn   BadRange
               AND   R5,R5,#0      ; R5 <-- success
               RET
BadRange       ST    R7,Save       ; R7 is needed by TRAP/RET.
               LEA   R0,RangeErrorMsg ; Output character string
               TRAP  x22
               LD    R7,Save
               AND   R5,R5,#0
               ADD   R5,R5,#1      ; R5 <-- failure
               RET
Neg999         .FILL #-999
Pos999         .FILL #999
Save           .FILL x0000
RangeErrorMsg  .FILL x000A
               .STRINGZ "Error: Number is out of range."
Figure 10.12 The RangeCheck algorithm
Figure 10.13 Flowchart for the OpMult algorithm
01  ;
02  ; Algorithm to pop two values from the stack, multiply them,
03  ; and if their product is within the acceptable range, push
04  ; the result onto the stack. R6 is stack pointer.
05  ;
06  OpMult         AND   R3,R3,#0      ; R3 holds sign of multiplier.
07                 JSR   POP           ; Get first source from stack.
08                 ADD   R5,R5,#0      ; Test for successful POP.
09                 BRp   Exit          ; Failure
0A                 ADD   R1,R0,#0      ; Make room for next POP.
0B                 JSR   POP           ; Get second source operand.
0C                 ADD   R5,R5,#0      ; Test for successful POP.
0D                 BRp   Restore1      ; Failure; restore first POP.
0E                 ADD   R2,R0,#0      ; Moves multiplier, tests sign.
0F                 BRzp  PosMultiplier
10                 ADD   R3,R3,#1      ; Sets FLAG: Multiplier is neg.
11                 NOT   R2,R2
12                 ADD   R2,R2,#1      ; R2 contains -(multiplier).
13  PosMultiplier  AND   R0,R0,#0      ; Clear product register.
14                 ADD   R2,R2,#0
15                 BRz   PushMult      ; Multiplier = 0, Done.
16  ;
17  MultLoop       ADD   R0,R0,R1      ; THE actual "multiply"
18                 ADD   R2,R2,#-1     ; Iteration Control
19                 BRp   MultLoop
1A  ;
1B                 JSR   RangeCheck
1C                 ADD   R5,R5,#0      ; R5 contains success/failure.
1D                 BRp   Restore2
1E  ;
1F                 ADD   R3,R3,#0      ; Test for negative multiplier.
20                 BRz   PushMult
21                 NOT   R0,R0         ; Adjust for
22                 ADD   R0,R0,#1      ; sign of result.
23  PushMult       JSR   PUSH          ; Push product on the stack.
24                 RET
25  Restore2       ADD   R6,R6,#-1     ; Adjust stack pointer.
26  Restore1       ADD   R6,R6,#-1     ; Adjust stack pointer.
27  Exit           RET
Figure 10.14 The OpMult algorithm
The OpNeg Algorithm
We have provided algorithms to add and multiply the top two elements on the stack. To subtract the top two elements on the stack, we can use our OpAdd algorithm if we first replace the top of the stack with its negative value. That is, if the top of the stack contains A, and the second element on the stack contains B, and we wish to pop A, B and push B - A, we can accomplish this by first negating the top of the stack and then performing OpAdd.
The algorithm for negating the element on the top of the stack, OpNeg, is shown in Figure 10.15.
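The following Python sketch illustrates both ideas: forming a 2's complement negative with NOT followed by adding 1, and subtracting by negating the top of the stack and then adding. The names negate16, op_neg, and subtract_top_two are hypothetical, and the range check performed by OpAdd is omitted to keep the sketch short.

```python
def negate16(word):
    """2's complement negation of a 16-bit pattern: NOT, then add 1."""
    return (~word + 1) & 0xFFFF


def op_neg(stack):
    """Negate the top of the stack; return 1 if the stack is empty."""
    if not stack:
        return 1
    stack[-1] = -stack[-1]
    return 0


def subtract_top_two(stack):
    """With A on top and B below it, leave B - A on the stack."""
    if len(stack) < 2:
        return 1
    op_neg(stack)                 # top of stack becomes -A
    a = stack.pop()
    b = stack.pop()
    stack.append(b + a)           # B + (-A)
    return 0


stack = [10, 3]                   # B = 10, A = 3 (top)
subtract_top_two(stack)
print(stack, hex(negate16(0x0005)))
```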
;
; Algorithm to pop the top of the stack, form its negative,
; and push the result onto the stack.
;
OpNeg          JSR   POP           ; Get the source operand.
               ADD   R5,R5,#0      ; Test for successful pop
               BRp   Exit          ; Branch if failure.
               NOT   R0,R0
               ADD   R0,R0,#1      ; Form the negative of source.
               JSR   PUSH          ; Push result onto the stack.
Exit           RET
Figure 10.15 The OpNeg algorithm
10.4 Data Type Conversion
It has been a long time since we talked about data types. We have been exposed to several data types: unsigned integers for address arithmetic, 2's complement integers for integer arithmetic, 16-bit binary strings for logical operations, floating point numbers for scientific computation, and ASCII codes for interaction with input and output devices.
It is important that every instruction be provided with source operands of the data type that the instruction requires. For example, ADD requires operands that are 2's complement integers. If the ALU were supplied with floating point operands, the computer would produce garbage results.
It is not uncommon in high-level language programs to find an instruction of the form A = R + I where R (floating point) and I (2's complement integer) are represented in different data types.
If the operation is to be performed by a floating point adder, then we have a problem with I. To handle the problem, one must first convert the value I from its original data type (2's complement integer) to the data type required by the operation (floating point).
Even the LC-3 has this data type conversion problem. Consider a multiple-digit integer that has been entered via the keyboard. It is represented as a string of ASCII characters. To perform arithmetic on it, you must first convert the value to a 2's complement integer. Consider a 2's complement representation of a value that you wish to display on the monitor. To do so, you must first convert it to an ASCII string.
In this section, we will examine routines to convert between ASCII strings of decimal digits and 2's complement binary integers.
10.4.1 Example: The Bogus Program: 2 + 3 = e
First, let's examine Figure 10.16, a concrete example of how one can get into trouble if one is not careful about keeping track of the data type of each of the values with which one is working.
Suppose we wish to enter two digits from the keyboard, add them, and display the results on the monitor. At first blush, we write the simple program of Figure 10.16. What happens?
TRAP x23          ; Input from the keyboard.
ADD  R1,R0,#0     ; Make room for another input.
TRAP x23          ; Input another character.
ADD  R0,R1,R0     ; Add the two inputs.
TRAP x21          ; Display result on the monitor.
TRAP x25          ; Halt.
Figure 10.16 ADDITION without paying attention to data types
Suppose the first digit entered via the keyboard is a 2 and the second digit entered via the keyboard is a 3. What will be displayed on the monitor before the program terminates? The value loaded into R0 as a result of entering a 2 is the ASCII code for 2, which is x0032. When the 3 is entered, the ASCII code for 3, which is x0033, will be loaded. Thus, the ADD instruction will add the two binary strings x0032 and x0033, producing x0065. When that value is displayed on the monitor, it will be treated as an ASCII code. Since x0065 is the ASCII code for a lowercase e, that is what will be displayed on the monitor.
The reason why we did not get 5 (which, at last calculation, was the correct result when adding 2 + 3) was that we didn't (a) convert the two input characters from ASCII to 2's complement integers before performing addition and (b) convert the result back to ASCII before displaying it on the monitor.
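The arithmetic behind the bogus result, and the two conversions that repair it, can be checked with a short Python sketch. The variable names are hypothetical; ord and chr stand in for reading a character's ASCII code and displaying a code as a character.

```python
ASCII_ZERO = 0x30                      # ASCII code for the character '0'

first, second = "2", "3"

# Adding the raw ASCII codes: x0032 + x0033 = x0065, which displays as 'e'.
bogus_code = ord(first) + ord(second)
bogus = chr(bogus_code)

# (a) convert each input to an integer, add, then (b) convert back to ASCII.
total = (ord(first) - ASCII_ZERO) + (ord(second) - ASCII_ZERO)
correct = chr(total + ASCII_ZERO)

print(hex(bogus_code), bogus, correct)
```

As in the exercise that follows, this assumes the sum is a single digit.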
Exercise: Correct Figure 10.16 so that it will add two single-digit positive integers and give a single-digit positive sum. Assume that the two digits being added do in fact produce a single-digit sum.
10.4.2 ASCII to Binary
It is often useful to deal with numbers that require more than one digit to express them. Figure 10.17 shows the ASCII representation of the three-digit number 295, stored as an ASCII string in three consecutive LC-3 memory locations, starting at ASCIIBUFF. R1 contains the number of decimal digits in the number.
Note that in Figure 10.17, a whole LC-3 word (16 bits) is allocated for each ASCII character. One can (and, in fact, more typically, one does) store each ASCII character in a single byte of memory. In this example, we have decided to give each ASCII character its own word of memory in order to simplify the algorithm.
ASCIIBUFF      x0032
               x0039
               x0035
R1             3
Figure 10.17 The ASCII representation of 295 stored in consecutive memory locations
Figure 10.18 shows the flowchart for converting the ASCII representation of Figure 10.17 into a binary integer. The value represented must be in the range 0 to +999, that is, it is limited to three decimal digits.
The algorithm systematically takes each digit, converts it from its ASCII code to its binary code by stripping away all but the last four bits, and then uses it to index into a table of 10 binary values, each corresponding to the value of one of the 10 digits. That value is then added to R0. R0 is used to accumulate the contributions of all the digits. The result is returned in R0.
Figure 10.18 Flowchart, algorithm for ASCII-to-binary conversion
Figure 10.19 shows the LC-3 program that implements this algorithm.
;
; This algorithm takes an ASCII string of three decimal digits and
; converts it into a binary number. R0 is used to collect the result.
; R1 keeps track of how many digits are left to process. ASCIIBUFF
; contains the most significant digit in the ASCII string.
;
ASCIItoBinary  AND   R0,R0,#0      ; R0 will be used for our result.
               ADD   R1,R1,#0      ; Test number of digits.
               BRz   DoneAtoB      ; There are no digits.
;
               LD    R3,NegASCIIOffset ; R3 gets xFFD0, i.e., -x0030.
               LEA   R2,ASCIIBUFF
               ADD   R2,R2,R1
               ADD   R2,R2,#-1     ; R2 now points to "ones" digit.
;
               LDR   R4,R2,#0      ; R4 <-- "ones" digit
               ADD   R4,R4,R3      ; Strip off the ASCII template.
               ADD   R0,R0,R4      ; Add ones contribution.
;
               ADD   R1,R1,#-1
               BRz   DoneAtoB      ; The original number had one digit.
               ADD   R2,R2,#-1     ; R2 now points to "tens" digit.
;
               LDR   R4,R2,#0      ; R4 <-- "tens" digit
               ADD   R4,R4,R3      ; Strip off ASCII template.
               LEA   R5,LookUp10   ; LookUp10 is BASE of tens values.
               ADD   R5,R5,R4      ; R5 points to the right tens value.
               LDR   R4,R5,#0
               ADD   R0,R0,R4      ; Add tens contribution to total.
;
               ADD   R1,R1,#-1
               BRz   DoneAtoB      ; The original number had two digits.
               ADD   R2,R2,#-1     ; R2 now points to "hundreds" digit.
;
               LDR   R4,R2,#0      ; R4 <-- "hundreds" digit
               ADD   R4,R4,R3      ; Strip off ASCII template.
               LEA   R5,LookUp100  ; LookUp100 is hundreds BASE.
               ADD   R5,R5,R4      ; R5 points to hundreds value.
               LDR   R4,R5,#0
               ADD   R0,R0,R4      ; Add hundreds contribution to total.
;
DoneAtoB       RET
NegASCIIOffset .FILL xFFD0
ASCIIBUFF      .BLKW 4
LookUp10       .FILL #0
               .FILL #10
               .FILL #20
               .FILL #30
               .FILL #40
               .FILL #50
               .FILL #60
               .FILL #70
               .FILL #80
               .FILL #90
;
LookUp100      .FILL #0
               .FILL #100
               .FILL #200
               .FILL #300
               .FILL #400
               .FILL #500
               .FILL #600
               .FILL #700
               .FILL #800
               .FILL #900
Figure 10.19 ASCII-to-binary conversion routine
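For readers who want to check the table-lookup conversion by hand, here is a Python sketch of the same steps. It is an illustration rather than a translation: the names ascii_to_binary, LOOKUP_10, and LOOKUP_100 are hypothetical, the buffer is a Python list of ASCII codes with the most significant digit first, and count plays the role of R1.

```python
ASCII_ZERO = 0x30
LOOKUP_10 = [10 * digit for digit in range(10)]     # 0, 10, ..., 90
LOOKUP_100 = [100 * digit for digit in range(10)]   # 0, 100, ..., 900


def ascii_to_binary(buffer, count):
    """Convert up to three ASCII digits (most significant first) to an integer."""
    result = 0
    if count == 0:                       # there are no digits
        return result
    index = count - 1                    # points to the ones digit
    result += buffer[index] - ASCII_ZERO
    if count == 1:
        return result
    index -= 1                           # points to the tens digit
    result += LOOKUP_10[buffer[index] - ASCII_ZERO]
    if count == 2:
        return result
    index -= 1                           # points to the hundreds digit
    result += LOOKUP_100[buffer[index] - ASCII_ZERO]
    return result


print(ascii_to_binary([0x32, 0x39, 0x35], 3))
```

Using tables avoids the need for a multiply when weighting the tens and hundreds digits.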
10.4.3 Binary to ASCII
Similarly, it is useful to convert the 2's complement integer into an ASCII string so that it can be displayed on the monitor. Figure 10.20 shows the algorithm for converting a 2's complement integer stored in R0 into an ASCII string stored in four consecutive memory locations, starting at ASCIIBUFF. The value initially in R0 is restricted to be within the range -999 to +999. After the algorithm completes execution, ASCIIBUFF contains the sign of the value initially stored in R0. The following three locations contain the three ASCII codes corresponding to the three decimal digits representing its magnitude.
The algorithm works as follows: First, the sign of the value is determined, and the appropriate ASCII code is stored. The value in R0 is replaced by its absolute value. The algorithm determines the hundreds-place digit by repeatedly subtracting 100 from R0 until the result goes negative. This is next repeated for the tens-place digit. The value left is the ones digit.
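The repeated-subtraction steps just described can be sketched in Python. The function name binary_to_ascii is hypothetical, and the four-element list it returns stands in for the four locations starting at ASCIIBUFF; the input is assumed to lie in the range -999 to +999.

```python
ASCII_ZERO, ASCII_PLUS, ASCII_MINUS = 0x30, 0x2B, 0x2D


def binary_to_ascii(value):
    """Return four ASCII codes: sign, hundreds digit, tens digit, ones digit."""
    buffer = [ASCII_PLUS if value >= 0 else ASCII_MINUS]
    value = abs(value)                 # work with the magnitude
    for place in (100, 10):
        digit = ASCII_ZERO
        value -= place
        while value >= 0:              # count subtractions until negative
            digit += 1
            value -= place
        value += place                 # correct for one-too-many subtracts
        buffer.append(digit)
    buffer.append(ASCII_ZERO + value)  # what is left is the ones digit
    return buffer


print("".join(chr(code) for code in binary_to_ascii(295)))
```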
Exercise: [Very challenging] Suppose the decimal number is arbitrarily long. Rather than store a table of 10 values for the thousands-place digit, another table for the 10 ten-thousands-place digit, and so on, design an algorithm to do the conversion without resorting to any tables whatsoever. See Exercise 10.20.
Exercise: This algorithm always produces a string of four characters independent of the sign and magnitude of the integer being converted. Devise an algorithm that eliminates unnecessary characters in common representations, that is, an algorithm that does not store leading zeros nor a leading + sign. See Exercise 10.22.
;
; This algorithm takes the 2's complement representation of a signed
; integer within the range -999 to +999 and converts it into an ASCII
; string consisting of a sign digit, followed by three decimal digits.
; R0 contains the initial value being converted.
;
BinarytoASCII  LEA   R1,ASCIIBUFF  ; R1 points to string being generated.
               ADD   R0,R0,#0      ; R0 contains the binary value.
               BRn   NegSign
               LD    R2,ASCIIplus  ; First store the ASCII plus sign.
               STR   R2,R1,#0
               BRnzp Begin100
NegSign        LD    R2,ASCIIminus ; First store ASCII minus sign.
               STR   R2,R1,#0
               NOT   R0,R0         ; Convert the number to absolute
               ADD   R0,R0,#1      ; value; it is easier to work with.
;
Begin100       LD    R2,ASCIIoffset ; Prepare for "hundreds" digit.
;
               LD    R3,Neg100     ; Determine the hundreds digit.
Loop100        ADD   R0,R0,R3
               BRn   End100
               ADD   R2,R2,#1
               BRnzp Loop100
;
End100         STR   R2,R1,#1      ; Store ASCII code for hundreds digit.
               LD    R3,Pos100
               ADD   R0,R0,R3      ; Correct R0 for one-too-many subtracts.
;
               LD    R2,ASCIIoffset ; Prepare for "tens" digit.
;
Begin10        LD    R3,Neg10      ; Determine the tens digit.
Loop10         ADD   R0,R0,R3
               BRn   End10
               ADD   R2,R2,#1
               BRnzp Loop10
;
End10          STR   R2,R1,#2      ; Store ASCII code for tens digit.
               ADD   R0,R0,#10     ; Correct R0 for one-too-many subtracts.
Begin1         LD    R2,ASCIIoffset ; Prepare for "ones" digit.
               ADD   R2,R2,R0
               STR   R2,R1,#3
               RET
;
ASCIIplus      .FILL x002B
ASCIIminus     .FILL x002D
ASCIIoffset    .FILL x0030
Neg100         .FILL xFF9C
Pos100         .FILL x0064
Neg10          .FILL xFFF6
Figure 10.20 Binary-to-ASCII conversion routine
10.5 Our Final Example: The Calculator
We conclude Chapter 10 with the code for a comprehensive example: the simulation of a calculator. The intent is to demonstrate the use of many of the concepts discussed thus far, as well as to show an example of well-documented, clearly written code, where the example is much more complicated than what can fit on one or two pages. The calculator simulation consists of 11 separate routines.
You are encouraged to study this example before moving on to Chapter 11 and High-Level Language Programming.
The calculator works as follows: We use the keyboard to input commands and decimal values. We use the monitor to display results. We use a stack to perform arithmetic operations as described in Section 10.3. Values entered and displayed are restricted to three decimal digits, that is, only values between -999 and +999, inclusive. The available operations are
X Exit the simulation.
D Display the value at the top of the stack.
C Clear all values from the stack.
+ Replace the top two elements on the stack with their sum.
* Replace the top two elements on the stack with their product.
- Negate the top element on the stack.
Enter Push the value typed on the keyboard onto the top of the stack.
Figure 10.21 is a flowchart that gives an overview of our calculator simulation. Simulation of the calculator starts with initialization, which includes setting R6, the stack pointer, to an empty stack. Then the user sitting at the keyboard is prompted for input.
Input is echoed, and the calculator simulation systematically tests the character to determine the user's command. Depending on the user's command, the calculator simulation carries out the corresponding action, followed by a prompt for another command. The calculator simulation continues in this way until the user presses X, signaling that the user is finished with the calculator.
Eleven routines comprise the calculator simulation. Figure 10.22 is the main algorithm. Figure 10.23 takes an ASCII string of digits typed by a user, converts it to a binary number, and pushes the binary number onto the top of the stack. Figure 10.19 provides the ASCII-to-binary conversion routine. Figure 10.26 pops the entry on the top of the stack, converts it to an ASCII string, and displays the ASCII string on the monitor. Figure 10.20 provides the binary-to-ASCII conversion routine. Figures 10.10 (OpAdd), 10.14 (OpMult), and 10.15 (OpNeg) supply the basic arithmetic algorithms using a stack. Figures 10.24 and 10.25 contain versions of the POP and PUSH routines tailored for this application. Finally, Figure 10.27 clears the stack.
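Before reading the LC-3 routines, it may help to see the command dispatch as a compact Python sketch. Several simplifications are assumptions of this sketch and not part of the calculator itself: commands arrive as a list of already separated tokens rather than as echoed keystrokes, the limit on the number of digits typed and the limit on stack depth are not modeled, and the function name calculator is hypothetical.

```python
UNDERFLOW = "Error: Too Few Values on the Stack."
OUT_OF_RANGE = "Error: Number is out of range."


def calculator(tokens):
    """Process commands in order; return the lines that would be displayed."""
    stack, shown = [], []
    for token in tokens:
        if token == "X":                       # exit the simulation
            break
        elif token == "C":                     # clear the stack
            stack.clear()
        elif token == "D":                     # display top of stack
            shown.append(f"{stack[-1]:+04d}" if stack else UNDERFLOW)
        elif token == "-":                     # negate top of stack
            if stack:
                stack[-1] = -stack[-1]
            else:
                shown.append(UNDERFLOW)
        elif token in ("+", "*"):              # add or multiply top two
            if len(stack) < 2:
                shown.append(UNDERFLOW)
                continue
            a, b = stack[-1], stack[-2]
            result = a + b if token == "+" else a * b
            if -999 <= result <= 999:
                stack[-2:] = [result]
            else:
                shown.append(OUT_OF_RANGE)     # operands stay on the stack
        else:                                  # otherwise, a value to push
            stack.append(int(token))
    return shown


print(calculator(["25", "17", "+", "3", "2", "+", "*", "D", "X"]))
```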
Figure 10.21 The calculator, overview
;
; The Calculator, Main Algorithm
;
               LEA   R6,StackBase  ; Initialize the stack.
               ADD   R6,R6,#-1     ; R6 is stack pointer.
               LEA   R0,PromptMsg
               PUTS
               GETC
               OUT
;
; Check the command
;
Test           LD    R1,NegX       ; Check for X.
               ADD   R1,R1,R0
               BRz   Exit
;
               LD    R1,NegC       ; Check for C.
               ADD   R1,R1,R0
               BRz   OpClear       ; See Figure 10.27.
;
               LD    R1,NegPlus    ; Check for +
               ADD   R1,R1,R0
               BRz   OpAdd         ; See Figure 10.10.
;
               LD    R1,NegMult    ; Check for *
               ADD   R1,R1,R0
               BRz   OpMult        ; See Figure 10.14.
;
               LD    R1,NegMinus   ; Check for -
               ADD   R1,R1,R0
               BRz   OpNeg         ; See Figure 10.15.
;
               LD    R1,NegD       ; Check for D
               ADD   R1,R1,R0
               BRz   OpDisplay     ; See Figure 10.26.
;
; Then we must be entering an integer
;
               BRnzp PushValue     ; See Figure 10.23.
;
NewCommand     LEA   R0,PromptMsg
               PUTS
               GETC
               OUT
               BRnzp Test
Exit           HALT
PromptMsg      .FILL x000A
               .STRINGZ "Enter a command:"
NegX           .FILL xFFA8
NegC           .FILL xFFBD
NegPlus        .FILL xFFD5
NegMinus       .FILL xFFD3
NegMult        .FILL xFFD6
NegD           .FILL xFFBC
Figure 10.22 The calculator's main algorithm
Note that a few changes are needed if the various routines are to work with the main program of Figure 10.22. For example, OpAdd, OpMult, and OpNeg must all terminate with
BRnzp NewCommand
instead of RET. Also, some labels are used in more than one subroutine. If the subroutines are assembled separately and certain labels are identified as .EXTERNAL (see Section 9.2.5), then the use of the same label in more than one subroutine is not a problem. However, if the entire program is assembled as a single module, then duplicate labels are not allowed. In that case, one must rename some of the labels (e.g., Restore1, Restore2, Exit, and Save) so that all labels are unique.
;
; This algorithm takes a sequence of ASCII digits typed by the user,
; converts it into a binary value by calling the ASCIItoBinary
; subroutine, and pushes the binary value onto the stack.
;
PushValue      LEA   R1,ASCIIBUFF  ; R1 points to string being
               LD    R2,MaxDigits  ; generated.
;
ValueLoop      ADD   R3,R0,#-10    ; Test for carriage return (x000A).
               BRz   GoodInput
               ADD   R2,R2,#0
               BRz   TooLargeInput
               ADD   R2,R2,#-1     ; Still room for more digits.
               STR   R0,R1,#0      ; Store last character read.
               ADD   R1,R1,#1
               GETC
               OUT                 ; Echo it.
               BRnzp ValueLoop
;
GoodInput      LEA   R2,ASCIIBUFF
               NOT   R2,R2
               ADD   R2,R2,#1
               ADD   R1,R1,R2      ; R1 now contains no. of char.
               JSR   ASCIItoBinary
               JSR   PUSH
               BRnzp NewCommand
;
TooLargeInput  GETC                ; Spin until carriage return.
               OUT
               ADD   R3,R0,#-10
               BRnp  TooLargeInput
               LEA   R0,TooManyDigits
               PUTS
               BRnzp NewCommand
TooManyDigits  .FILL x000A
               .STRINGZ "Too many digits"
MaxDigits      .FILL x0003
Figure 10.23 The calculator's PushValue routine
282
chapter 10 And, Finally ... The Stack
;
; This algorithm POPs a value from the stack and puts it in
; R0 before returning to the calling program. R5 is used to
; report success (R5 = 0) or failure (R5 = 1) of the POP operation.
;
POP           LEA     R0,StackBase
              NOT     R0,R0
              ADD     R0,R0,#2          ; R0 = -(addr. of StackBase - 1)
              ADD     R0,R0,R6          ; R6 = StackPointer
              BRz     Underflow
              LDR     R0,R6,#0          ; The actual POP
              ADD     R6,R6,#1          ; Adjust StackPointer
              AND     R5,R5,#0          ; R5 <-- success
              RET
Underflow     ST      R7,Save           ; TRAP/RET needs R7.
              LEA     R0,UnderflowMsg
              PUTS                      ; Print error message.
              LD      R7,Save           ; Restore R7.
              AND     R5,R5,#0
              ADD     R5,R5,#1          ; R5 <-- failure
              RET
Save          .FILL   x0000
StackMax      .BLKW   9
StackBase     .FILL   x0000
UnderflowMsg  .FILL   x000A
              .STRINGZ "Error: Too Few Values on the Stack."
Figure 10.24 The calculator's POP routine

;
; This algorithm PUSHes on the stack the value stored in R0.
; R5 is used to report success (R5 = 0) or failure (R5 = 1) of
; the PUSH operation.
;
PUSH          ST      R1,Save1          ; R1 is needed by this routine.
              LEA     R1,StackMax
              NOT     R1,R1
              ADD     R1,R1,#1          ; R1 = -addr. of StackMax
              ADD     R1,R1,R6          ; R6 = StackPointer
              BRz     Overflow
              ADD     R6,R6,#-1         ; Adjust StackPointer for PUSH.
              STR     R0,R6,#0          ; The actual PUSH
              BRnzp   Success_exit
Overflow      ST      R7,Save
              LEA     R0,OverflowMsg
              PUTS
              LD      R7,Save
              LD      R1,Save1          ; Restore R1.
              AND     R5,R5,#0
              ADD     R5,R5,#1          ; R5 <-- failure
              RET
Success_exit  LD      R1,Save1          ; Restore R1.
              AND     R5,R5,#0          ; R5 <-- success
              RET
Save          .FILL   x0000
Save1         .FILL   x0000
OverflowMsg   .STRINGZ "Error: Stack is Full."
Figure 10.25 The calculator's PUSH routine

;
; This algorithm calls BinarytoASCII to convert the 2's complement
; number on the top of the stack into an ASCII character string, and
; then calls PUTS to display that number on the screen.
;
OpDisplay     JSR     POP               ; R0 gets the value to be displayed.
              ADD     R5,R5,#0
              BRp     NewCommand        ; POP failed, nothing on the stack.
              JSR     BinarytoASCII
              LD      R0,NewlineChar
              OUT
              LEA     R0,ASCIIBUFF
              PUTS
              ADD     R6,R6,#-1         ; Push displayed number back on stack.
              BRnzp   NewCommand
NewlineChar   .FILL   x000A
Figure 10.26 The calculator's display routine

;
; This routine clears the stack by resetting the stack pointer (R6).
;
OpClear       LEA     R6,StackBase      ; Initialize the stack.
              ADD     R6,R6,#1          ; R6 is stack pointer.
              BRnzp   NewCommand
Figure 10.27 The OpClear routine

Exercises
10.1 What are the defining characteristics of a stack?
10.2 What is an advantage to using the model in Figure 10.3 to implement a stack versus the model in Figure 10.2?
10.3 The LC-3 ISA has been augmented with the following Push and Pop instructions. Push Rn pushes the value in Register n onto the stack. Pop Rn removes a value from the stack and loads it into Rn. The figure below shows a snapshot of the eight registers of the LC-3 BEFORE and AFTER the following six stack operations are performed. Identify (a)-(d).

      PUSH R4
      PUSH (a)
      POP  (b)
      PUSH (c)
      POP  R2
      POP  (d)

            BEFORE    AFTER
      R0    x0000     x1111
      R1    x1111     x1111
      R2    x2222     x3333
      R3    x3333     x3333
      R4    x4444     x4444
      R5    x5555     x5555
      R6    x6666     x6666
      R7    x7777     x4444
10.4 Write a function that implements another stack function, peek. Peek returns the value of the first element on the stack without removing the element from the stack. Peek should also do underflow error checking. (Why is overflow error checking unnecessary?)
10.5 How would you check for underflow and overflow conditions if you implemented a stack using the model in Figure 10.2? Rewrite the PUSH and POP routines to model a stack implemented as in Figure 10.2, that is, one in which the data entries move with each operation.
10.6 Rewrite the PUSH and POP routines such that the stack on which they operate holds elements that take up two memory locations each.
10.7 Rewrite the PUSH and POP routines to handle stack elements of arbitrary sizes.
10.8 The following operations are performed on a stack:
PUSH A, PUSH B, POP, PUSH C, PUSH D, POP, PUSH E,
POP, POP, PUSH F
a. What does the stack contain after the PUSH F?
b. At which point does the stack contain the most elements?
Without removing the elements left on the stack from the previous operations, we perform:
PUSH G, PUSH H, PUSH I, PUSH J, POP, PUSH K, POP, POP, POP, PUSH L, POP, POP, PUSH M
c. What does the stack contain now?
10.9 The input stream of a stack is a list of all the elements we pushed onto the stack, in the order that we pushed them. The input stream from Exercise 10.8 was ABCDEFGHIJKLM.
The output stream is a list of all the elements that are popped off the stack, in the order that they are popped off.
a. What is the output stream from Exercise 10.8? Hint: BDE ...
b. If the input stream is ZYXWVUTSR, create a sequence of pushes and pops such that the output stream is YXVUWZSRT.
c. If the input stream is ZYXW, how many different output streams can be created?
10.10 During the initiation of the interrupt service routine, the N, Z, and P condition codes are saved on the stack. Show by means of a simple example how incorrect results would be generated if the condition codes were not saved.
10.11 In the example of Section 10.2.3, what are the contents of locations x01F1 and x01F2? They are part of a larger structure. Provide a name for that structure. (Hint: See Table A.3.)
10.12 Expand the example of Section 10.2.3 to include an interrupt by a still more urgent device D while the service routine of device C is executing the instruction at x6310. Assume device D's interrupt vector is xF3. Assume the interrupt service routine is stored in locations x6400 to x6412. Show the contents of the stack and PC at each relevant point in the execution flow.
10.13 Suppose device D in Exercise 10.12 has a lower priority than device C but a higher priority than device B. Rework Exercise 10.12 with this new wrinkle.
10.14 Write an interrupt handler to accept keyboard input as follows: A buffer is allocated to memory locations x4000 through x40FE. The interrupt handler must accept the next character typed and store it in the next "empty" location in the buffer. Memory location x40FF is used as a pointer to the next available empty buffer location. If the buffer is full (i.e., if a character has been stored in location x40FE), the interrupt handler must display on the screen: "Character cannot be accepted; input buffer full."
10.15 Consider the interrupt handler of Exercise 10.14. The buffer is modified as follows: The buffer is allocated to memory locations x4000 through x40FC. Location x40FF contains, as before, the address of the next available empty location in the buffer. Location x40FE contains the address of the oldest character in the buffer. Location x40FD contains the number of characters in the buffer. Other programs can remove characters from the buffer. Modify the interrupt handler so that, after x40FC is filled, the next location filled is x4000, assuming the character in x4000 has been previously removed. As before, if the buffer is full, the interrupt handler must display on the screen: "Character cannot be accepted; input buffer full."
10.16 Consider the modified interrupt handler of Exercise 10.15, used in conjunction with a program that removes characters from the buffer. Can you think of any problem that might prevent the interrupt handler that is adding characters to the buffer and the program that is removing characters from the buffer from working correctly together?
10.17 Describe, in your own words, how the Multiply step of the OpMult algorithm in Figure 10.14 works. How many instructions are executed to perform the Multiply step? Express your answer in terms of n, the value of the multiplier. (Note: If an instruction executes five times, it contributes 5 to the total count.) Write a program fragment that performs the Multiply step in fewer instructions if the value of the multiplier is less than 25. How many?
10.18 Correct Figure 10.16 so that it will add two single-digit positive integers and produce a single-digit positive sum. Assume that the two digits being added do in fact produce a single-digit sum.
10.19 Modify Figure 10.16, assuming that the input numbers are one-digit positive hex numbers. Assume that the two hex digits being added together do in fact produce a single hex-digit sum.
10.20 Figure 10.19 provides an algorithm for converting ASCII strings to binary values. Suppose the decimal number is arbitrarily long. Rather than store a table of 10 values for the thousands-place digit, another table of 10 values for the ten-thousands-place digit, and so on, design an algorithm to do the conversion without resorting to any tables whatsoever.
10.21 The code in Figure 10.19 converts a decimal number represented as ASCII digits into binary. Extend this code to also convert a hexadecimal number represented in ASCII into binary. If the number is preceded by an x, then the subsequent ASCII digits (three at most) represent a hex number; otherwise it is decimal.
10.22 The algorithm of Figure 10.20 always produces a string of four characters independent of the sign and magnitude of the integer being converted. Devise an algorithm that eliminates unnecessary characters in common representations, that is, an algorithm that stores neither leading 0s nor a leading + sign.
10.23 What does the following LC-3 program do?

            .ORIG   X3000
            LEA     R6, STACKBASE
            LEA     R0, PROMPT
            TRAP    x22             ; PUTS
            AND     R1, R1, #0
LOOP        TRAP    x20             ; IN
            TRAP    x21
            ADD     R3, R0, #-10    ; Check for newline
            BRz     INPUTDONE
            JSR     PUSH
            ADD     R1, R1, #1
            BRnzp   LOOP
INPUTDONE   ADD     R1, R1, #0
            BRz     DONE
LOOP2       JSR     POP
            TRAP    x21
            ADD     R1, R1, #-1
            BRp     LOOP2
DONE        TRAP    x25             ; HALT

PUSH        ADD     R6, R6, #-2
            STR     R0, R6, #0
            RET
POP         LDR     R0, R6, #0
            ADD     R6, R6, #2
            RET
PROMPT      .STRINGZ "Please enter a sentence: "
STACKSPAC   .BLKW   #50
STACKBASE   .FILL   #0
            .END
10.24 Suppose the keyboard interrupt vector is x34 and the keyboard interrupt service routine starts at location x1000. What can you infer about the contents of any memory location from the above statement?
chapter 11
Introduction to Programming in C
11.1 Our Objective
Congratulations, and welcome to the second half of the book! You just completed an introduction to the basic underlying structure of modern computer systems. With this foundational material solidly in place, you are now well prepared to learn the fundamentals of programming in a high-level programming language.
In the second half of this book, we will discuss high-level programming concepts in the context of the C programming language. At every step, with every new high-level concept, we will be able to make a connection to the lower levels of the computer system. From this perspective, nothing will be mysterious. We approach the computer system from the bottom up in order to reveal that there indeed is no magic going on when the computer executes the programs you write. It is our belief that with this mystery removed, you will comprehend programming concepts more quickly and deeply and in turn become better programmers.
Let's begin with a quick overview of the first half. In the first 10 chapters, we described the LC-3, a simple computer that has all the important characteristics of a more complex, real computer. A basic idea behind the design of the LC-3 (and indeed, behind all modern computers) is that simple elements are systematically interconnected to form more sophisticated devices. MOS transistors are connected to build logic gates. Logic gates are used to build memory and data path elements. Memory and data path elements are interconnected to build the LC-3. This systematic connection of simple elements to create something more sophisticated is an important concept that is pervasive throughout computing, not only in hardware design but also in software design. It is this simple design philosophy that enables us to build computing systems that are, as a whole, very complex.
After describing the hardware of the LC-3, we described how to program it in the 1s and 0s of its native machine language. Having gotten a taste of the error-prone and unnatural process of programming in machine language, we quickly moved to the more user-friendly LC-3 assembly language. We described how to decompose a programming problem systematically into pieces that could be easily coded on the LC-3. We examined how low-level TRAP subroutines perform commonly needed tasks, such as input and output, on behalf of the programmer. The concepts of systematic decomposition and subroutines are important not only for assembly-level programming but also for programming in a high-level language. You will continue to see examples of these concepts many times before the end of the book.
In this half of the book, our primary objectives are to introduce fundamental high-level programming constructs (variables, control structures, functions, arrays, pointers, recursion, simple data structures) and to teach a good problem-solving methodology for attacking programming problems. Our primary vehicle for doing so is the C programming language. It is not our objective to provide a complete coverage of C, but only the portions essential for a novice programmer to gain exposure to the fundamentals of programming and to be able to write fairly sophisticated programs. For the reader curious about aspects of C not covered in the main text, we provide a more complete description of the language in Appendix D.
In this chapter, we make the transition from programming in low-level assembly language to high-level language programming in C. We'll explain why high-level languages came about, why they are important, and how they interact with the lower levels of the computing system. We'll then dive headfirst into C by examining a simple example program. Using this example, we point out some important details that you will need to know in order to start writing your own C code.
11.2 Bridging the Gap
As computing hardware becomes faster and more powerful, software applications become more complex and sophisticated. New generations of computer systems spawn new generations of software that can do more powerful things than previous generations. As the software gets more sophisticated, the job of developing it becomes more difficult. To keep the programmer from being quickly overwhelmed, it is critical that the process of programming be kept as simple as possible. Automating any part of this process (i.e., having the computer do part of the work) is a welcome enhancement.
As we made the transition from LC-3 machine language in Chapters 5 and 6 to LC-3 assembly language in Chapter 7, you no doubt noticed and appreciated how assembly language simplified programming the LC-3. The 1s and 0s became mnemonics, and memory addresses became symbolic labels. Both instructions and memory addresses took on a form more comfortable for the human than for the machine. The assembler filled some of the gap between the algorithm level and the ISA level in the levels of transformation (see Figure 1.6). It would be desirable for the language level to fill more of that gap. High-level languages do just that. They help make the job of programming easier. Let's look at some ways in which they help.
• High-level languages allow us to give symbolic names to values. When programming in machine language, if we want to keep track of the iteration count of a loop, we need to set aside a memory location or a register in which to store the counter value. To access the counter, we need to remember the spot where we last stored it. The process is easier in assembly language because we can assign a meaningful label to the counter's memory location. In a higher-level language such as C, the programmer simply assigns the value a name (and, as we will see later, provides a type) and the programming language takes care of allocating storage for it and performing the appropriate data movement operations whenever the programmer refers to it. Since most programs contain many values, having such a convenient way to handle values is a critically useful enhancement.
• High-level languages provide expressiveness. Most humans are more comfortable describing the interaction of objects in the real world than describing the interaction of objects such as integers, characters, and floating-point numbers in the digital world. Because of their human-friendly orientation, high-level languages enable the programmer to be more expressive. In a high-level language, the programmer can express complex tasks with a smaller amount of code, with the code itself looking more like a human language. For example, if we wanted to calculate the area of a triangle, we could simply write:
area = 0.5 * base * height;
Another example: we often write code to test a condition and do something if the condition is true or do something else if the condition is false. In high-level languages, such common tasks can be simply stated in an English-like form. For example, if we want to get(Umbrella) if the condition isitCloudy is true, otherwise get(Sunglasses) if it is false, then in C we can use the following C control structure:
if (isitCloudy)
    get(Umbrella);
else
    get(Sunglasses);
• High-level languages provide an abstraction of the underlying hardware. In other words, high-level languages provide a uniform interface independent of underlying ISA or hardware. For example, often a programmer will want to do an operation that is not naturally supported by the instruction set. In the LC-3, there is no one instruction that performs an integer multiplication. Instead, an LC-3 assembly language programmer must write a small piece of code to perform multiplication. The set of operations supported by a high-level language is usually larger than the set supported by the ISA. The language will generate the necessary code to carry out the operation whenever the programmer uses it. The programmer can concentrate on the actual programming task knowing that these high-level operations will be performed correctly and without having to deal with the low-level implementation.
• High-level languages enhance code readability. Since common control structures are expressed using simple, English-like statements, the program itself becomes easier to read. One can look at a program in a high-level language, notice loops and decision constructs, and understand the code with less effort than with a program written in assembly language. As you will no doubt discover if you have not already, the readability of code is very important in programming. Often as programmers, we are given the task of debugging or building upon someone else's code. If the organization of the language is human-friendly to begin with, then understanding code in that language is a much simpler task.
• Many high-level languages provide safeguards against bugs. By making the programmer adhere to a strict set of rules, the language can make checks as the program is translated or as it is executed. If certain rules or conditions are violated, an error message will direct the programmer to the spot in the code where the bug is likely to exist. In this manner, the language helps the programmer to get his/her program working more quickly.
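To make these points concrete, here is a minimal sketch in C of three of the ideas above. It is only an illustration: the function names triangle_area, what_to_get, and multiply_by_adding are our own, and the strings returned by what_to_get stand in for get(Umbrella) and get(Sunglasses), which the text does not define.

```c
/* Area of a triangle, written the way the text suggests. The compiler,
   not the programmer, decides where the value named area is stored. */
double triangle_area(double base, double height)
{
    double area;

    area = 0.5 * base * height;
    return area;
}

/* The if-else from the text. The returned strings are placeholders for
   get(Umbrella) and get(Sunglasses), which are not defined here. */
const char *what_to_get(int isitCloudy)
{
    if (isitCloudy)
        return "Umbrella";
    else
        return "Sunglasses";
}

/* Multiplication without a multiply instruction: repeated addition,
   roughly what an LC-3 programmer must write by hand.
   Assumes count is not negative. */
int multiply_by_adding(int value, int count)
{
    int product = 0;

    while (count > 0) {
        product = product + value;
        count = count - 1;
    }
    return product;
}
```

In triangle_area and what_to_get, the programmer names values and states the task directly. The loop in multiply_by_adding is the kind of code an LC-3 programmer would have to spell out; in C, the programmer simply writes value * count and leaves the details to the compiler.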
11.3 Translating High-Level Language Programs
Just as LC-3 assembly language programs need to be translated (or more specifically, assembled) into machine language, so must all programs written in high-level languages. After all, the underlying hardware can only execute machine code. How this translation is done depends on the particular high-level language. One translation technique is called interpretation. With interpretation, a translation program called an interpreter reads in the high-level language program and performs the operations indicated by the programmer. The high-level language program does not directly execute but rather is executed by the interpreter program. The other technique is called compilation, and the translator, called a compiler, completely translates the high-level language program into machine language. The output of the compiler is called the executable image, and it can directly execute on the hardware. Keep in mind that both interpreters and compilers are themselves programs running on the computer system.
11.3.1 Interpretation
With interpretation, a high-level language program is a set of commands for the interpreter program. The interpreter reads in the commands and carries them out as defined by the language. The high-level language program is not directly executed by the hardware but is in fact just input data for the interpreter. The interpreter is a virtual machine that executes the program. Many interpreters translate the high-level language program section by section, one line, command, or subroutine at a time.
For example, the interpreter might read a single line of the high-level language program and directly carry out the effects of that line on the underlying hardware. If the line said, "Take the square root of B and store it into C," the interpreter will carry out the square root by issuing the correct stream of instructions in the ISA of the computer to perform square root. Once the current line is processed, the interpreter moves on to the next line and executes it. This process continues until the entire high-level language program is done.
High-level languages that are often interpreted include LISP, BASIC, and Perl. Special-purpose languages tend to be interpreted, such as the math language called Matlab. The LC-3 simulator is also an interpreter. Other examples include the UNIX command shell.
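The idea that a program is just input data for the interpreter can be sketched in a few lines of C. The three-command language below (ADD, SUB, and MUL applied to a single running value) is made up for this illustration, as are the names interpret_line and interpret_program; a real interpreter for LISP or BASIC is far more elaborate.

```c
#include <stdio.h>
#include <string.h>

/* Carries out one line of the made-up language and returns the new
   running value. A line that is not a recognized command followed by
   an integer leaves the value unchanged. */
int interpret_line(const char *line, int value)
{
    char command[8];
    int operand;

    if (sscanf(line, "%7s %d", command, &operand) != 2)
        return value;
    if (strcmp(command, "ADD") == 0)
        return value + operand;
    if (strcmp(command, "SUB") == 0)
        return value - operand;
    if (strcmp(command, "MUL") == 0)
        return value * operand;
    return value;
}

/* Interprets a whole program, one line at a time, starting from 0.
   The program itself is nothing more than an array of strings. */
int interpret_program(const char *program[], int line_count)
{
    int value = 0;
    int i;

    for (i = 0; i < line_count; i++)
        value = interpret_line(program[i], value);
    return value;
}
```

Given the three lines "ADD 5", "MUL 4", and "SUB 2", interpret_program returns 18. Notice that the hardware never executes those lines directly; it executes the interpreter, which reads each line and carries out its effect.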
11.3.2 Compilation
With compilation, on the other hand, a high-level language program is translated into machine code that can be directly executed on the hardware. To do this effectively, the compiler must analyze the source program as a larger unit (usually, the entire source file) before producing the translation. A program need only be compiled once and can be executed many times. Many programming languages, including C, C++, and FORTRAN, are typically compiled. The LC-3 assembler is an example of a rudimentary compiler. A compiler processes the file (or files) containing the high-level language program and produces an executable image. The compiler does not execute the program (though some sophisticated compilers do execute the program in order to better optimize its performance), but rather only transforms it from the high-level language into the computer's native machine language.
11.3.3 Pros and Cons
There are advantages and disadvantages with either translation technique. With interpretation, developing and debugging a program is usually easier. Interpreters often permit the execution of a program one section (single line, for example) at a time. This allows the programmer to examine intermediate results and make code modifications on-the-fly. Often the debugging is easier with interpretation. Interpreted code is more easily portable across different computing systems. However, with interpretation, programs take longer to execute because there is an intermediary, the interpreter, which is actually doing the work. With the compiler's assistance, the programmer can produce code that executes more quickly and uses memory more efficiently. Since compilation produces more efficient code, most commercially produced software tends to be programmed in compiled languages.
11.4 The C Programming Language
The C programming language was developed in 1972 by Dennis Ritchie at Bell Laboratories. C was developed for use in writing compilers and operating systems, and for this reason the language has a low-level bent to it. The language allows the programmer to manipulate data items at a very low level yet still provides the expressiveness and convenience of a high-level language. It is for these reasons that C is very widely used today as more than just a language to develop compilers and system software.
Figure 11.1 A timeline of the development of programming languages. While each new language shares some link to all previous languages, there is a strong relationship between C and both C++ and Java. (The timeline runs from 1955 to 2000 and includes FORTRAN, LISP, COBOL, PL/1, BASIC, Pascal, Prolog, Scheme, Ada, and Perl.)
The C programming language has a special place in the evolution of programming languages. Figure 11.1 provides a timeline of the development of some of the more significant programming languages. Starting with the introduction of the first high-level programming language FORTRAN in 1954, each subsequent language was an attempt to fix the problems with its predecessors. While it is somewhat difficult to completely track the "parents" of a language (in fact, one can only surely say that all previous languages have some influence on a particular language), it is fairly clear that C had a direct influence on C++ and Java, both of which are among the more significant languages today. C++ and Java were also influenced by Simula and its predecessors. The object-oriented features of C++ and Java come from these languages. Almost all of the aspects of the C programming language that we discuss in this textbook would be the same if we were programming in C++ or Java. Once you've understood the concepts in this half of the textbook, both C++ and Java will also be easier to master because of their similarity to C.
Because of its low-level approach and because of its root influence on other current major languages, C is the language of choice for our bottom-up exploration of computing systems. C allows us to make clearer connections to the underlying levels in our discussions of basic high-level programming concepts. Learning more advanced concepts, such as object-oriented programming, is a shorter leap forward once these more fundamental, basic concepts are understood.
All of the examples and specific details of C presented in this text are based on a standard version of C called ANSI C. As with many programming languages, several variants of C have been introduced throughout the years. In 1989, the American National Standards Institute (ANSI) approved "an unambiguous and machine-independent definition of the language C" in order to standardize the popular language. This version is referred to as ANSI C. ANSI C is supported by most C compilers. In order to compile and try out the sample code in this textbook, having access to an ANSI-compliant C compiler will be essential.
11.4.1 The C Compiler
The C compiler is the typical mode of translation from a C source program to an executable image. Recall from Section 7.4.1 that an executable image is a machine language representation of a program that is ready to be loaded into memory and executed. The entire compilation process involves the preprocessor, the compiler itself, and the linker. Often, the entire mechanism is casually referred to as the compiler, because when we use the C compiler, the preprocessor and the linker are often automatically invoked. Figure 11.2 shows how the compilation process is handled by these components.
Figure 11.2 The dotted box indicates the overall compilation process: the preprocessor, the compiler, and the linker. The entire process is called compilation even though the compiler is only one part of it. The inputs are C source and header files and various object files. The output is an executable image.
(The figure shows this flow: C source and header files enter the C preprocessor, which produces preprocessed source code. The compiler, through source code analysis, a symbol table, and target code synthesis, turns that into an object module. The linker combines the object module with other object files and library object files to produce the executable image.)
The Preprocessor
As its name implies, the C preprocessor “preprocesses” the C program before handing it off to the compiler. The C preprocessor scans through the source files (the source files contain the actual C program) looking for and acting upon C preprocessor directives. These directives are similar to pseudo-ops in LC-3 assembly language. They instruct the preprocessor to transform the C source file in some controlled manner. For example, we can direct the preprocessor to substitute the character string DAYS_THIS_MONTH with the string 30 or direct it to insert the contents of file stdio.h into the source file at the current line. We’ll discuss why both of these actions are useful in the subsequent chapters.
All preprocessor directives begin with a pound sign, #, as the first character. All useful C programs rely on the preprocessor in some way.
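As a small sketch of both kinds of directive, the code below uses #include to bring in stdio.h and #define to give the value 30 the name DAYS_THIS_MONTH. The functions days_remaining and print_days_remaining are hypothetical and exist only to show the substituted name in use.

```c
#include <stdio.h>          /* insert the contents of stdio.h at this line */

#define DAYS_THIS_MONTH 30  /* each later DAYS_THIS_MONTH is replaced by 30 */

/* Hypothetical helper: the number of days left after the given day. */
int days_remaining(int today)
{
    return DAYS_THIS_MONTH - today;
}

/* Hypothetical helper: displays the result using printf, which is
   declared in stdio.h. */
void print_days_remaining(int today)
{
    printf("%d days remain this month\n", days_remaining(today));
}
```

By the time the compiler sees days_remaining, the preprocessor has already rewritten its body as return 30 - today. Changing the month length then means editing one line, the #define, rather than every place the value is used.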
The Compiler
After the preprocessor transforms the input source file, the program is ready to be handed over to the compiler. The compiler transforms the preprocessed program into an object module. Recall from Section 7.4.2 that an object module is the machine code for one section of the entire program. There are two major phases of compilation: analysis, in which the source program is broken down or parsed into its constituent parts, and synthesis, in which a machine code version of the program is generated. It is the job of the analysis phase to read in, parse, and build an internal representation of the original program. The synthesis phase generates machine code and, if directed, attempts to optimize this code to execute more quickly and efficiently on the computer on which it will be run. Each of these two phases is typically divided into subphases where specific tasks, such as parsing, register allocation, or instruction scheduling, are accomplished. Some compilers generate assembly code and use an assembler to complete the translation to machine code.
One of the most important internal bookkeeping mechanisms the compiler uses in translating a program is the symbol table. A symbol table is the compiler’s internal bookkeeping method for keeping track of all the symbolic names the programmer has used in the program. The C compiler’s symbol table is very similar to the symbol table maintained by the LC-3 assembler (see Section 7.3.3). We’ll examine the C compiler’s symbol table in more detail in the next chapter.
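To give a rough feel for this bookkeeping, the sketch below keeps a small table that pairs each symbolic name with a location. It is a simplified illustration under our own assumptions (a fixed-size array, a linear search, and the names symbol_table, add_symbol, and find_symbol); it is not a description of how any particular C compiler organizes its symbol table.

```c
#include <string.h>

#define MAX_SYMBOLS 16

/* One entry: a name the programmer used and where its value is kept. */
struct symbol {
    const char *name;
    int location;
};

/* The table itself: the entries recorded so far and how many there are. */
struct symbol_table {
    struct symbol entries[MAX_SYMBOLS];
    int count;
};

/* Records a name and its location. Returns 0 on success, or -1 if the
   table is already full. */
int add_symbol(struct symbol_table *table, const char *name, int location)
{
    if (table->count >= MAX_SYMBOLS)
        return -1;
    table->entries[table->count].name = name;
    table->entries[table->count].location = location;
    table->count++;
    return 0;
}

/* Looks up a name. Returns its location, or -1 if the name was never
   recorded. */
int find_symbol(const struct symbol_table *table, const char *name)
{
    int i;

    for (i = 0; i < table->count; i++)
        if (strcmp(table->entries[i].name, name) == 0)
            return table->entries[i].location;
    return -1;
}
```

A translator built this way would call add_symbol each time it meets a new name and find_symbol each time that name is used again, much as the LC-3 assembler consults its symbol table to turn a label into an address.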
The Linker
The linker takes over after the compiler has translated the source file into object code. It is the linker’s job to link together all object modules to form an executable image of the program. The executable image is a version of the program that can be loaded into memory and executed by the underlying hardware. When you click on the icon for the web browser on your PC, for example, you are instructing the operating system to read the web browser’s executable image from your hard drive, load it into memory, and start executing it.
Often, C programs rely upon library routines. Library routines perform common and useful tasks (such as I/O) and are prepared for general use by the developers of the system software (the operating system and compiler, for example). If a program uses a library routine, then the linker will find the object code corresponding to the routine and link it within the final executable image. This process of linking in library objects should not be new to you; we described the process in Section 9.2.5 in the context of the LC-3. Usually, library objects are stored in a particular place depending on the computer system. In UNIX, for example, many common library objects can be found in the directory /usr/lib.
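For instance, the short sketch below calls strlen, a routine from the standard C library that returns the length of a character string. The helper name count_characters is our own and is used only for illustration.

```c
#include <string.h>   /* declares strlen; its object code lives in the C library */

/* Hypothetical helper: counts the characters in text by calling the
   library routine strlen. */
int count_characters(const char *text)
{
    return (int) strlen(text);
}
```

Nothing in this source file says how strlen does its work. The header string.h supplies only the declaration; the linker finds the routine's object code in the library and links it into the executable image.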
11.5 A Simple Example
We are now ready to start discussing programming concepts in the C programming language. Many of the new C concepts we present will be coupled with LC-3 code generated by a “hypothetical” LC-3 C compiler. In some cases, we will describe what actually happens when this code is executed. Keep in mind that you are not likely to be using an LC-3-based computer but rather one based on a real ISA such as the x86. For example, if you are using a Windows-based PC, then it is likely that your compiler will generate x86 code, not LC-3 code.
Many of the examples we provide are complete programs that you can compile and execute. For the sake of clearer illustration, some of the examples we provide are not quite complete programs and need to be completed before they can be compiled. In order to keep things straight, we’ll refer to these partial code examples as code segments.
Let’s begin by diving headfirst into a simple C example. Figure 11.3 shows its source code. We will use this example to jump-start the process of learning C by pointing out some important aspects of a typical C program. The example is a simple one: It prompts the user to type in a number and then counts down from that number to 0.
You are encouraged to compile and execute this program. At this point, it is not important to completely understand the purpose of each line. There are, however, several aspects of this example that will help you with writing your own C code and with comprehending the subsequent examples in the text. We’ll focus on four such aspects: the function main, the code’s comments and programming style, preprocessor directives, and the I/O function calls.
11.5.1 The Function main
The function main begins at the line containing int main() (line 17) and ends at the closing brace on the last line of the code. These lines of the source code constitute a function definition for the function named main. What were called subroutines in LC-3 assembly language programming (discussed in Chapter 9) are referred to as functions in C. Functions are a very important part of C, and we will devote all of Chapter 14 to them. In C, the function main serves a special purpose: It is where execution of the program begins. Every C program, therefore, requires a function main. Note that in ANSI C, main must be declared to return an integer value. That is, main must be of type int; thus line 17 of the code is int main().

 1  /*
 2   *
 3   *  Program Name : countdown, our first C program
 4   *
 5   *  Description  : This program prompts the user to type in
 6   *  a positive number and counts down from that number to 0,
 7   *  displaying each number along the way.
 8   *
 9   */
10
11  /* The next two lines are preprocessor directives */
12  #include <stdio.h>
13  #define STOP 0
14
15  /* Function    : main                                     */
16  /* Description : prompt for input, then display countdown */
17  int main()
18  {
19    /* Variable declarations */
20    int counter;       /* Holds intermediate count values */
21    int startPoint;    /* Starting point for count down   */
22
23    /* Prompt the user for input */
24    printf("===== Countdown Program =====\n");
25    printf("Enter a positive integer: ");
26    scanf("%d", &startPoint);
27
28    /* Count down from the input number to 0 */
29    for (counter = startPoint; counter >= STOP; counter--)
30      printf("%d\n", counter);
31  }

Figure 11.3 A program that prompts the user for a decimal integer and counts down from that number to 0

In this example, the code for function main (i.e., the code in between the curly braces) can be broken down into two components. The first component contains the variable declarations for the function. Two variables, one called counter and the other startPoint, are created for use within the function main. Variables are a very useful feature provided by high-level programming languages. They give us a way to symbolically name the values within a program.
The second component contains the statements of the function. These statements express the actions that will be performed when the function is executed. For all C programs, execution starts in main and progresses, statement by statement, until the last statement in main is completed.
In this example, the first grouping of statements (lines 24-26) displays a message and prompts the user to input an integer number. Once the user enters a number, the program enters the last statement, which is a for loop (a type of iteration construct that we will discuss in Chapter 13). The loop counts downward from the number typed by the user to 0. For example, if the user entered the number 5, the program’s output would look as follows:
===== Countdown Program =====
Enter a positive integer: 5
5
4
3
2
1
0
Notice in this example that many lines of the source code are terminated by semicolons, ; . In C, semicolons are used to terminate declarations and statements; they are necessary for the compiler to break the program down unambiguously into its constituents.
11.5.2 Formatting, Comments, and Style
C is a free-format language. That is, the amount of spacing between words and between lines within a program does not change the meaning of the program. The programmer is free to structure the program in whatever manner he or she sees fit while obeying the syntactic rules of C. Programmers use this freedom to format the code in a manner that makes it easier to read. In the example program, notice that the for loop is indented in such a manner that the statement being iterated is easier to identify. Also in the example, notice the use of blank lines to separate different regions of code in the function main. These blank lines are not necessary but are used to provide visual separation of the code. Often, statements that together accomplish a larger task are grouped together into a visually identifiable unit. The C code examples throughout this book use a conventional indentation style typical for C. Styles vary. Programmers sometimes use style as a means of expression. Feel free to define your own style, keeping in mind that the objective is to help convey the meaning of the program through its formatting.
Comments in C are different from those in LC-3 assembly language. Comments in C begin with /* and end with */. They can span multiple lines. Notice that this example program contains several lines of comments, some on a single line, some spanning multiple lines. Comments are expressed differently from one programming language to another. For example, comments in C++ can also begin with the sequence // and extend to the end of the line. Regardless of how comments are expressed, the purpose is always the same: They provide a way for programmers to describe in human terms what their code does.
Proper commenting of code is an important part of the programming process. Good comments enhance code readability, allowing someone not familiar with the code to understand it more quickly. Since programming tasks often involve working in teams, code very often gets shared or borrowed between programmers. In order to work effectively on a programming team, or to write code that is worth sharing, you must adopt a good commenting style early on.
One aspect of good commenting style is to provide information at the beginning of each source file that describes the code contained within it, the date it was last modified, and by whom. Furthermore, each function (see function main in the example) should have a brief description of what the function accomplishes, along with a description of its inputs and outputs. Also, comments are usually interspersed within the code to explain the intent of the various sections of the code. But overcommenting can be detrimental as it can clutter up your code, making it harder to read. In particular, watch out for comments that provide no additional information beyond what is obvious from the code.
11.5.3 The C Preprocessor
We briefly mentioned the C preprocessor in Section 11.4.1. Recall that it transforms the original C program before it is handed off to the compiler. Our simple example contains two commonly used preprocessor directives: #define and #include. The C examples in this book rely only on these two directives.
The #define directive is a simple yet powerful directive that instructs the C preprocessor to replace occurrences of any text that matches X with text Y. That is, the macro X gets substituted with Y. In the example, the #define causes the text STOP to be substituted with the text 0. So the following source line
for (counter = startPoint; counter >= STOP; counter--)
is transformed (internally, only between the preprocessor and compiler) into
for (counter = startPoint; counter >= 0; counter--)
Why is this helpful? Often, the #define directive is used to create fixed values within a program. Following are several examples.
#define NUMBER_OF_STUDENTS 25
#define MAX_LENGTH 80
#define LENGTH_OF_GAME 300
#define PRICE_OF_FUEL 1.49
#define COLOR_OF_EYES brown
So, for example, we can symbolically refer to the price of fuel as PRICE_OF_FUEL. If the price of fuel were to change, we would simply modify the definition of the macro PRICE_OF_FUEL and the preprocessor would handle the actual substitution for us. This can be very convenient: if the cost of fuel were used heavily within a program, we would only need to modify one line in the source code to change the price throughout the code. Notice that the last example is slightly different from the others. In this example, one string of characters, COLOR_OF_EYES, is being substituted for another, brown. The common programming style is to use uppercase for the macro name.
The #include directive instructs the preprocessor literally to insert another file into the source file. Essentially, the #include directive itself is replaced by the contents of another file. At this point, the usefulness of this command may not be completely apparent to you, but as we progress deeper into the C language, you will understand how C header files can be used to hold #defines and declarations that are useful among multiple source files.
For instance, all programs that use the C I/O functions must include the I/O library’s header file stdio.h. This file defines some relevant information about the I/O functions in the C library. The preprocessor directive #include <stdio.h> is used to insert the header file before compilation begins.
There are two variations of the #include directive:
#include <stdio.h>
#include "program.h"
The first variation uses angle brackets (< >) around the filename. This tells the preprocessor that the header file can be found in a predefined directory. This is usually determined by the configuration of the system and contains many system-related and library-related header files, such as stdio.h. Often we want to include header files we have created ourselves for the particular program we are writing. The second variation, using double quotes (" ") around the filename, instructs the preprocessor that the header file can be found in the same directory as the C source file.
Notice that none of the preprocessor macros ends with a semicolon. Since #define and #include are preprocessor directives and not C statements, they are not required to be terminated by semicolons.
11.5.4 Input and Output
We close this chapter by pointing out how to perform input and output from within a C program. We describe these functions at a high level now and save the details for Chapter 18, when we have introduced enough background material to understand C I/O down to a low level. Since all useful programs perform some form of I/O, learning the I/O capabilities of C is an important first step. In C, I/O is performed by library functions, similar to the IN and OUT trap routines provided by the LC-3 system software.
Three lines of the example program perform output using the C library function printf or print formatted (refer to lines 24, 25, and 30). The function printf performs output to the standard output device, which is typically the monitor. It requires a format string in which we provide two things: (1) text to print out and (2) specifications on how to print out values within that text. For example, the statement
printf("43 is a prime number.");
prints out the following text to the output device.
43 is a prime number.
In addition to text, it is often useful to print out values generated within a program. Specifications within the format string indicate how we want these values to be printed out. Let’s examine a few examples.
printf("%d is a prime number.", 43);
This first example contains the format specification %d in its format string. It causes the value listed after the format string to be embedded in the output as a decimal number in place of the %d. The resulting output would be
43 is a prime number.
The following examples show other variants of printf.
printf("43 plus 59 in decimal is %d.", 43 + 59);
printf("43 plus 59 in hexadecimal is %x.", 43 + 59);
printf("43 plus 59 as a character is %c.", 43 + 59);
In the first printf, the format specification causes the value 102 to be embedded in the text because the result of “43 + 59” is printed as a decimal number. In the next example, the format specification %x causes 66 (because 102 equals x66) to be embedded in the text. Similarly, in the third example, the format specification of %c displays the value interpreted as an ASCII character which, in this case, would be lowercase f. The output of this statement would be
43 plus 59 as a character is f.
What is important to notice is that the binary pattern being supplied to printf after the format string is the same for all three statements. Here, printf interprets the binary pattern 0110 0110 (decimal 102) first as a decimal number, then as a hexadecimal number, and finally as an ASCII character. The C output function printf converts the bit pattern into the proper sequence of ASCII characters based on the format specifications we provide it. Table D.6 contains a list of all the format specifications that can be used with printf. All format specifications begin with the percent sign, %.
The final example demonstrates a very common and powerful use of printf.
printf("The wind speed is %d km/hr.", windSpeed);
Here, a value generated during the execution of the program, in this case the variable windSpeed, is output as a decimal number. The value displayed depends on the value of windSpeed when this line of code is executed. So if windSpeed equals 2 when the statement containing printf is executed, the following output would result:
The wind speed is 2 km/hr.
If you were to execute a program containing the five preceding printf statements in these examples, you would notice that they would all be displayed on one single line without any line breaks. If we want line breaks to appear, we must put them explicitly within the format string in the places we want them to occur. New lines, tabs, and other special characters require the use of a special backslash (\) sequence. For example, to print a new line character (and thus cause a line break), we use the special sequence \n. We can rewrite the preceding printf statements as such:
printf("%d is a prime number.\n", 43);
printf("43 plus 59 in decimal is %d.\n", 43 + 59);
printf("43 plus 59 in hexadecimal is %x.\n", 43 + 59);
printf("43 plus 59 as a character is %c.\n", 43 + 59);
printf("The wind speed is %d km/hr.\n", windSpeed);
Notice that each format string ends by printing the new line character \n, so each subsequent printf will begin on a new line. Table D.1 contains a list of other special characters that are useful when generating output. The output generated by these five statements would look as follows:
43 is a prime number.
43 plus 59 in decimal is 102.
43 plus 59 in hexadecimal is 66.
43 plus 59 as a character is f.
The wind speed is 2 km/hr.
In our sample program in Figure 11.3, printf appears three times in the source. The first two versions display only text and no values (thus, they have no format specifications). The third version prints out the value of variable counter. Generally speaking, we can display as many values as we like within a single printf. The number of format specifications (for example, %d) must equal the number of values that follow the format string.
Question: What happens if we replace the third printf in the example program with the following? The expression “startPoint - counter” calculates the value of startPoint minus the value of counter.
printf("%d %d\n", counter, startPoint - counter);
Having dealt with output, we now turn to the corresponding input function scanf. The function scanf performs input from the standard input device, which is typically the keyboard. It requires a format string (similar to the one required by printf) and a list of variables into which the values retrieved from the keyboard should be stored. The function scanf reads input from the keyboard and, according to the conversion characters in the format string, converts the input and assigns the converted values to the variables listed. Let’s look at an example.
In the example program in Figure 11.3, we use scanf to read in a single decimal number using the format specification %d. Recall from our discussion on LC-3 keyboard input that the value received via the keyboard is in ASCII. The format specification %d informs scanf to expect a sequence of numeric ASCII keystrokes (i.e., the digits 0 to 9). This sequence is interpreted as a decimal number and converted into an integer. The resulting binary pattern will be stored in the variable called startPoint. The function scanf automatically performs type conversions (in this case, from ASCII to integer) for us! The format specification %d is one of several that can be used with scanf. Table D.5 lists them all. There are specifications to read in a single character, a floating point value, an integer expressed as a hexadecimal value, and so forth.
A very important thing to remember about scanf is that variables that are being modified by the scanf function (for example, startPoint) must be preceded by an & character. This may seem a bit mysterious, but we will discuss the reason for this strange notation in Chapter 16.
Following are several more examples of scanf.
/* Reads in a character and stores it in nextChar */
scanf("%c", &nextChar);

/* Reads in a floating point number into radius */
scanf("%f", &radius);

/* Reads two decimal numbers into length and width */
scanf("%d %d", &length, &width);
11.6 Summary
In this chapter, we have introduced some key characteristics of high-level programming languages and provided an initial exposure to the C programming language. We conclude this chapter with a listing of the major topics we’ve covered.
• High-Level Programming Languages. High-level languages aim to make the programming process easier by connecting real-world objects with the low-level concepts, such as bits and operations on bits, that a computer natively deals with. Because computers can only execute machine code, programs in high-level languages must be translated using the process of compilation or interpretation into machine code.
• The C Programming Language. The C programming language is an ideal language for a bottom-up exposure to computing because of its low-level nature and because of its root influence on current popular programming languages. The C compilation process involves a preprocessor, a compiler, and a linker.
• Our First C Program. We provided a very simple program to illustrate several basic features of C programs. Comments, indentation, and style can help convey the meaning of a program to someone trying to understand the code. Many C programs use the preprocessor macros #define and #include. The execution of a C program begins at the function main, which itself consists of variable declarations and statements. Finally, I/O in C can be accomplished using the library functions printf and scanf.
11.1 Describe some problems or inconveniences you found when programming in lower-level languages.
11.2 How do higher-level languages help reduce the tedium of programming in lower-level languages?
11.3 What are some disadvantages to programming in a higher-level language?
11.4 Compare and contrast the execution process of an interpreter versus the execution process of a compiled binary. What implication does interpretation have on performance?
11.5 A language is portable if its code can run on different computer systems, say with different ISAs. What makes interpreted languages more portable than compiled languages?
11.6 The UNIX command line shell is an interpreter. Why can’t it be a compiler?
11.7 Is the LC-3 simulator a compiler or an interpreter?
11.8 Another advantage of compilation over interpretation is that a compiler can optimize code more thoroughly. Since a compiler can examine the entire program when generating machine code, it can reduce the amount of computation by analyzing what the program is trying to do.
The following algorithm performs some very straightforward arithmetic based on values typed at the keyboard. It outputs a single result.
1. Get W from the keyboard
2. X <- W + W
3. Y <- X + X
4. Z <- Y + Y
5. Print Z to the screen
a. An interpreter would execute the program statement by statement. In total, five statements would execute. At least how many arithmetic operations would the interpreter perform on behalf of this program? State what the operations would be.
b. A compiler would analyze the entire program before generating machine code, and possibly optimize the code. If the underlying ISA were capable of all arithmetic operations (i.e., addition, subtraction, multiplication, division), at least how many operations would be needed to carry out this program? State what the operations would be.
11.9 For this question refer to Figure 11.2.
a. Describe the input to the C preprocessor.
b. Describe the input to the C compiler.
c. Describe the input to the linker.
11.10 What happens if we changed the second-to-last line of the program in Figure 11.3 from printf("%d\n", counter); to:
a. printf("%c\n", counter + 'A');
b. printf("%d\n%d\n", counter, startPoint + counter);
c. printf("%x\n", counter);
11.11 The function scanf reads in a character from the keyboard and the function printf prints it out. What do the following two statements accomplish?
scanf("%c", &nextChar);
printf("%d\n", nextChar);
11.12 The following lines of C code appear in a program. What will be the output of each printf statement?
#define LETTER '1'
#define ZERO 0
#define NUMBER 123

printf("%c", 'a');
printf("x%x", 12288);
printf("$%d.%c%d\n", NUMBER, LETTER, ZERO);
11.13 Describe a program (at this point we do not expect you to be able to write working C code) that reads a decimal number from the keyboard and prints out its hexadecimal equivalent.
chapter 12
Variables and Operators
12.1 Introduction
In this chapter, we cover two basic concepts of high-level language programming, variables and operators. Variables hold the values upon which a program acts, and operators are the language mechanisms for manipulating these values. Variables and operators together allow the programmer to more easily express the work that a program is to carry out.
The following line of C code is a statement that involves both variables and operators. In this statement, the addition operator + is used to add 3 to the original value of the variable score. This new value is then assigned using the assignment operator = back to score. If score was equal to 7 before this statement was executed, it would equal 10 afterwards.
score = score + 3;
In the first part of this chapter, we’ll take a closer look at variables in the C programming language. Variables in C are straightforward: the three most basic flavors are integers, characters, and floating point numbers. After variables, we’ll cover C’s rich set of operators, providing plenty of examples to help illustrate their operations. One unique feature of our approach is that we can connect both of these high-level concepts back to solid low-level material, and in the third part of the chapter we’ll do just that by discussing the compiler’s point of view when it tries to deal with variables and operators in generating machine code. We close this chapter with some problem solving and some miscellaneous concepts involving variables and operators in C.
12.2 Variables
A value is any data item upon which a program performs an operation. Examples of values include the iteration counter for a loop, an input value entered by a user, or the partial sum of a series of numbers that are being added together. Programmers spend a lot of effort keeping track of these values.
Because values are such an important programming concept, high-level languages try to make the process of managing them easier on the programmer. High-level languages allow the programmer to refer to values symbolically, by a name rather than a memory location. And whenever we want to operate on the value, the language will automatically generate the proper sequence of data movement operations. The programmer can then focus on writing the program and need not worry about where in memory to store a value or about juggling the value between memory and the registers. In high-level languages, these symbolically named values are called variables.
In order to properly track the variables in a program, the high-level language translator (the C compiler, for instance) needs to know several characteristics about each variable. It needs to know, obviously, the symbolic name of the variable. It needs to know what type of information the variable will contain. It needs to know where in the program the variable will be accessible. In most languages, C included, this information is provided by the variable’s declaration.
Let’s look at an example. The following declares a variable called echo that will contain an integer value.
int echo;
Based on this declaration, the compiler reserves an integer’s worth of memory for echo (sometimes, the compiler can optimize the program such that echo is stored in a register and therefore does not require a memory location, but that is a subject for a later course). Whenever echo is referred to in the subsequent C code, the compiler generates the appropriate machine code to access it.
12.2.1 Three Basic Data Types: int, char, double
By now, you should be very familiar with the following concept: the meaning of a particular bit pattern depends on the data type imposed on the pattern. For example, the binary pattern 0110 0110 might represent the lowercase f or it might represent the decimal number 102, depending on whether we treat the pattern as an ASCII data type or as a 2’s complement integer data type. A variable’s declaration informs the compiler about the variable’s type. The compiler uses a variable’s type information to allocate a proper amount of storage for the variable. Also, type indicates how operations on the variable are to be performed at the machine level. For instance, performing an addition on two integer variables can be done on the LC-3 with one ADD instruction. If the two variables were of floating point type, the LC-3 compiler would generate a sequence of instructions to perform the addition because no single LC-3 instruction performs a floating point addition.
C supports three basic data types: integers, characters, and floating point numbers. Variables of these types can be created with the type specifiers int, char, and double (which is short for double-precision floating point).
int
The int type specifier declares a signed integer variable. The internal representation and range of values of an int depend on the ISA of the computer and the specifics of the compiler being used. In the LC-3, for example, an int is a 16-bit 2’s complement integer that can represent numbers between -32,768 and +32,767. On an x86-based computer, an int is likely to be a 32-bit 2’s complement number that can represent numbers between -2,147,483,648 and +2,147,483,647. In most cases, an int is a 2’s complement integer in the word length of the underlying ISA.
The following line of code declares an integer variable called numberOfSeconds. When the compiler sees this declaration, the compiler sets aside enough storage for this variable (in the case of the LC-3, one memory location).
int numberOfSeconds;
It should be no surprise that variables of integer type are frequently used in programs. They often conveniently represent the real-world data we want our programs to process. If we wanted to represent time, say for example in seconds, an integer variable would be perfect. In an application that tracks whale migration, we can use an integer to represent the sizes of pods of gray whales seen off the California coast. Integers are also useful for program control. An integer can be useful as the iteration counter for a counter-controlled loop.
char
The char type specifier declares a variable whose data value represents a character. Following are two examples. The first declaration creates a variable named lock. The second one declares key. The second declaration is slightly different; it also contains an initializer. In C, any variable can be set to an initial value directly in its declaration. In this example, the variable key will have the initial value of the ASCII code for uppercase Q. Also notice that the uppercase Q is surrounded by single quotes (' '). In C, characters that are to be interpreted as ASCII literals are surrounded by single quotes. What about lock? What initial value will it have? We’ll address this issue shortly.
char lock;
char key = 'Q';
Although eight bits are sufficient to hold an ASCII character, for purposes of making the examples in this textbook less cluttered, all char variables will occupy 16 bits. That is, chars, like ints, will each occupy one memory location.
double
The type specifier double allows us to declare variables of the floating point type that we examined in Section 2.7.2. Floating point numbers allow us to conveniently deal with numbers that have fractional components or numbers that are very large or very small. Recall from our previous discussion in Section 2.7.2 that at the lowest level, a floating point number is a bit pattern that has three parts: a sign, a fraction, and an exponent.
Here are three examples of variables of type double:
double costPerLiter;
double electronsPerSecond;
double averageTemp;
As with ints and chars, we can also optionally initialize a floating point number along with its declaration. Before we can completely describe how to initialize floating point variables, we must first discuss how to represent floating point literals in C. Floating point literals are represented containing either a decimal point or an exponent, or both, as demonstrated in the example code that follows. The exponent is signified by the character e or E and can be positive or negative. It represents the power of 10 by which the fractional part (the part that precedes the e or E) is multiplied. Note that the exponent must be an integer value. For more information on floating point literals, see Appendix D.2.4.
double twoPointOne = 2.1;         /* This is 2.1   */
double twoHundredTen = 2.1E2;     /* This is 210.0 */
double twoHundred = 2E2;          /* This is 200.0 */
double twoTenths = 2E-1;          /* This is 0.2   */
double minusTwoTenths = -2E-1;    /* This is -0.2  */
Another floating point type specifier in C is called float. It declares a single-precision floating point variable; double creates one that is double-precision. Recall from our previous discussion on floating point numbers in Chapter 2 that the precision of a floating point number depends on the number of bits of the representation allocated to the fraction. In C, depending on the compiler and the ISA, a double may have more bits allocated for the fraction than a float, but never fewer. The size of the double is dependent upon the ISA and the compiler. Usually, a double is 64 bits long and a float is 32 bits in compliance with the IEEE 754 floating point standard.
12.2.2 Choosing Identifiers
Most high-level languages have flexible rules for the variable names (more generally known as identifiers) that can be chosen within a program. C allows you to create identifiers composed of letters of the alphabet, digits, and the underscore character, _. Only letters and the underscore character, however, can be used to begin an identifier. An identifier can be of any length, but only the first 31 characters are used by the C compiler to differentiate variables; that is, only the first 31 characters matter to the compiler. Also, the use of upper- and lowercase has significance: C will treat Capital and capital as different identifiers.
Here are several tips on standard C naming conventions: Variables beginning with an underscore (e.g., _index_) conventionally are used only in special library code. Variables are almost never declared in all uppercase letters. The convention of all uppercase is used solely for symbolic values created using the preprocessor directive #define. See Section 11.5.3 for examples of symbolic constants. Programmers like to visually partition variables that consist of multiple words. In this book, we use uppercase (e.g., wordsPerSecond). Other programmers prefer underscores (e.g., words_per_second).
Giving variables meaningful names is important for writing good code. Variable names should be chosen to reflect a characteristic of the value they represent, allowing the programmer to more easily recall what the value is used for. For example, a value used to count the number of words the person at the keyboard types per second might be named wordsPerSecond.
There are certain keywords in C that have special meaning and are therefore restricted from being used as identifiers. A list of C keywords can be found in Appendix D.2.6. One keyword we have encountered already is int, and therefore we cannot use int as a variable name. Having a variable named int would not only be confusing to someone trying to read through the code but might also confuse the compiler trying to translate it. The compiler may not be able to determine whether a particular int refers to the variable or to the type specifier.
12.2.3 Scope: Local versus Global
As we mentioned, a variable’s declaration assists the compiler in managing the storage of that variable. In C, a variable’s declaration conveys three pieces of information to the compiler: the variable’s identifier, its type, and its scope. The first two of these, identifier and type, the C compiler gets explicitly from the variable’s declaration. The third piece, scope, the compiler infers from the position of the declaration within the code. The scope of a variable is the region of the program in which the variable is “alive” and accessible.
The good news is that in C, there are only two basic types of scope for a variable. Either the variable is global to the entire program,1 or it is local, or private, to a particular block of code.
Local Variables
In C, all variables must be declared before they can be used. In fact, some variables must be declared at the beginning of the block in which they appear; these are called local variables. In C, a block is any subsection of a program beginning with the open brace character, {, and ending with the closing brace character, }. All local variables must be declared immediately following the block’s open brace.
1 This is a slight simplification because C allows globals to be optionally declared to be global only to a particular source file and not the entire program, but this caveat is not relevant for our discussion here.
The following code is a simple C program that gets a number from the keyboard and redisplays it on the screen. The integer variable echo is declared within the block that contains the code for function main. It is only visible to the function main. If the program contained any other functions besides main, the variable would not be accessible from those other functions. Typically, most local variables are declared at the beginning of the function in which they are used, as for example echo in the code.
#include <stdio.h>

int main()
{
  int echo;

  scanf("%d", &echo);
  printf("%d\n", echo);
}
It is possible, and sometimes useful, to declare two different variables with the same name within different blocks of the same function. For instance, it might be convenient to use the name count for the counter variable for several different loops within the same program. C allows this, as long as the different variables sharing the same name are declared in separate blocks. Figure 12.1, which we discuss in the next section, provides an example of this.
Global Variables
In contrast to local variables, which can only be accessed within the block in which they are declared, global variables can be accessed throughout the program. They retain their storage and values throughout the duration of the program.
#include <stdio.h>

int globalVar = 2;    /* This variable is global */

int main()
{
  int localVar = 3;   /* This variable is local to main */

  printf("Global %d Local %d\n", globalVar, localVar);
  {
    int localVar = 4; /* Local to this sub-block */

    printf("Global %d Local %d\n", globalVar, localVar);
  }
  printf("Global %d Local %d\n", globalVar, localVar);
}

Figure 12.1 A C program that demonstrates nested scope
The following code contains both a global variable and a variable local to the function main:
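#include <stdio.h>
int globalVar = 2;    /* This variable is global */
int main()
{
  int localVar = 3;   /* This variable is local to main */
  printf("Global %d Local %d\n", globalVar, localVar);
}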
Globals can be extremely helpful in certain programming situations, but novice programmers are often instructed to adopt a programming style that uses locals over globals. Because global variables are public and can be modified from anywhere within the code, the heavy use of globals can make your code more vulnerable to bugs and more difficult to reuse and modify. In almost all C code examples in this textbook, we use only local variables.
Let’s look at a slightly more complex example. The C program in Figure 12.1 is similar to the previous program except we have added a sub-block within main. Within this sub-block, we have declared a new variable localVar. It has the same name as the local variable declared at the beginning of main. Execute this program and you will notice that while the sub-block is executing, the prior version of localVar is not visible; that is, the new declaration of a variable of the same name supersedes the previous one. Once the sub-block is done executing, the previous version of localVar becomes visible again. This is an example of what is called nested scope.
Initialization of Variables
Now that we have discussed global and local variables, let’s answer the question we asked earlier: What initial value will a variable have if it has no initializer? In C, by default, local variables start with an unknown value. That is, the storage location a local variable is assigned is not cleared and thus contains whatever last value was stored there. More generally, in C, all variables of the automatic storage class (which includes local variables) are uninitialized. Global variables (and all other static storage class variables) are, in contrast, initialized to 0 when the program starts execution.
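The difference in default initialization can be sketched as follows. The names globalCount, read_global, and safe_local are hypothetical, and the sketch deliberately never reads the local variable before assigning it, since its initial value is unknown.

```c
int globalCount;     /* global with no initializer: starts at 0 */

int read_global(void)
{
  return globalCount;
}

int safe_local(void)
{
  int localCount;    /* local with no initializer: value is unknown */

  localCount = 10;   /* assign before the first read */
  return localCount;
}
```

A common defensive habit that follows from this rule is to give every local variable an explicit initial value before using it.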
12.2.4 More Examples
Let’s examine a couple more examples of variable declarations in C. The following examples demonstrate declarations of the three basic types discussed in this chapter. Some declarations have no initializers; some do. Notice how floating point and character literals are expressed in C.
double width;
double pType = 9.44;
double mass = 6.34E2;
double verySmallAmount = 9.1094E-31;
double veryLargeAmount = 7.334553E102;

int average = 12;
int windChillIndex = -21;
int unknownValue;
int mysteryAmount;

char car = 'A';
char number = '4';
In C, it is also possible to have literals that are hexadecimal values. A literal that has the prefix 0x will be treated as a hexadecimal number. In the following examples, all three integer variables are initialized using hexadecimal literals.
int programCounter = 0x3000;
int sevenBits = 0xA1234;
int valueD = 0xD;
Questions: What happens if we perform a printf("%d\n", valueD); after the declarations? What bit pattern would you expect to find in the memory location associated with valueD?
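One way to explore the first question is simply to run it. The sketch below wraps the declaration and a printf in a function; the function name show_valueD is an assumption made for illustration, and the second format specifier is added only to show the same value in hexadecimal.

```c
#include <stdio.h>

/* Prints valueD in decimal and in hexadecimal, then returns it. */
int show_valueD(void)
{
  int valueD = 0xD;

  printf("%d 0x%X\n", valueD, (unsigned int)valueD);
  return valueD;
}
```

The hexadecimal literal and the decimal output are two notations for one stored bit pattern; the notation used in the source code does not change what is in memory.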
12.3 Operators
Having covered the basics of variables in C, we are now ready to investigate operators. C, like many other high-level languages, supports a rich set of operators that allow the programmer to manipulate variables. Some operators perform arithmetic, some perform logic functions, and others perform comparisons between values. These operators allow the programmer to express a computation in a more natural, convenient, and compact way than by expressing it as a sequence of assembly language instructions.
Given some C code, the compiler’s job is to take the code and convert it into machine code that the underlying hardware can execute. In the case of a C program being compiled for the LC-3, the compiler must translate whatever operations the program might contain into the instructions of the LC-3 instruction set, clearly not an easy task given that the LC-3 has very few operate instructions.
To help illustrate this point, we examine the code generated by a simple C statement in which two integers are multiplied together. In the following code segment, x, y, and z are integer variables where x and y are multiplied and the result assigned to z.
z = x * y;
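Because the LC-3 has very few operate instructions, a multiplication has to be built from simpler operations. One way to do that, sketched here in C rather than in LC-3 assembly, is repeated addition with a sign adjustment. The function name multiply_by_addition is hypothetical, and the sketch uses the if and while constructs, which are covered in the next chapter. It assumes operands small enough that negating them and summing them does not overflow.

```c
/* Multiply x by y using only addition, subtraction, and negation. */
int multiply_by_addition(int x, int y)
{
  int result = 0;

  if (y < 0) {     /* make the loop count positive; negate x to compensate */
    x = -x;
    y = -y;
  }

  while (y > 0) {  /* add x to the running total y times */
    result = result + x;
    y = y - 1;
  }

  return result;
}
```

When y is zero the loop body never runs and the result stays 0, which matches the early exit a hand-written assembly version would take.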
Since there is no single LC-3 instruction to multiply two values, our LC-3 compiler must generate a sequence of code that accomplishes the multiplication of two (possibly negative) integers. One possible manner in which this can be accomplished is by repeatedly adding the value of x to itself a total of y times. This code is similar to the code in the calculator example in Chapter 10. Figure 12.2 lists the resulting LC-3 code generated by the LC-3 compiler. Assume that register 5 (R5) contains the memory address where variable x is allocated. Immediately prior to that location is where variable y is allocated (i.e., R5 - 1), and immediately prior to that is where variable z resides. While this method of allocating variables in memory might seem a little strange at first, we will explain this later in Section 12.5.2.

        AND R0, R0, #0     ; R0 <= 0
        LDR R1, R5, #0     ; load value of x
        LDR R2, R5, #-1    ; load value of y
        BRz DONE           ; if y is zero, we're done
        BRp LOOP           ; if y is positive, start mult

        NOT R1, R1         ; y is negative
        ADD R1, R1, #1     ; R1 <= -x
        NOT R2, R2
        ADD R2, R2, #1     ; R2 <= -y (-y is positive)

LOOP:   ADD R0, R0, R1     ; Multiply loop
        ADD R2, R2, #-1
        BRp LOOP           ; The result is in R0

DONE:   STR R0, R5, #-2    ; z = x * y;

Figure 12.2 The LC-3 code for C multiplication

12.3.1 Expressions and Statements
Before proceeding with our coverage of operators, we’ll diverge a little into C syntax to help clarify some syntactic notations used within C programs. We can combine variables and literal values with operators, such as the multiply operator from the previous example, to form a C expression. In the previous example, x * y is an expression.
Expressions can be grouped together to form a statement. For example, z = x * y; is a statement. Statements in C are like complete sentences in English. Just as a sentence captures a complete thought or action, a C statement expresses a complete unit of work to be carried out by the computer. All statements in C end with a semicolon character, ; (or as we’ll see in the next paragraph, a closing brace, }). The semicolon terminates a statement in much the same way a punctuation mark terminates a sentence in English. An interesting (or perhaps odd) feature of C is that it is possible to create statements that do not express any computation but are syntactically considered statements. The null statement is simply a semicolon and it accomplishes nothing.
One or more simple statements can be grouped together to form a compound statement, or block, by enclosing the simple statements within braces, { }. Syntactically, compound statements are equivalent to simple statements. We will see many real uses of compound statements in the next chapter.
The following examples show some simple, compound, and null statements.
z = x * y;      /* This statement accomplishes some work */

{               /* This is a compound statement */
  a = b + c;
  i = p * r * t;
}

k = k + 1;      /* This is another simple statement */

;               /* Null statement -- no work done here */
12.3.2 The Assignment Operator
We've already seen examples of C's assignment operator. Its symbol is the equal sign, =. The operator works by first evaluating the right-hand side of the assignment, and then assigning the value of the right-hand side to the object on the left-hand side. For example, in the C statement
a=b+c;
the value of variable a will be set equal to the value of the expression b + c. Notice that even though the arithmetic symbol for equality is the same as the C symbol for assignment, they have different meanings. In mathematics, by using the equal sign, =, one is making the assertion that the right-hand and left-hand expressions are equivalent. In C, using the = operator causes the compiler to generate code that will make the left-hand side change its value to equal the value of the right-hand side. In other words, the left-hand side is assigned the value of the right-hand side.
Let's examine what happens when the LC-3 C compiler generates code for a statement containing the assignment operator. The following C statement represents the increment by 4 of the integer variable x.
x = x + 4;
The LC-3 code for this statement is straightforward. Here, R5 contains the address of variable x.
LDR R0, R5, #0    ; Get the value of x
ADD R0, R0, #4    ; calculate x + 4
STR R0, R5, #0    ; x = x + 4;
In C, all expressions evaluate to a value of a particular type. From the previous example, the expression x + 4 evaluates to an integral value because we are adding an integer 4 to another integer (the variable x). This integer result is then assigned to an integer variable. But what would happen if we constructed an expression of mixed type, for example x + 4.3? The general rule in C is that mixed expressions like the one shown will be converted from integer to floating point. If an expression contains both integer and character types, it will be promoted to integer type. In general, in C shorter types are converted to longer types. What if we tried to assign an expression of one type to a variable of another, for example x = x + 4.3? In C, the type of a variable remains immutable (meaning it cannot be changed), so the expression is converted to the type of the variable. In this case, the floating point expression x + 4.3 is converted to integer. In C, floating point values are converted into integers by dropping the fractional part. For example, 4.3 will become 4 when converting from a floating point into an integer; 5.9 will become 5.
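These conversion rules can be observed directly. The following sketch uses two hypothetical functions, add_then_truncate and drop_fraction, that rely on the implicit conversions described above rather than on explicit casts.

```c
/* x + 4.3 is computed in floating point, then converted back to int
   when it is assigned to x, which drops the fractional part. */
int add_then_truncate(int x)
{
  x = x + 4.3;
  return x;
}

/* Assigning a floating point value to an int drops the fraction. */
int drop_fraction(double value)
{
  int whole = value;

  return whole;
}
```

Some compilers can be asked to warn about such implicit narrowing; writing the conversion explicitly as (int) value documents that the loss of the fraction is intended.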
12.3.3 Arithmetic Operators
The arithmetic operators are easy to understand. Many of the operations and corresponding symbols are ones to which we are accustomed, having used them since learning arithmetic in grade school. For instance, + performs addition, - performs subtraction, * performs multiplication (which is different from the symbol we are accustomed to for multiplication in order to avoid confusion with the letter x), and / performs division. Just as when doing arithmetic by hand, there is an order in which expressions are evaluated. Multiplication and division are evaluated first, followed by addition and subtraction. The order in which operators are evaluated is called precedence, and we discuss it in more detail in the next section. Following are several C statements formed using the arithmetic operators:
distance = rate * time;
netIncome = income - taxesPaid;
fuelEconomy = milesTraveled / fuelConsumed;
area = 3.14159 * radius * radius;
y = a*x*x + b*x + c;
C has another arithmetic operator that might not be as familiar to you as +, -, *, and /. It is the modulus operator, % (also known as the integer remainder operator). To illustrate its operation, consider what happens when we divide two integer values. When performing an integer divide in C, the fractional part is dropped and the integral part is the result. The expression 11 / 4 evaluates to 2. The modulus operator % can be used to calculate the integer remainder. For example, 11 % 4 evaluates to 3. Said another way, (11 / 4) * 4 + (11 % 4) is equal to 11. In the following example, all variables are integers.
quotient = x / y;     /* if x = 7 and y = 2, quotient = 3 */
remainder = x % y;    /* if x = 7 and y = 2, remainder = 1 */
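The relationship between integer division and modulus can be checked mechanically. The sketch below is illustrative only; the function name division_identity_holds is an assumption, and the divisor must not be zero.

```c
/* Returns 1 if (a / b) * b + (a % b) equals a, and 0 otherwise.
   The caller must pass a nonzero b. */
int division_identity_holds(int a, int b)
{
  int quotient = a / b;     /* fractional part is dropped */
  int remainder = a % b;    /* what is left over */

  return (quotient * b + remainder) == a;
}
```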
Table 12.1 lists all the arithmetic operations and their symbols. Multiplication, division, and modulus have higher precedence than addition and subtraction.
Table 12.1 Arithmetic Operators in C
Operator symbol   Operation        Example usage
*                 multiplication   x * y
/                 division         x / y
%                 modulus          x % y
+                 addition         x + y
-                 subtraction      x - y
12.3.4 Order of Evaluation
Before proceeding onwards to the next set of C operators, we diverge momentarily to answer an important question: What value is stored in x as a result of the following statement?
x = 2 + 3 * 4;
Precedence
Just as when doing arithmetic by hand, there is an order in which expressions are evaluated. This order is called operator precedence. For instance, when doing arithmetic, multiplication and division have higher precedence than addition and subtraction. For the arithmetic operators, the C precedence rules are the same as we were taught in grade-school arithmetic. In the preceding statement, x is assigned the value 14 because the multiplication operator has higher precedence than addition. That is, the expression evaluates as if it were 2 + (3 * 4).
Associativity
But what about operators of equal precedence? What does the following statement evaluate to?
x = 2 + 3 - 4 + 5;
Depending on which operator we evaluate first, the value of the expression 2 + 3 - 4 + 5 could equal 6 or it could equal -4. Since the precedence of both operators is the same (that is, addition has the same precedence as subtraction in C), we clearly need a rule on how such expressions should be evaluated in C. For operations of equal precedence, their associativity determines the order in which they are evaluated. In the case of addition and subtraction, both associate from left to right. Therefore 2 + 3 - 4 + 5 evaluates as if it were ((2 + 3) - 4) + 5.
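Both rules can be confirmed with a pair of one-line functions. The names precedence_example and associativity_example are made up for this sketch.

```c
/* Multiplication binds more tightly than addition. */
int precedence_example(void)
{
  return 2 + 3 * 4;        /* same as 2 + (3 * 4) */
}

/* Addition and subtraction group from left to right. */
int associativity_example(void)
{
  return 2 + 3 - 4 + 5;    /* same as ((2 + 3) - 4) + 5 */
}
```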
The complete set of precedence and associativity rules for all operators in C is provided in Table 12.5 at the end of this chapter and also in Table D.4. We suggest that you do not try to memorize this table (unless you enjoy quoting C trivia to your friends). Instead, it is important to realize that the precedence rules exist and to roughly comprehend the logic behind them. You can always refer to the table whenever you need to know the relationship between particular operators. There is a safeguard, however: parentheses.
Parentheses
Parentheses override the evaluation rules by specifying explicitly which operations are to be performed ahead of others. As in arithmetic, evaluation always begins at the innermost set of parentheses. We can surround a subexpression with parentheses if we want that subexpression to be evaluated first. So in the following example, say the variables a, b, c, and d are all equal to 4. The statement
x = a * b + c * d / 4;
could be written equivalently as
x = (a * b) + ((c * d) / 4);
For both statements, x is set to the value of 20. Here the program will always evaluate the innermost subexpression first and move outward before falling back on the precedence rules.
What value would the following expression evaluate to if a, b, c, and d equal 4?
x = a * (b + c) * d / 4;
Parentheses can help make code more readable, too. Most people reading your code are unlikely to have memorized C's precedence rules. For this reason, for long or complex expressions, it is often stylistically preferable to use parentheses, even if the code works fine without them.
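The effect of parentheses is easy to experiment with. The sketch below assumes the divisor 4 that appears in the parenthesized form of the statement; the three function names are hypothetical.

```c
/* Relies on precedence alone. */
int without_parentheses(int a, int b, int c, int d)
{
  return a * b + c * d / 4;
}

/* Same grouping as above, made explicit. */
int with_parentheses(int a, int b, int c, int d)
{
  return (a * b) + ((c * d) / 4);
}

/* Parentheses force b + c to be evaluated first. */
int regrouped(int a, int b, int c, int d)
{
  return a * (b + c) * d / 4;
}
```

Calling each function with all four arguments equal to 4 shows that the first two agree and that the third, regrouped form produces a different value.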
12.3.5 Bitwise Operators
We now return to our discussion of C operators. C has a set of operators called bitwise operators that manipulate bits of a value. That is, they perform a logical operation such as AND, OR, NOT, XOR across the individual bits of a value. For example, the C bitwise operator & performs an operation similar to the LC-3 AND instruction. That is, the & operator performs an AND operation bit by bit across the two input operands. The C operator | performs a bitwise OR. The operator ~ performs a bitwise NOT and takes only one operand (i.e., it is a unary operator). The operator ^ performs a bitwise XOR. Examples of expressions using these operators on 16-bit values follow.
0x1234 | 0x5678     /* equals 0x567C */
0x1234 & 0x5678     /* equals 0x1230 */
0x1234 ^ 0x5678     /* equals 0x444C */
~0x1234             /* equals 0xEDCB */
1234 & 5678         /* equals 1026 */
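On a machine where int is wider than 16 bits, the 16-bit results above can be reproduced by using a 16-bit unsigned type. The following sketch assumes the uint16_t type from the standard header stdint.h; the function names are hypothetical.

```c
#include <stdint.h>

uint16_t bit_and(uint16_t a, uint16_t b) { return a & b; }
uint16_t bit_or(uint16_t a, uint16_t b)  { return a | b; }
uint16_t bit_xor(uint16_t a, uint16_t b) { return a ^ b; }

/* The operand of ~ is promoted to int first, so the cast keeps
   only the low 16 bits of the result. */
uint16_t bit_not(uint16_t a)             { return (uint16_t)~a; }
```

Without the cast in bit_not, the complemented value would carry extra high-order one bits on a machine with 32-bit integers.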
C's set of bitwise operators includes two shift operators: <<, which performs a left shift, and >>, which performs a right shift. Both are binary operators, meaning they require two operands. The first operand is the value to be shifted and the second operand indicates the number of bit positions to shift by. On a left shift, the vacated bit positions of the value are filled with zeros; on a right shift, the value is sign-extended. The result is the value of the expression; neither of the two original operand values is modified. The following expressions provide examples of these two operators operating on 16-bit integers.
0x1234 << 3     /* equals 0x91A0 */
0x1234 >> 2     /* equals 0x048D */
1234 << 3       /* equals 9872 */
1234 >> 2       /* equals 308 */
0x1234 << 5     /* equals 0x4680 (result is 16 bits) */
0xFEDC >> 3     /* equals 0xFFDB (from sign-extension) */
Here we show several C statements formed using the bitwise operators. For all of C’s bitwise operators, neither operand can be a floating point value. For these statements, f, g, and h are integers.
h = f & g;      /* if f = 7, g = 8, h will equal 0 */
h = f | g;      /* if f = 7, g = 8, h will equal 15 */
h = f << 1;     /* if f = 7, g = 8, h will equal 14 */
h = g << f;     /* if f = 7, g = 8, h will equal 1024 */
h = ~f | ~g;    /* if f = 7, g = 8, h will equal -1 */
                /* because h is a signed integer */
Question: Say that on a particular machine, the integer x occupies 16 bits and has the value 1. What happens after the statement x = x << 16; is executed? Conceptually, we are shifting x by its data width, replacing all bits with 0. You might expect the value of x to be 0. To remain generic, C formally defines the result of shifting a value by its width (or more than its data width) as implementation-dependent. This means that the result might be 0 or it might not, depending on the system on which the code is executed.
Table 12.2 lists all the bitwise operations and their symbols. The operators are listed in order of precedence, the NOT operator having highest precedence, and the left and right shift operators having equal precedence, followed by AND, then XOR, then OR. They all associate from left to right. See Table 12.5 for a complete listing of operator precedence.
Table 12.2 Bitwise Operators in C
Operator symbol   Operation     Example usage
~                 bitwise NOT   ~x
<<                left shift    x << y
>>                right shift   x >> y
&                 bitwise AND   x & y
^                 bitwise XOR   x ^ y
|                 bitwise OR    x | y
12.3.6 Relational Operators
C has several operators to test the relationship between two values. As we will see in the next chapter, these operators are often used in C to generate conditional constructs.
Table 12.3 Relational Operators in C
Operator symbol   Operation               Example usage
>                 greater than            x > y
>=                greater than or equal   x >= y
<                 less than               x < y
<=                less than or equal      x <= y
==                equal                   x == y
!=                not equal               x != y

Table 12.4 Logical Operators in C
Operator symbol   Operation     Example usage
!                 logical NOT   !x
&&                logical AND   x && y
||                logical OR    x || y

Table 12.5 Operator Precedence, from Highest to Lowest. Descriptions of Some Operators are Provided in Parentheses
Associativity   Operators
l to r          () (function call)  [ ] (array index)  .  ->
r to l          ++  -- (postfix versions)
r to l          ++  -- (prefix versions)
r to l          * (indirection)  & (address of)  + (unary)  - (unary)  ~  !  sizeof
r to l          (type) (type cast)
l to r          * (multiplication)  /  %
l to r          + (addition)  - (subtraction)
l to r          <<  >>
l to r          <  >  <=  >=
l to r          ==  !=
l to r          &
l to r          ^
l to r          |
l to r          &&
l to r          ||
l to r          ? : (conditional expression)
r to l          =  +=  *=  etc.
12.3.8 Increment/Decrement Operators
C provides special operators to increment and decrement variables. The ++ operator increments a variable to the next higher value. The -- operator decrements it. For example, the expression x++ increments the value of integer variable x by 1. The expression x-- decrements the value of x by 1. Keep in mind that these operators modify the value of the variable itself. That is, x++ is similar to the operation x = x + 1.
The ++ and -- operators can be used on either side of a variable. The expression ++x operates in a slightly different order than x++. If x++ is part of a larger expression, then the value of x++ is the value of x prior to the increment, whereas the value of ++x is the incremented value of x. If the operator ++ appears before the variable, then it is used in prefix form. If it appears after the variable, it is in postfix form. The prefix forms are often referred to as preincrement and predecrement, whereas the postfix are postincrement and postdecrement.
Let’s examine a couple of examples:
x = 4;
y = x++;
Here, the integer variable x is incremented. However, the original value of x is assigned to the variable y (i.e., the value of x++ evaluates to the original value of x). After this code executes, the variable y will have the value 4, and x will be 5.
Similarly, the following code increments x.
x = 4;
y = ++x;
However, with this code, the expression ++x evaluates to the value after the increment. In this case, the value of both y and x will be 5.
This subtle distinction between the postfix and prefix forms is not too important to understand for now. For the few examples in this book that use these operators, the prefix and postfix forms of these operators can be used interchangeably. You can find a precise description of this difference in Appendix D.5.6.
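The two forms can be compared side by side. In the sketch below, the function names and the x_after parameter are assumptions made for illustration; the parameter is a pointer, a feature covered later in the book, used here only so that each function can report both y and the final value of x.

```c
/* Returns the value assigned to y; *x_after receives the final x. */
int postfix_example(int *x_after)
{
  int x = 4;
  int y = x++;    /* y gets the old value of x, then x is incremented */

  *x_after = x;
  return y;
}

int prefix_example(int *x_after)
{
  int x = 4;
  int y = ++x;    /* x is incremented first, then y gets the new value */

  *x_after = x;
  return y;
}
```

In both functions x ends up incremented by 1; only the value handed to y differs.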
12.3.9 Expressions with Multiple Operators
Thus far we’ve only seen examples of expressions with one or two operators. Real and useful expressions sometimes have more. We can combine various operators and operands to form complex expressions. The following example demonstrates a peculiar blend of operators forming a complex expression.
y = x & z + 3 || 9 - w % 6;
In order to figure out what this statement evaluates to, we need to examine the order of evaluation of operators. Table 12.5 lists all the C operators (including some that we have not yet covered but will cover later in this textbook) and their order of evaluation. According to precedence rules, this statement is equivalent to the following:
y = (x & (z + 3)) || (9 - (w % 6));
Another more useful expression that consists of multiple operators is given in the example that follows. In this example, if the value of the variable age is between 18 and 25, the expression evaluates to 1. Otherwise it is 0. Notice that even though the parentheses are not required to make the expression evaluate as we described, they do help make the code easier to read.
(18 <= age) && (age <= 25)
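Both expressions from this section can be wrapped in small functions and compared. The names in_age_range, peculiar, and peculiar_explicit are hypothetical. Some compilers warn about the unparenthesized form, which is itself a hint that the explicit version is easier to read.

```c
/* Evaluates to 1 when age is between 18 and 25 inclusive, else 0. */
int in_age_range(int age)
{
  return (18 <= age) && (age <= 25);
}

/* The expression as written, relying on precedence alone. */
int peculiar(int x, int z, int w)
{
  return x & z + 3 || 9 - w % 6;
}

/* The same expression with the grouping made explicit. */
int peculiar_explicit(int x, int z, int w)
{
  return (x & (z + 3)) || (9 - (w % 6));
}
```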
12.4 Problem Solving Using Operators
At this point, we have covered enough C operators to attempt a simple problem-solving exercise. For this problem, we will create a program that performs a simple network calculation: It calculates the amount of time required to transfer some number of bytes across a network with a particular transfer rate (provided in bytes per second). The twist to this problem is that transfer time is to be displayed as hours, minutes, and seconds.
We approach this problem by applying the decomposition techniques described in Chapter 6. That is, we will start with a very rough description of our program and continually refine it using the sequential, decision, and iteration constructs (see Chapter 6 if you need a refresher) until we arrive at something from which we can easily write C code. This technique is called top-down decomposition because we start with a rough description of the algorithm and refine it by breaking larger steps into smaller ones, eventually arriving at something that resembles a program. Many experienced programmers rely on their understanding of the lower levels of the system to help make good decisions on how to decompose a problem. That is, in order to reduce a problem into a program, good programmers rely on their understanding of the basic primitives of systems they are programming on. In our case (at this point), these basic primitives are variables of the three C types and the operations we can perform on them.
In the subsequent chapters, we will go through several problem-solving examples to illustrate this top-down process. In doing so, we hope to provide you with a sense of the mental process a programmer might use to solve such problems.
The very first step (step 0) we need to consider for all problems from now on is how we represent the data items that the program will need to manipulate. At this point, we get to select from the three basic C types: integer, character, and floating point. For this problem, we can represent our internal calculations with either floating point values or integers. Since we are ultimately interested in displaying the result as hours, minutes, and seconds, any fractional components of time are unnecessary. For example, displaying the total transfer time as 10.1 hours, 12.7 minutes, 9.3 seconds does not make sense. Rather, 10 hours, 18 minutes, 51 seconds is the preferred output. Because of this, the better choice of data type for the time calculation is integer (yes, there are rounding issues, but say we can ignore them for this calculation).

Figure 12.3 Stepwise refinement of a simple network transfer time problem
Step 1: Start -> Get input data -> Calculate results -> Output results -> Stop
Step 2: Start -> Get input data -> Calculate transfer time in seconds -> Convert seconds to hours, minutes, seconds -> Output results -> Stop
Step 3 (refinement of the conversion phase): Convert total seconds to hours -> Convert remaining seconds to minutes -> Calculate remaining seconds
Having chosen our data representations, we can now apply stepwise refinement to decompose the problem. Figure 12.3 shows our decomposition of this particular programming problem. Step 1 in the figure shows the initial formulation of the problem. It involves three phases: get input, calculate results, output results. In the first phase, we will query the user about the amount of data to be transferred (in bytes) and the transfer rate of the network (in bytes per second). In the second phase, we will perform all necessary calculations, which we will then output in the third phase.
Step 1 is not detailed enough to translate directly into C code, and therefore we perform another refinement of it in step 2. Here we realize that the calculation phase can be further refined into a subphase that first calculates total time in seconds (an easy calculation given the input data) and a subphase to convert total time in seconds into hours, minutes, and seconds.
Step 2 is still not complete enough for mapping into C; we perform another refinement of it in step 3. Most phases of step 2 are simple enough to convert into C, except for the conversion of seconds into hours, minutes, and seconds. In step 3, we refine this phase into three subphases. First we will calculate total hours based on the total number of seconds. Second, we will use the remaining seconds to calculate minutes. Finally, we determine the remaining number of seconds after the minutes have been calculated.
Based on the total breakdown of the problem after three steps of refinement presented in Figure 12.3, it should be fairly straightforward to map out the C code. The complete C program for this problem is presented in Figure 12.4.
12.5 Tying It All Together
We've now covered all the basic C types and operators that we plan to use throughout this textbook. Having completed this first exposure, we are now ready to examine these concepts from the compiler's viewpoint. That is, how does a compiler translate code containing variables and operators into machine code? There are two basic mechanisms that help the compiler do its job of translation. The compiler makes heavy use of a symbol table to keep track of variables during compilation. The compiler also follows a systematic partitioning of memory: it carefully allocates memory to these variables based on certain characteristics, with certain regions of memory reserved for objects of a particular class. In this section, we'll take a closer look at these two processes.
12.5.1 Symbol Table
In Chapter 7, we examined how the assembler systematically keeps track of labels within an assembly program by using a symbol table. Like the assembler, the C compiler keeps track of variables in a program with a symbol table. Whenever the compiler reads a variable declaration, it creates a new entry in its symbol table corresponding to the variable being declared. The entry contains enough information for the compiler to manage the storage allocation for the variable and generation of the proper sequence of machine code whenever the variable is used in the program. Each symbol table entry for a variable contains (1) its name, (2) its type, (3) the place in memory the variable has been allocated storage, and (4) an identifier to indicate the block in which the variable is declared (i.e., the scope of the variable).

#include <stdio.h>

int main()
{
  int amount;   /* The number of bytes to be transferred */
  int rate;     /* The average network transfer rate */
  int time;     /* The time, in seconds, for the transfer */

  int hours;    /* The number of hours for the transfer */
  int minutes;  /* The number of mins for the transfer */
  int seconds;  /* The number of secs for the transfer */

  /* Get input: number of bytes and network transfer rate */
  printf("How many bytes of data to be transferred? ");
  scanf("%d", &amount);

  printf("What is the transfer rate (in bytes/sec)? ");
  scanf("%d", &rate);

  /* Calculate total time in seconds */
  time = amount / rate;

  /* Convert time into hours, minutes, seconds */
  hours = time / 3600;              /* 3600 seconds in an hour */
  minutes = (time % 3600) / 60;     /* 60 seconds in a minute */
  seconds = ((time % 3600) % 60);   /* remainder is seconds */

  /* Output results */
  printf("Time : %dh %dm %ds\n", hours, minutes, seconds);
}

Figure 12.4 A C program that performs a simple network rate calculation
Figure 12.5 shows the symbol table entries corresponding to the variables declared in the network rate calculation program in Figure 12.4. Since this program contains six variable declarations, the compiler ends up with six entries in its symbol table for them. Notice that the compiler records a variable's location in memory as an offset, with most offsets being negative. This offset indicates the relative position of the variable within the region of memory in which it is allocated.
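To make the idea concrete, the following is a minimal sketch of how such a table might be represented in C. It assumes a fixed array of entries searched linearly by name and scope; the names SymbolEntry, symbolTable, and lookupSymbol, as well as the choice to store the type and scope as text, are ours for illustration and are not taken from any actual compiler. The offsets are the ones listed in Figure 12.5.

```c
#include <stddef.h>
#include <string.h>

/* One entry per declared variable (hypothetical layout, for illustration). */
struct SymbolEntry {
    const char *name;    /* identifier                                */
    const char *type;    /* type, kept as text for simplicity         */
    int         offset;  /* location, as an offset within its region  */
    const char *scope;   /* block in which the variable is declared   */
};

/* Entries for the six variables declared in main (Figure 12.4). */
static const struct SymbolEntry symbolTable[] = {
    { "amount",  "int",  0, "main" },
    { "hours",   "int", -3, "main" },
    { "minutes", "int", -4, "main" },
    { "rate",    "int", -1, "main" },
    { "seconds", "int", -5, "main" },
    { "time",    "int", -2, "main" }
};

/* Return the entry for `name` declared in `scope`, or NULL if there is none. */
const struct SymbolEntry *lookupSymbol(const char *name, const char *scope)
{
    size_t count = sizeof symbolTable / sizeof symbolTable[0];
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(symbolTable[i].name, name) == 0 &&
            strcmp(symbolTable[i].scope, scope) == 0) {
            return &symbolTable[i];
        }
    }
    return NULL;
}
```

With this table, lookupSymbol("seconds", "main") yields the entry whose offset is -5, which is the value a compiler would place in the offset field of an LDR or STR instruction that accesses seconds.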
Figure 12.5 The compiler’s symbol table when it compiles the program from Chapter 11
12.5.2 Allocating Space for Variables
There are two regions of memory in which C variables are allocated storage: the global data section and the run-time stack.2 The global data section is where all global variables are stored. More generally, it is where variables of the static storage class are allocated (we say more about this in Section 12.6). The run-time stack is where local variables (of the default automatic storage class) are allocated storage.
The offset field in the symbol table provides the precise information about where in memory variables are actually stored. The offset field simply indicates how many locations from the base of the section a variable is allocated storage.
For instance, if a global variable earth has an offset of 4 and the global data section starts at memory location 0x5000, then earth is stored in location 0x5004. All our examples of compiler-generated machine code use R4 to contain the address of the beginning of the global data section; R4 is referred to as the global pointer. Loading the variable earth into R3, for example, can be accomplished with the following LC-3 instruction:
LDR R3, R4, #4
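The address calculation performed by such a base+offset access can be sketched in C as follows. This is only a model under our own assumptions: memory is simulated as an array of 16-bit words, and the function names effectiveAddress, loadWord, and storeWord are hypothetical helpers, not part of the LC-3 or of any library.

```c
#include <stdint.h>

/* Simulated word-addressable memory with 2^16 locations of 16 bits each. */
static uint16_t memory[65536];

/* Address touched by a base+offset access such as LDR R3, R4, #4. */
uint16_t effectiveAddress(uint16_t base, int offset)
{
    return (uint16_t)(base + offset);
}

/* The value that an LDR would place in its destination register. */
uint16_t loadWord(uint16_t base, int offset)
{
    return memory[effectiveAddress(base, offset)];
}

/* The counterpart of an STR: write a value at base+offset. */
void storeWord(uint16_t base, int offset, uint16_t value)
{
    memory[effectiveAddress(base, offset)] = value;
}
```

If the global pointer holds 0x5000, then effectiveAddress(0x5000, 4) is 0x5004, the location of earth in the example above. The same calculation applies to negative offsets from a base register.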
If earth is instead a local variable, say for example in the function main, the story is slightly more complicated. All local variables for a function are allocated in a "memory template" called an activation record or stack frame. For now, we'll examine the format of an activation record and leave the motivation for why we need it for Chapter 14 when we discuss functions. An activation record is a region of contiguous memory locations that contains all the local variables for a given function. Every function has an activation record (or more precisely, every invocation of a function has an activation record; more on this later).
2 For examples in this textbook, all variables will be assigned a memory location. However, real compilers perform code optimizations that attempt to allocate variables in registers. Since registers take less time to access than memory, the program will run faster if frequently accessed values are put into registers.
Identifier   Type   Location (as an offset)   Scope   Other info…
amount       int     0                        main    …
hours        int    -3                        main    …
minutes      int    -4                        main    …
rate         int    -1                        main    …
seconds      int    -5                        main    …
time         int    -2                        main    …
Location x0000
   seconds
   minutes
   hours
   time
   rate
   amount      <- R5
Location xFFFF

Figure 12.6 An example of an activation record in the LC-3's memory. This function has six local variables. R5 is the frame pointer and points to the first local variable
Whenever we are executing a particular function, the highest memory address of the activation record will be stored in R5; R5 is called the frame pointer. For example, the activation record for the function main from the code in Figure 12.4 is shown in Figure 12.6. Notice that the variables are allocated in the record in the reverse of the order in which they are declared. Since the variable amount is declared first, it appears nearest to the frame pointer R5.
If we make a reference to a particular local variable, the compiler will use the variable’s symbol table entry to generate the proper code to access it. In particular, the offset in the variable’s symbol table entry indicates where in the activation record the variable has been allocated storage. To access the variable seconds, the compiler would generate the instruction:
LDR R0, R5, #-5
A preview of things to come: Whenever we call a function in C (in C, subroutines are called functions), the activation record for the function is pushed onto the run-time stack. That is, the function's activation record is allocated on top of the stack. R5 is appropriately adjusted to point to the base of the record, so any code within the function that accesses local variables will now work correctly. Whenever the function completes and control is about to return to the caller, the activation record is popped off the stack. R5 is adjusted to point to the caller's activation record. Throughout all of this, R6 always contains the address of the top of the run-time stack; it is called the stack pointer. We will revisit this in more detail in Chapter 14.
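The pushing and popping described above can be modeled with a few lines of C. The sketch below is deliberately simplified and rests on assumptions of our own: an activation record holds only local variables, the stack starts at the arbitrary address xF000, and the caller's frame pointer is remembered in a separate array rather than in the record itself. The actual layout of an activation record is the subject of Chapter 14. All names (pushRecord, popRecord, and so on) are hypothetical.

```c
#include <stdint.h>

#define MAX_DEPTH 16                  /* arbitrary limit for this sketch */

static uint16_t R5 = 0;               /* frame pointer */
static uint16_t R6 = 0xF000;          /* stack pointer; start address is arbitrary */
static uint16_t callerR5[MAX_DEPTH];  /* callers' frame pointers, kept outside
                                         the simulated stack for simplicity */
static int depth = 0;

/* Push a record with numLocals (at least 1) locals. Returns 0 on success. */
int pushRecord(int numLocals)
{
    if (numLocals < 1 || depth >= MAX_DEPTH) {
        return -1;
    }
    callerR5[depth] = R5;
    depth = depth + 1;
    R5 = (uint16_t)(R6 - 1);          /* first local goes in the next free location */
    R6 = (uint16_t)(R6 - numLocals);  /* the stack grows toward x0000 */
    return 0;
}

/* Pop the current record and restore the caller's frame pointer. */
int popRecord(void)
{
    if (depth == 0) {
        return -1;
    }
    R6 = (uint16_t)(R5 + 1);          /* release every local of this record */
    depth = depth - 1;
    R5 = callerR5[depth];
    return 0;
}

/* Address of the local variable stored at the given offset from R5. */
uint16_t localAddress(int offset)
{
    return (uint16_t)(R5 + offset);
}

uint16_t framePointer(void) { return R5; }
uint16_t stackPointer(void) { return R6; }
```

Pushing a record with six locals from the starting state leaves R5 at xEFFF and R6 at xEFFA, so the local at offset -5 sits at the top of the stack. Popping the record returns R6 to xF000.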
Figure 12.7 shows the organization of the LC-3's memory when a program is running. Many UNIX-based systems arrange their memory space similarly.
x0000
   System space
   Program text                               <- PC
   Global data section                        <- R4
   Heap (for dynamically allocated memory)
   Run-time stack                             <- R6 (Stack pointer), R5 (Frame pointer)
   System space
xFFFF

Figure 12.7 The LC-3 memory map showing various sections active during program execution
The program itself occupies a region of memory (labelled Program text in the diagram); so do the run-time stack and the global data section. There is another region reserved for dynamically allocated data called the heap (we will discuss this region in Chapter 19). Both the run-time stack and the heap can change size as the program executes. For example, whenever one function calls another, the run-time stack grows because we push another activation record onto the stack. In fact, it grows toward memory address x0000. In contrast, the heap grows toward xFFFF. Since the stack grows toward x0000, the organization of an activation record appears to be "upside-down": that is, the first local variable appears at the memory location pointed to by R5, the next one at R5 - 1, the subsequent one at R5 - 2, and so forth (as opposed to R5, R5 + 1, R5 + 2, etc.).
During execution, the PC points to a location in the program text, R4 points to the beginning of the global data section, R5 points within the run-time stack, and R6 points to the very top of the run-time stack. There are certain regions of memory, marked System space in Figure 12.7, that are reserved for the operating system, for things such as TRAP routines, vector tables, I/O registers, and boot code.
12.5.3 A Comprehensive Example
Now that we have examined the LC-3 compiler’s techniques for tracking and allocating space for variables in memory, let’s take a look at a comprehensive C example and its translation into LC-3 code.
Figure 12.8 is a C program that performs some simple operations on integer variables and then outputs the results of these operations. The program contains one global variable, inGlobal, and three local variables, inLocal, outLocalA, and outLocalB, which are local to the function main.
The program starts off by assigning initial values to inLocal and inGlobal. After the initialization step, the variables outLocalA and outLocalB are updated based on two calculations performed using inLocal and inGlobal. After the calculation step, the values of outLocalA and outLocalB are output using the printf library function. Notice that because we are using printf, we must include the standard I/O library header file, stdio.h.
When analyzing this code, the LC-3 C compiler will assign the global variable inGlobal the first available spot in the global data section, which is at offset 0. When analyzing the function main, it will assign inLocal to offset 0, outLocalA to offset -1, and outLocalB to offset -2 within main's activation record.
/* Include the standard I/O header file */
#include <stdio.h>
AND R0, R0, #0
ADD R0, R0, #5     ; inLocal is at offset 0
STR R0, R5, #0     ; inLocal = 5;

AND R0, R0, #0
ADD R0, R0, #3     ; inGlobal is at offset 0, in globals
STR R0, R4, #0     ; inGlobal = 3;

LDR R0, R5, #0     ; get value of inLocal
LDR R1, R4, #0     ; get value of inGlobal
NOT R1, R1         ; ~inGlobal
AND R2, R0, R1     ; calculate inLocal & ~inGlobal
STR R2, R5, #-1    ; outLocalA = inLocal & ~inGlobal;
                   ; outLocalA is at offset -1

LDR R0, R5, #0     ; get value of inLocal
LDR R1, R4, #0     ; get value of inGlobal
ADD R0, R0, R1     ; calculate inLocal + inGlobal

LDR R2, R5, #0     ; get value of inLocal
LDR R3, R4, #0     ; get value of inGlobal
NOT R3, R3
ADD R3, R3, #1     ; calculate -inGlobal

ADD R2, R2, R3     ; calculate inLocal - inGlobal
NOT R2, R2
ADD R2, R2, #1     ; calculate -(inLocal - inGlobal)

ADD R0, R0, R2     ; (inLocal + inGlobal) - (inLocal - inGlobal)
STR R0, R5, #-2    ; outLocalB = ...
                   ; outLocalB is at offset -2
Figure 12.10 The LC-3 code for the C program in Figure 12.8
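As a cross-check on the listing, the register-level steps can be mirrored one for one in C. In the sketch below, the variables r0 through r3 stand in for the LC-3 registers and int16_t models the LC-3's 16-bit word; the function names are hypothetical, and the loads and stores to memory are replaced by parameters and return values.

```c
#include <stdint.h>

/* Mirrors the instructions that compute outLocalA. */
int16_t computeOutLocalA(int16_t inLocal, int16_t inGlobal)
{
    int16_t r0 = inLocal;              /* LDR R0, R5, #0  */
    int16_t r1 = inGlobal;             /* LDR R1, R4, #0  */
    int16_t r2;

    r1 = (int16_t)~r1;                 /* NOT R1, R1      */
    r2 = (int16_t)(r0 & r1);           /* AND R2, R0, R1  */
    return r2;                         /* STR R2, R5, #-1 */
}

/* Mirrors the instructions that compute outLocalB. Subtraction is done
   the LC-3 way: negate with NOT followed by ADD #1, then add. */
int16_t computeOutLocalB(int16_t inLocal, int16_t inGlobal)
{
    int16_t r0 = inLocal;              /* LDR R0, R5, #0  */
    int16_t r1 = inGlobal;             /* LDR R1, R4, #0  */
    int16_t r2 = inLocal;              /* LDR R2, R5, #0  */
    int16_t r3 = inGlobal;             /* LDR R3, R4, #0  */

    r0 = (int16_t)(r0 + r1);           /* ADD R0, R0, R1  */
    r3 = (int16_t)~r3;                 /* NOT R3, R3      */
    r3 = (int16_t)(r3 + 1);            /* ADD R3, R3, #1  */
    r2 = (int16_t)(r2 + r3);           /* ADD R2, R2, R3  */
    r2 = (int16_t)~r2;                 /* NOT R2, R2      */
    r2 = (int16_t)(r2 + 1);            /* ADD R2, R2, #1  */
    r0 = (int16_t)(r0 + r2);           /* ADD R0, R0, R2  */
    return r0;                         /* STR R0, R5, #-2 */
}
```

With inLocal equal to 5 and inGlobal equal to 3, as in the listing, the first function yields 4 (binary 101 ANDed with the complement of 011) and the second yields 6, which is (5 + 3) - (5 - 3).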
The modifier short can be used to create variables that are smaller than the default size, which can be useful when trying to conserve memory space when handling data that does not require the full range of the default data type. The following example demonstrates how the variations are declared:
long double particlesInUniverse;
long int worldPopulation;
short int ageOfStudent;
Because the size of the three basic C types is closely tied to the types supported by the underlying ISA, many compilers only support the modifiers long and short if the computer's ISA supports these size variations. Even though a variable can be declared as a long int, it may be equivalent to a regular int if the underlying ISA has no support for longer versions of the integer data type. See Appendix D.3.2 for more examples and additional information on long and short.
Another useful variation of the basic int data type is the unsigned integer. We can declare an unsigned integer using the unsigned type modifier. With unsigned integers, all bits are used to represent nonnegative integers (i.e., positive numbers and zero). In the LC-3, for instance, which has 16-bit integers, an unsigned integer has a value between 0 and 65,535. When dealing with real-world objects that by nature do not take on negative values, unsigned integers might be the data type of choice. The following are examples of unsigned integers:
unsigned int numberOfDays;
unsigned int populationSize;
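The range quoted above follows directly from the number of bits. A small sketch, assuming the fixed-width types of <stdint.h> are available to model a 16-bit integer on whatever machine the code is compiled for (the function names are ours):

```c
#include <stdint.h>

/* Largest value of an unsigned integer that is `bits` wide: 2^bits - 1.
   Valid for 1 <= bits <= 31 in this sketch. */
unsigned long unsignedMax(int bits)
{
    return (1UL << bits) - 1UL;
}

/* Largest and smallest values of a 2's complement integer of that width. */
long signedMax(int bits)
{
    return (1L << (bits - 1)) - 1L;
}

long signedMin(int bits)
{
    return -(1L << (bits - 1));
}

/* Unsigned arithmetic on a 16-bit quantity wraps around modulo 2^16. */
uint16_t addUnsigned16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}
```

For 16 bits, unsignedMax gives 65,535, while the signed range runs from -32,768 to 32,767. Adding 1 to the largest 16-bit unsigned value wraps around to 0.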
Following are some sample variations of the three basic types:
long int ounces;
short int gallons;
long double veryVeryLargeNumber = 4.12936E361;
unsigned int sizeOfClass = 900;
float oType = 9.24;
float tonsOfGrain = 2.998E8;
12.6.2 Literals, Constants, and Symbolic Values
In C, variables can also be declared as constants by adding the const qualifier before the type specifier. These constants are really variables whose values do not change during the execution of a program. For example, in writing a program that calculates the area and circumference of a circle of a given radius, it might be useful to create a floating point constant called pi initialized to the value 3.14159. Figure 12.11 contains an example of such a program.
This example is useful for making a distinction between three types of constant values that often appear in C code. Literal constants are unnamed values that appear literally in the source code. In the circle example, the values 2 and 3.14159 are examples of literal constants. In C, we can represent literal constants in hexadecimal by prepending 0x to them, for example 0x1DB. ASCII literals require single quotes around them, as for example 'R', which is the ASCII value of the character R. Floating point literals can use the exponential notation described in Section 12.2.1. An example of the second type of constant value is pi, which is declared as a constant value using a variable declaration with the const qualifier. The third type of constant value is created using the preprocessor directive #define, an example of which is the symbolic value RADIUS. All three types create values that do not change during the execution of a program.
#include <stdio.h>

#define RADIUS 15.0   /* This value is in centimeters */

int main()
{
   const double pi = 3.14159;
   double area;
   double circumference;

   /* Calculations */
   area = pi * RADIUS * RADIUS;       /* area = pi*r^2 */

   circumference = 2 * pi * RADIUS;   /* circumference = 2*pi*r */

   printf("Area of a circle with radius %f cm is %f cm^2\n", RADIUS, area);

   printf("Circumference of the circle is %f cm\n", circumference);
}

Figure 12.11 A C program that computes the area and circumference of a circle with a radius of 15 cm
The distinction between constants declared using const and symbolic values defined using #define might seem a little subtle to you. Using one versus another is really a matter of programming style rather than function. Declared constants are used for things we traditionally think of as constant values, which are values that never change. The constant pi is an example. Physical constants such as the speed of light, or the number of days in a week, are conventionally represented by declared constants.
Values that stay constant during a single execution of the program but which might be different from user to user, or possibly from invocation to invocation, are represented by symbolic values using #define. Such values can be thought of as parameters for the program. For example, RADIUS in Figure 12.11 can be changed and the program recompiled, then re-executed.
In general, naming a constant using const or #define is preferred over leaving the constant as a literal in your code. Names convey more meaning about your code than unnamed literal values.
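The three kinds of constant values can be placed side by side in a short sketch. The function names are hypothetical, and the check on 'R' assumes the ASCII character set:

```c
#define RADIUS 15.0              /* symbolic value, substituted by the preprocessor */

/* Uses a declared constant (pi), a symbolic value (RADIUS), and no
   unnamed "magic numbers" in the calculation itself. */
double circleArea(void)
{
    const double pi = 3.14159;   /* declared constant */
    return pi * RADIUS * RADIUS;
}

/* Literal constants written in three different notations. */
int hexLiteral(void)
{
    return 0x1DB;                /* hexadecimal literal, 475 in decimal */
}

int asciiLiteral(void)
{
    return 'R';                  /* character literal, 82 in ASCII */
}

double exponentialLiteral(void)
{
    return 2.998E8;              /* floating point literal, 299800000.0 */
}
```

Changing RADIUS requires editing only the #define line and recompiling; an attempt to assign a new value to pi inside circleArea would be rejected by the compiler because of the const qualifier.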
12.6.3 Storage Class
Earlier in the chapter, we mentioned three basic properties of a C variable: its identifier, its type, and its scope. There is another: storage class. The storage class of a variable indicates how the C compiler allocates its storage, and in particular indicates whether or not the variable loses its value when the block that contains it has completed execution. There are two storage classes in C: static and automatic.
Static variables retain their values between invocations. Automatic variables lose their values when their block terminates. In C, global variables are of static storage class, that is, they retain their value until the program ends. Local variables are by default of automatic storage class. Local variables can be declared as static class variables by using the static modifier on the declaration. For example, the variable declared by static int localVar; will retain its value even when its function completes execution. If the function is executed again (during the same program execution), localVar will retain its previous value. In particular, the use of the static keyword on a local variable causes the compiler to allocate storage for the variable in the global data section, while keeping it private to its block. See Appendix D.3.3 for additional examples on storage class.
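The difference between the two storage classes shows up as soon as a function is called more than once. A minimal sketch (the function names are hypothetical):

```c
/* The static local is initialized once and keeps its value between calls. */
int countWithStatic(void)
{
    static int calls = 0;
    calls = calls + 1;
    return calls;
}

/* The automatic local is created and initialized anew on every call. */
int countWithAutomatic(void)
{
    int calls = 0;
    calls = calls + 1;
    return calls;
}
```

Successive calls to countWithStatic return 1, 2, 3, and so on, whereas every call to countWithAutomatic returns 1. In both functions the variable calls is visible only inside its own function.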
12.6.4 Additional C Operators
The C programming language has a collection of unusual operators, which have become a trademark of C programming. Most of these operators are combinations of operators we have already seen. The combinations are such that they make expressing commonly used computations even simpler. However, to someone who is not accustomed to the shorthand notation of these operators, reading and trying to understand C code that contains them can be difficult.
Assignment Operators
C also allows certain arithmetic and bitwise operators to be combined with the assignment operator. For instance, if we wanted to add 29 to variable x, we could use the shorthand operator += as follows:
   x += 29;
This code is equivalent to
   x = x + 29;
Table 12.6 lists some of the special operators provided by C. The postfix operators have highest precedence, followed by prefix. The assignment operators have lowest precedence. Each group associates from right to left.
Operator symbol   Operation                 Example usage
+=                add and assign            x += y
-=                subtract and assign       x -= y
*=                multiply and assign       x *= y
/=                divide and assign         x /= y
%=                modulus and assign        x %= y
&=                and and assign            x &= y
|=                or and assign             x |= y
^=                xor and assign            x ^= y
<<=               left-shift and assign     x <<= y
>>=               right-shift and assign    x >>= y
More examples are as follows:
   h += g;    /* Equivalent to h = h + g;  */
   h %= f;    /* Equivalent to h = h % f;  */
   h <<= 3;   /* Equivalent to h = h << 3; */
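To confirm that the shorthand and longhand forms agree, the three examples can be applied in sequence both ways. This is a sketch with hypothetical function names; it assumes f is nonzero and that h stays nonnegative so that the shift is well defined.

```c
/* The three statements written with the shorthand operators. */
int applyShorthand(int h, int g, int f)
{
    h += g;
    h %= f;
    h <<= 3;
    return h;
}

/* The same three statements written out in full. */
int applyLonghand(int h, int g, int f)
{
    h = h + g;
    h = h % f;
    h = h << 3;
    return h;
}
```

Starting from h = 10, g = 4, and f = 5, both functions compute 14, then 4, then 32.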
Conditional Expressions
Conditional expressions are a unique feature of C that allow for simple decisions to be made with a simple expression. The symbols for the conditional expression are the question mark and colon, ? and :. The following is an example:
   x = a ? b : c;
Here variable x will get either the value of b or the value of c based on the logical value of a. If a is nonzero, x will get the value of b. Otherwise, it will get the value of c.
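A conditional expression behaves like an if-else statement (Section 13.2.2) that selects which value to assign. The sketch below, with hypothetical function names, writes the same selection both ways:

```c
/* Selection written as a conditional expression. */
int selectWithConditional(int a, int b, int c)
{
    int x;

    x = a ? b : c;
    return x;
}

/* The same selection written as an if-else statement. */
int selectWithIfElse(int a, int b, int c)
{
    int x;

    if (a)
        x = b;
    else
        x = c;
    return x;
}
```

Any nonzero value of a, including a negative one, selects b; only a value of zero selects c.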
Figure 12.12 is a complete program that uses a conditional expression to calculate the maximum of two integers. The maximum of these two input values is determined by a conditional expression and is assigned to the variable maxValue. The value of maxValue is output using printf.
#include <stdio.h>

int main()
{
   int maxValue;
   int input1;
   int input2;

   printf("Input an integer: ");
   scanf("%d", &input1);
   printf("Input another integer: ");
   scanf("%d", &input2);

   maxValue = (input1 > input2) ? input1 : input2;
   printf("The larger number is %d\n", maxValue);
}

Figure 12.12 A C program that uses a conditional expression

12.7 Summary
We conclude this chapter by summarizing the three key concepts we covered.
• Variables in C. The C programming language supports variables of three basic types: integers (int), characters (char), and floating point numbers (double). C, like all other high-level languages, provides the programmer the ability to provide symbolic names to these variables. Variables in C can be locally declared within a block of code (such as a function) or globally visible by all blocks.
• Operators in C. C's operators can be categorized by the function they perform: assignment, arithmetic, bitwise manipulations, logical and relational tests. We can form expressions using variables and operators such that the expressions get evaluated according to precedence and associativity rules. Expressions are grouped into statements, which express the work the program is to perform.
• Translating C Variables and Operators into LC-3 Code. Using a symbol table to keep track of variable declarations, a compiler will allocate local variables for a function within an activation record for the function. The activation record for the function is pushed onto the run-time stack whenever the function is executed. Global variables in a program are allocated in the global data section.

Exercises
12.1 Generate the compiler's symbol table for the following code. Assume all variables occupy one location in memory.
   double ff;
   char cc;
   int ii;
   char dd;
12.2 The following variable declaration appears in a program:
   int r;
a. If r is a local variable, to what value will it be initialized?
b. If r is a global variable, to what value will it be initialized?
12.3 What are the ranges for the following two variables if they are stored as 32-bit quantities?
   int plusOrMinus;
   unsigned int positive;
12.4 Evaluate the following floating point literals. Write their values in standard decimal notation.
a. 111E-11
b. -0.00021E4
c. 101.101E0
12.5 Write the LC-3 code that would result if the following local variable declarations were compiled using the LC-3 C compiler:
   char c = 'a';
   int x = 3;
   int y;
   int z = 10;
12.6 For the following code, state the values that are printed out by each printf statement. The statements are executed in the order A, B, C, D.
   int t;   /* This variable is global */

   {
      int t = 2;

      printf("%d\n", t);      /* A */
      {
         printf("%d\n", t);   /* B */
         t = 3;
      }
      printf("%d\n", t);      /* C */
   }

   {
      printf("%d\n", t);      /* D */
   }
12.7 Given that a and b are both integers where a and b have been assigned the values 6 and 9, respectively, what is the value of each of the following expressions? Also, if the value of a or b changes, give their new value.
a. a | b
b. a || b
c. a & b
d. a && b
e. !(a + b)
f. a % b
g. b / a
h. a = b
i. a = b = 5
j. ++a + b--
k. a = (++b < 3) ? a : b
l. a <<= b
12.8 For the following questions, write a C expression to perform the following relational test on the character variable letter.
a. Test if letter is any alphabetic character or a number.
b. Test if letter is any character except an alphabetic character or a number.
12.9
a. What does the following statement accomplish? The variable letter is a character variable.
   letter = ((letter >= 'a' && letter <= 'z') ? '!'
b. Modify the statement in (a) so that it converts lowercase to uppercase.
12.10 Write a program that reads an integer from the keyboard and displays a 1 if it is divisible by 3 or a 0 otherwise.
12.11 Explain the differences between the following C statements:
a. j = i++;
b. j = ++i;
c. j = i + 1;
d. i += 1;
e. j = i += 1;
f. Which statements modify the value of i? Which ones modify the value of j? If i = 1 and j = 0 initially, what will the values of i and j be after each statement is run separately?
12.12 Say variables a and b are both declared locally as long int.
a. Translate the expression a + b into LC-3 code, assuming a long int occupies two bytes. Assume a is allocated at offset 0 and b is at offset -1 in the activation record for their function.
b. Translate the same expression, assuming a long int occupies four bytes, a is allocated at offset 0, and b is at offset -2.
12.13 If initially, a = 1, b = 1, c = 3, and result = 999, what are the values of the variables after the following C statement is executed?
   result = b + 1 | c + a;
12.14 Recall the machine busy example from Chapter 2. Say the integer variable machineBusy tracks the busyness of all 16 machines. Recall that a 0 in a particular bit position indicates the machine is busy and a 1 in that position indicates the machine is idle.
a. Write a C statement to make machine 5 busy.
b. Write a C statement to make machine 10 idle.
c. Write a C statement to make machine n busy. That is, the machine that has become busy is an integer variable n.
d. Write a C expression to check if machine 3 is idle. If it is idle, the expression returns a 1. If it is busy, the expression returns a 0.
e. Write a C expression that evaluates to the number of idle machines. For example, if the binary pattern in machineBusy were 1011 0010 1110 1001, then the expression will evaluate to 9.
12.15 What purpose does the semicolon serve in C?
12.16 Say we are designing a new computer programming language that includes the operators @, #, $ and u. How would the expression w @ x # y $ z u a get evaluated under the following constraints?
a. The precedence of @ is higher than # is higher than $ is higher than u. Use parentheses to indicate the order.
b. The precedence of # is higher than u is higher than @ is higher than $.
c. Their precedence is all the same, but they associate left to right.
d. Their precedence is all the same, but they associate right to left.
12.17 Notice that the C assignment operators have the lowest precedence. Say we have developed a new programming language called Q that works exactly like C, except that the assignment operator has the highest precedence.
a. What is the result of the following Q statement? In other words, what would the value of x be after it executed?
   x = x + 1;
b. How would we change this Q statement so that it works the same way as it would in C?
12.18 Modify the example program in Chapter 11 (Figure 11.3) so that it prompts the user to type a character and then prints every character from that character down to the character ! in the order they appear in the ASCII table.
12.19 Write a C program to calculate the sales tax on a sales transaction. Prompt the user to enter the amount of the purchase and the tax rate. Output the amount of sales tax and the total amount (including tax) on the whole purchase.
12.20 Suppose a program contains the two integer variables x and y, which have values 3 and 4, respectively. Write C statements that will exchange the values in x and y so that after the statements are executed, x is equal to 4 and y is equal to 3.
a. First, write this routine using a temporary variable for storage.
b. Now rewrite this routine without using a temporary variable.
chapter 13
Control Structures
13.1 Introduction
In Chapter 6, we introduced our top-down problem-solving methodology where a problem is systematically refined into smaller, more detailed subtasks using three programming constructs: the sequential construct, the conditional construct, and the iteration construct.
We applied this methodology in the previous chapter to derive a simple C program that calculates network transfer time. The problem's refinement into a program only required the use of the sequential construct. For transforming more complex problems into C programs, we will need a way to invoke the conditional and iteration constructs in our programs. In this chapter, we cover C's version of these two constructs.
We begin this chapter by describing C's conditional constructs. The if and if-else statements allow us to conditionally execute a statement. After conditional constructs, we move on to C's iteration constructs: the for, the while, and the do-while statements, all of which allow us to express loops. With many of these constructs, we will present the corresponding LC-3 code generated by our hypothetical LC-3 C compiler to better illustrate how these constructs behave at the lower levels. C also provides additional control constructs, such as the switch, break, and continue statements, all of which provide a convenient way to represent some particular control tasks. We discuss these in Section 13.5. In the final part of the chapter, we'll use the top-down problem-solving methodology to solve some complex problems involving control structures.
13.2 Conditional Constructs
Conditional constructs allow a programmer to select an action based on some condition. This is a very common programming construct and is supported by every useful programming language. C provides two types of basic conditional constructs: if and if-else.
13.2.1 The if Statement
The if statement is quite simple. It performs an action if a condition is true. The action is a C statement, and it is executed only if the condition, which is a C expression, evaluates to a nonzero (logically true) value. Let's take a look at an example.
   if (x <= 10)
      y = x * x + 5;
The statement y = x * x + 5; is only executed if the expression x <= 10 is nonzero. Recall from our discussion of the <= operator (the less than or equal to operator) that it evaluates to 1 if the relationship is true, 0 otherwise.
The statement following the condition can also be a compound statement, or block, which is a sequence of statements beginning with an open brace and ending with a closing brace. Compound statements are used to group one or more simple statements into a single entity. This entity is itself equivalent to a simple statement. Using compound statements with an if statement, we can conditionally execute several statements on a single condition. For example, in the following code, both y and z will be modified if x is less than or equal to 10.
   if (x <= 10) {
      y = x * x + 5;
      z = (2 * y) / 3;
   }
As with all statements in C, the format of the if statement is flexible. The line breaks and indentation used in the preceding example are features of a popular style for formatting an if statement. It allows someone reading the code to quickly identify the portion that executes if the condition is true. Keep in mind that the format does not affect the behavior of the program. Even though the following code is indented like the previous code, it behaves differently. The second statement z = (2 * y) / 3; is not associated with the if and will execute regardless of the condition.
   if (x <= 10)
      y = x * x + 5;
      z = (2 * y) / 3;
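The effect is easy to observe by feeding both versions a value of x greater than 10. In this sketch the function names are hypothetical, and the variables are passed as parameters so that each version starts from the same values:

```c
/* Without braces: only the assignment to y is governed by the if.
   The assignment to z is indented here the way the compiler sees it. */
int zAfterUnbraced(int x, int y, int z)
{
    if (x <= 10)
        y = x * x + 5;
    z = (2 * y) / 3;        /* always executes */
    return z;
}

/* With braces: both assignments are governed by the if. */
int zAfterBraced(int x, int y, int z)
{
    if (x <= 10) {
        y = x * x + 5;
        z = (2 * y) / 3;    /* executes only when x <= 10 */
    }
    return z;
}
```

With x = 20, y = 6, and z = 100, the unbraced version overwrites z with 4, while the braced version leaves z at 100. With x = 2, both versions set y to 9 and z to 6.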
[Flow diagram: Condition, with branch T leading to Action and branch F bypassing it]
Figure 13.1 The C if statement, pictorially represented
Figure 13.1 shows the control flow of an if statement. The diagram corresponds to the following code:
   if (condition)
      action;
Syntactically, the condition must be surrounded by parentheses in order to enable the compiler to unambiguously separate the condition from the rest of the if statement. The action must be a simple or compound statement.
Here are more examples of if statements demonstrating programming situations where this decision construct might be useful.
   if (temperature <= 0)
      printf("At or below freezing point.\n");

   if ('a' <= key && key <= 'z')
      numLowerCase++;

   if (current > currentLimit)
      blownFuse = 1;

   if (loadMAR & clock)
      registerMAR = bus;

   if (month == 4 || month == 6 || month == 9 || month == 11)
      printf("The month has 30 days\n");

   if (x = 2)   /* This condition is always true.      */
      y = 5;    /* The variable y will always be 5     */
The last example in the preceding code illustrates a very common mistake made when programming in C. (Sometimes even expert C programmers make this mistake. Good C compilers will warn you if they detect such code.) The condition uses the assignment operator = rather than the equality operator ==, which causes the value of x to change to 2. This condition is always true: expressions containing the assignment operator evaluate to the value being assigned (in this case, 2). Since the condition is always nonzero, y will always get assigned the value 5 and x will always be assigned 2.
Even though the two look similar at first glance, the following code is a "repaired" version of the previous code.
   if (x == 2)
      y = 5;
Let's look at the LC-3 code that is generated for this code, assuming that x and y are integers that are locally declared. This means that R5 will point to the variable x and R5 - 1 will point to y.
           LDR  R0, R5, #0     ; load x into R0
           ADD  R0, R0, #-2    ; subtract 2 from x
           BRnp NOT_TRUE       ; If condition is not true,
                               ; then skip the assignment
           AND  R0, R0, #0     ; R0 <- 0
           ADD  R0, R0, #5     ; R0 <- 5
           STR  R0, R5, #-1    ; y = 5;
NOT_TRUE   :                   ; the rest of the program
Notice that it is most straightforward for the LC-3 C compiler to generate code that tests for the opposite of the original condition (x not equal to 2) and to branch based on its outcome.
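The same control flow can be written in C itself using goto, which makes the inverted test explicit. This is a sketch for illustration only (the function names are ours, and r0 merely stands in for the register); it is not how one would normally write C.

```c
/* Mirrors the LC-3 sequence: test the opposite condition and branch
   around the assignment when the original condition is false. */
int ifViaBranch(int x, int y)
{
    int r0;

    r0 = x;               /* LDR  R0, R5, #0  */
    r0 = r0 - 2;          /* ADD  R0, R0, #-2 */
    if (r0 != 0)          /* BRnp NOT_TRUE    */
        goto NOT_TRUE;
    r0 = 0;               /* AND  R0, R0, #0  */
    r0 = r0 + 5;          /* ADD  R0, R0, #5  */
    y = r0;               /* STR  R0, R5, #-1 */
NOT_TRUE:
    return y;             /* the rest of the program */
}

/* The original if statement, for comparison. */
int ifDirect(int x, int y)
{
    if (x == 2)
        y = 5;
    return y;
}
```

Both functions return 5 when x is 2 and return y unchanged otherwise. The subtraction followed by a test for nonzero plays the role of the comparison x == 2.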
The if statement is itself a statement. Therefore, it is legal to nest an if statement as demonstrated in the following C code. Since the statement following the first if is a simple statement (i.e., composed of only one statement), no braces are required.
   if (x == 3)
      if (y != 6) {
         z = z + 1;
         w = w + 2;
      }
The inner if statement only executes if x is equal to 3. There is an easier way to express this code. Can you do it with only one if statement? The following code demonstrates how.
   if ((x == 3) && (y != 6)) {
      z = z + 1;
      w = w + 2;
   }
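One way to convince ourselves that the two forms agree is to check when the body runs in each. The sketch below uses hypothetical function names that report 1 if the body would execute and 0 otherwise. Note that && evaluates its second operand only when the first is nonzero, which matches the nesting.

```c
/* Nested form: the inner test is reached only when x equals 3. */
int bodyRunsNested(int x, int y)
{
    int ran = 0;

    if (x == 3)
        if (y != 6) {
            ran = 1;
        }
    return ran;
}

/* Combined form: one condition built with the logical AND operator. */
int bodyRunsCombined(int x, int y)
{
    int ran = 0;

    if ((x == 3) && (y != 6)) {
        ran = 1;
    }
    return ran;
}
```

The body runs only for x equal to 3 together with y different from 6; for every other combination both functions report 0.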
13.2.2 The if-else Statement
If we wanted to perform one set of actions if a condition were true and another set if the same condition were false, we could use the following sequence of if statements:
   if (temperature <= 0)
      printf("At or below freezing point.\n");
   if (temperature > 0)
      printf("Above freezing.\n");
Here, a single message is printed depending on whether the variable temperature is below or equal to zero or if it is above zero. It turns out that this type of conditional execution is a very useful construct in programming. Since expressing code in the preceding way can be a bit cumbersome, C provides a more convenient construct: the if-else statement.
The following code is equivalent to the previous code segment.
   if (temperature <= 0)
      printf("At or below freezing point.\n");
   else
      printf("Above freezing.\n");
Here, the statement appearing immediately after the else keyword executes only if the condition is false.
The flow diagram for the if-else is shown in Figure 13.2. The figure corresponds to the following code:
    if (condition)
        action_if;
    else
        action_else;
The lines action_if and action_else can correspond to compound statements and thus consist of multiple statements, as in the following example.
    if (x) {
        y++;
        z--;
    }
    else {
        y--;
        z++;
    }

Figure 13.2 The C if-else statement, pictorially represented [flow diagram: Condition; if T, Action_if; if F, Action_else]
If the variable x is nonzero, the if's condition is true, y is incremented, and z decremented. Otherwise, y is decremented and z incremented. The LC-3 code generated by the LC-3 C compiler is listed in Figure 13.3. The three variables x, y, and z are locally declared integers.
We can connect conditional constructs together to form a longer sequence of conditional tests. The example in Figure 13.4 shows a complex decision structure created using the if and if-else statements. No other control structures are used. This program gets a number of a month from the user and displays the number of days in that month.
At this point, we need to mention a C syntax rule for associating ifs with elses: An else is associated with the closest unassociated if. The following example points out why this is important.
    if (x != 10)
        if (y > 3)
            z = z / 2;
        else
            z = z * 2;
            LDR R0, R5, #0      ; load the value of x
            BRz ELSE            ; if x equals 0, perform else part

            LDR R0, R5, #-1     ; load y into R0
            ADD R0, R0, #1
            STR R0, R5, #-1     ; y++;

            LDR R0, R5, #-2     ; load z into R0
            ADD R0, R0, #-1
            STR R0, R5, #-2     ; z--;
            BR DONE

    ELSE:   LDR R0, R5, #-1     ; load y into R0
            ADD R0, R0, #-1
            STR R0, R5, #-1     ; y--;

            LDR R0, R5, #-2     ; load z into R0
            ADD R0, R0, #1
            STR R0, R5, #-2     ; z++;
    DONE:

Figure 13.3 The LC-3 code generated for an if-else statement
    #include <stdio.h>

    int main()
    {
        int month;

        printf("Enter the number of the month: ");
        scanf("%d", &month);

        if (month == 4 || month == 6 || month == 9 || month == 11)
            printf("The month has 30 days\n");
        else if (month == 1 || month == 3 || month == 5 ||
                 month == 7 || month == 8 || month == 10 || month == 12)
            printf("The month has 31 days\n");
        else if (month == 2)
            printf("The month has either 28 days or 29 days\n");
        else
            printf("Don't know that month\n");
    }

Figure 13.4 A program that determines the number of days in a month
Without this rule, it would not be clear whether the else should be paired with the outer if or the inner if. For this situation, the rule states that the else is coupled with the inner if because it is closer than the outer if and the inner if statement has not already been coupled to another else (i.e., it is unassociated). The code is equivalent to the following:
    if (x != 10) {
        if (y > 3)
            z = z / 2;
        else
            z = z * 2;
    }
Just as parentheses can be used to modify the order of evaluation of expressions, braces can be used to associate statements. If we wanted to associate the else with the outer if, we could write the code as
    if (x != 10) {
        if (y > 3)
            z = z / 2;
    }
    else
        z = z * 2;
Before we leave the if-else statement for bigger things, we present a very common use for the if-else construct. The if-else statement is very handy for checking for bad situations during program execution. We can use it for error checking, as shown in Figure 13.5. This example performs a simple division based on two numbers scanned from the keyboard. Because division by 0 is undefined, if the user enters a 0 divisor, a message is displayed indicating the result cannot be generated. The if-else statement serves nicely for this purpose.
Notice that the nonerror case appears in the if-else statement first and the error case second. Although we could have coded this either way, having the common, nonerror case first provides a visual cue to someone reading the code that the error case is the uncommon one.
    #include <stdio.h>

    int main()
    {
        int dividend;
        int divisor;
        int result;

        printf("Enter the dividend: ");
        scanf("%d", &dividend);

        printf("Enter the divisor: ");
        scanf("%d", &divisor);

        if (divisor != 0) {
            result = dividend / divisor;
            printf("The result of the division is %d\n", result);
        }
        else
            printf("A divisor of zero is not allowed\n");
    }

Figure 13.5 A program that has error-checking code

13.3 Iteration Constructs
Being able to iterate, or repeat, a computation is part of the power of computing. Almost all useful programs perform some form of iteration. In C, there are three iteration constructs, each a slight variant of the others: the while statement, the for statement, and the do-while statement.
13.3.1 The while Statement
We begin by describing C's simplest iteration statement: the while. A while loop executes a statement repeatedly while a condition is true. Before each iteration of the statement, the condition is checked. If the condition evaluates to a logical true (nonzero) value, the statement is executed again.
In the following example program, the loop keeps iterating while the value of variable x is less than 10. It produces the following output:
    0 1 2 3 4 5 6 7 8 9

    #include <stdio.h>

    int main()
    {
        int x = 0;

        while (x < 10) {
            printf("%d ", x);
            x = x + 1;
        }
    }
The while statement can be broken down into two components.
    while (test)
        loop_body;
The test condition is an expression used to determine whether or not to continue executing the loop. It is tested before each execution of the loop_body. The loop_body is a statement that expresses the work to be done within the loop. Like all statements, it can be a compound statement.

Figure 13.6 The C while statement, pictorially represented [flow diagram: Test; if T, execute Loop body and return to Test; if F, exit]
Figure 13.6 shows the control flow using the notation of systematic decomposition. Two branches are required: one conditional branch to exit the loop and one unconditional branch to loop back to the test to determine whether or not to execute another iteration.
The LC-3 code generated by the compiler for the while example that counts from 0 to 9 is listed in Figure 13.7.
            AND R0, R0, #0      ; clear out R0
            STR R0, R5, #0      ; x = 0;

            ; while (x < 10)
    LOOP:   LDR R0, R5, #0      ; perform the test
            ADD R0, R0, #-10
            BRpz DONE           ; x is not less than 10

            ; loop body
            LDR R0, R5, #0      ; R0 <- x
            ADD R0, R0, #1      ; x + 1
            STR R0, R5, #0      ; x = x + 1;
            BR LOOP             ; another iteration
    DONE:

Figure 13.7 The LC-3 code generated for a while loop that counts to 9

The while statement is useful for coding loops where the iteration process involves testing for a sentinel condition. That is, we don't know the number of iterations beforehand but we wish to keep looping until some event (i.e., the sentinel) occurs. For example, when we wrote the character counting program in Chapters 5 and 7, we created a loop that terminated when the sentinel EOT character (a character with ASCII code 4) was detected. If we were coding that program in C rather than LC-3 assembly language, we would use a while loop. The program in Figure 13.8 uses the while statement to test for a sentinel condition. Can you determine what this program does without executing it?¹
    #include <stdio.h>

    int main()
    {
        char echo = 'A';   /* Initialize char variable echo */

        while (echo != '\n') {
            scanf("%c", &echo);
            printf("%c", echo);
        }
    }

Figure 13.8 Another program with a simple while loop
We end our discussion of the while statement by pointing out a common mistake when using while loops. The following program will never terminate because the loop body does not change the looping condition. In this case, the condition always remains true and the loop never terminates. Such loops are called infinite loops, and most of the time they occur because of programming errors.
    #include <stdio.h>

    int main()
    {
        int x = 0;

        while (x < 10)
            printf("%d ", x);
    }
13.3.2 The for Statement
Just as the while loop is a perfect match for a sentinel-controlled loop, the C for loop is a perfect match for a counter-controlled loop. In fact, the for loop is a special case of the while loop that happens to work well when the number of iterations is known ahead of time.
1 This program behaves a bit differently than you might expect. You might expect it to print out each input character as the user types it in. Because of the way C deals with keyboard I/O, the program does not get any input until the user hits the Enter key. We explain why this is so when dealing with the low-level issues surrounding I/O in Chapter 18.
In its most straightforward form, the for statement allows us to repeat a statement a specified number of times. For example,
    #include <stdio.h>

    int main()
    {
        int x;

        for (x = 0; x < 10; x++)
            printf("%d ", x);
    }
will produce the following output. It loops exactly 10 times.
    0 1 2 3 4 5 6 7 8 9
The syntax for the C for statement may look a little perplexing at first. The for statement is composed of four components, broken down as follows:
    for (init; test; reinit)
        loop_body;
The three components within the parentheses, init, test, and reinit, control the behavior of the loop and must be separated by semicolons. The final component, loop_body, specifies the actual computation to be executed in each iteration.
Let's take a look at each component of the for loop in detail. The init component is an expression that is evaluated before the first iteration. It is typically used to initialize variables in preparation for executing the loop.
The test is an expression that gets evaluated before every iteration to determine if another iteration should be executed. If the test expression evaluates to zero, the for terminates and the control flow passes to the statement immediately following the for. If the expression is nonzero, another iteration of the loop_body is performed. Therefore, in the previous code example, the test expression x < 10 causes the loop to keep repeating as long as x is less than 10.
The reinit component is an expression that is evaluated at the end of every iteration. It is used to prepare (or reinitialize) for the next iteration. In the previous code example, the variable x is incremented at the end of each iteration, before the test is evaluated again.
The loop_body is a statement that defines the work to be performed in each iteration. It can be a compound statement.
Figure 13.9 shows the flow diagram of the for statement. There are four blocks, one for each of the four components of the for statement. There is a conditional branch that determines whether to exit the loop based on the outcome of the test expression or to proceed with another iteration. An unconditional branch loops back to the test at the end of each iteration, after the reinit expression is evaluated.

Figure 13.9 The C for statement [flow diagram: init; Test; if T, Loop body, then reinit, and return to Test; if F, exit]
Even though the syntax of a for statement allows it to be very flexible, most of the for loops you will encounter (or will write) will be of the counter-controlled variety, that is, loops that iterate for a certain number of iterations. Following are some examples of code that demonstrate the counter-controlled nature of for loops.
    /* --- What does the loop output? --- */
    for (x = 0; x <= 10; x++)
        printf("%d ", x);

    /* --- What does this one output? --- */
    letter = 'a';
    for (c = 0; c < 26; c++)
        printf("%c ", letter + c);

    /* --- What does this loop do? --- */
    numberOfOnes = 0;
    for (bitNum = 0; bitNum < 16; bitNum++) {
        if (inputValue & (1 << bitNum))
            numberOfOnes++;
    }
            AND R0, R0, #0      ; clear out R0
            STR R0, R5, #-1     ; sum = 0;

            ; init
            AND R0, R0, #0      ; clear out R0
            STR R0, R5, #0      ; init (x = 0)

            ; test
    LOOP:   LDR R0, R5, #0      ; perform the test
            ADD R0, R0, #-10
            BRpz DONE           ; x is not less than 10

            ; loop body
            LDR R0, R5, #0      ; get x
            LDR R1, R5, #-1     ; get sum
            ADD R1, R1, R0      ; sum + x
            STR R1, R5, #-1     ; sum = sum + x;

            ; reinit
            LDR R0, R5, #0      ; get x
            ADD R0, R0, #1
            STR R0, R5, #0      ; x++
            BR LOOP
    DONE:

Figure 13.10 The LC-3 code generated for a for statement
Let's take a look at the LC-3 translation of a simple for loop. The program is a simple one: it calculates the sum of all integers between 0 and 9.
    #include <stdio.h>

    int main()
    {
        int x;
        int sum = 0;

        for (x = 0; x < 10; x++)
            sum = sum + x;
    }
The LC-3 code generated by the compiler is shown in Figure 13.10.
The following code contains a mistake commonly made when using for loops.
    sum = 0;
    for (x = 0; x < 10; x++);
        sum = sum + x;

    printf("sum = %d\n", sum);
    printf("x = %d\n", x);
What is output by the first printf? The answer is sum = 10. Why? The second printf outputs x = 10. Why? If you look carefully, you might be able to notice a misplaced semicolon.
A for loop can be constructed using a while loop (actually, vice versa as well). In programming, they can be used interchangeably, to a degree. Which construct to use in which situation may seem puzzling at first, but keep in mind the general rule that while is best suited for loops that involve sentinel conditions, whereas for fits situations where the number of iterations is known beforehand.
Nested Loops
Figure 13.11 contains an example of a for where the loop body is composed of another for loop. This construct is referred to as a nested loop because the inner loop is nested within the outer. In this example, the program prints out a multiplication table for the numbers 0 through 9. Each iteration of the inner loop prints out a single product in the table. That is, the inner loop iterates 10 times for each iteration of the outer loop. An entire row is printed for each iteration of the outer loop. Notice that the printf function call contains a special character sequence in its format string. The \t sequence causes a tab character to be printed out. The tab helps align the columns of the multiplication table so the output looks neater.
    #include <stdio.h>

    int main()
    {
        int multiplicand;    /* First operand of each multiply  */
        int multiplier;      /* Second operand of each multiply */

        /* Outer Loop */
        for (multiplicand = 0; multiplicand < 10; multiplicand++) {
            /* Inner Loop */
            for (multiplier = 0; multiplier < 10; multiplier++) {
                printf("%d\t", multiplier * multiplicand);
            }
            printf("\n");
        }
    }

Figure 13.11 A program that prints out a multiplication table
    #include <stdio.h>

    int main()
    {
        int sum = 0;    /* Initial the result variable */
        int input;      /* Holds user input            */
        int inner;      /* Iteration variables         */
        int outer;

        /* Get input */
        printf("Input an integer: ");
        scanf("%d", &input);

        /* Perform calculation */
        for (outer = 1; outer <= input; outer++)
            for (inner = 0; inner < outer; inner++)
                sum += inner;

        /* Output result */
        printf("The result is %d\n", sum);
    }

Figure 13.12 A program with a nested for loop
Figure 13.12 contains a slightly more complex example. The number of iterations of the inner loop depends on the value of outer as determined by the outer loop. Because inner starts at 0 and outer starts at 1, the inner loop will first execute 1 time, then 2 times, then 3 times, etc. For a challenging exercise based on this example, see Exercise 13.6 at the end of this chapter.
13.3.3 The do-while Statement
With a while loop, the condition is always evaluated before an iteration is performed. Therefore, it is possible for the while loop to execute zero iterations (i.e., when the condition is false from the start). There is a slight variant of the while statement in C called do-while, which always performs at least one iteration. In a do-while loop, the condition is evaluated after each iteration is performed, including the first. The operation of the do-while is demonstrated in the following example:
    x = 0;
    do {
        printf("%d \n", x);
        x = x + 1;
    } while (x < 10);
Here, the conditional test, x < 10, is evaluated at the end of each iteration. Thus, the loop body will execute at least once. The next iteration is performed only if the test evaluates to a nonzero value. This code prints the integers 0 through 9, one per line.
Syntactically, a do-while is composed of two components, exactly like the while.
    do
        loop_body;
    while (test);
The loop_body component is a statement (simple or compound) that describes the computation to be performed by the loop. The test is an expression that determines whether another iteration is to be performed.
Figure 13.13 shows the control flow of the do-while loop. Notice the slight change from the flow of a while loop. The loop body and the test are interchanged. A conditional branch loops back to the top of the loop body, initiating another iteration.
At this point, the differences between the three types of C iteration constructs may seem very subtle, but once you become comfortable with them and build up experience using these constructs, you will more easily be able to pick the right construct to fit the situation. To a large degree, these constructs can be used interchangeably. Stylistically, there are times when one construct makes more sense to use than another--often the type of loop you choose will convey information about the intent of the loop to someone reading your code.
Figure 13.13 The C do-while statement [flow diagram: Loop body, then Test; if T, return to the top of Loop body; if F, exit]

13.4 Problem Solving Using Control Structures
Armed with a new arsenal of control structures, we can attempt to solve more complex programming problems. In this section, we will apply our top-down problem-solving methodology to three problems requiring the use of C control structures.
Being effective at solving programming problems requires that you understand the basic primitives of the system on which you are programming. You will need to invoke them at the appropriate times to solve various programming puzzles. At this point, our list of C primitives includes variables of the three basic types, operators, two decision structures, and three control structures.
13.4.1 Problem 1: Approximating the Value of π
For the first programming problem, we will calculate the value of π using the following series expansion:
$$\pi = 4 - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} + \cdots + (-1)^{n-1}\frac{4}{2n-1} + \cdots$$
The problem is to evaluate this series for the number of terms indicated by the user. If the user enters 3, the program will evaluate 4 - 4/3 + 4/5. The series is an infinite series, and the more terms we evaluate, the more accurate our approximation of π.
As we did for the problem-solving example in Chapter 12, we first invoke step 0: we select a representation for the data involved in the computation. Since the series deals with fractional numbers, we use the double floating point type for any variables directly involved in the series calculation. Given the nature of the computation, this seems clearly to be the best choice.
Now we invoke stepwise refinement to decompose a roughly stated algorithm into a C program. Roughly, we want the program to initialize all data that requires initialization. Then ask the user to input the number of terms of the series to evaluate. Then evaluate the series for the given number of terms. Finally, print out the result. We have defined the problem as a set of sequential constructs. Figure 13.14 shows the decomposition thus far.
Most of the sequential constructs in Figure 13.14 are very straightforward. Converting them into C code should be quite simple. One of the constructs in the figure, however, requires some additional refinement. We need to put a little thought into the subtask labeled Evaluate series. For this subtask, we essentially want to iterate through the series, term by term, until we evaluate exactly the number of terms indicated by the user. We want to use a counter-controlled iteration construct. Figure 13.15 shows the decomposition. We maintain a counter for the current loop iteration. If the counter is less than the limit indicated by the user, then we evaluate another term. Notice that the refined version of the subtask looks like the flow diagram for a for loop.
We are almost done. The only nontrivial subtask remaining is Evaluate another term. Notice that all even terms in the series are subtracted, and all odd terms are added. Within this subtask, we need to determine if the particular term we are evaluating is an odd or an even term, and then accordingly factor it into the current value of the approximation. This involves using a decision construct as shown in Figure 13.16. The complete code resulting from this stepwise refinement is shown in Figure 13.17.
Figure 13.14 The initial decomposition of a program that evaluates the series expansion for π for a given number of terms [flow diagrams. Step 1: Start, Calculate pi using series expansion, Stop. Step 2: Start, Initialize, Get input, Evaluate series, Output result, Stop]

Figure 13.15 The refinement of the subtask Evaluate series into an iteration construct that iterates a given number of times. Within this loop, we evaluate terms for a series expansion for π [flow diagram: Initialize iteration count; test count < terms; if T, Evaluate another term, count = count + 1, and return to the test; if F, exit]

Figure 13.16 Incorporate the current term based on whether it is odd or even [flow diagram: Evaluate another term is refined into a decision that either adds the new term or subtracts the new term]

    #include <stdio.h>

    int main()
    {
        int count;          /* Iteration variable           */
        int numOfTerms;     /* Number of terms to evaluate  */
        double pi = 0;      /* approximation of pi          */

        printf("Number of terms (must be 1 or larger) : ");
        scanf("%d", &numOfTerms);

        for (count = 1; count <= numOfTerms; count++) {
            if (count % 2)
                pi = pi + (4.0 / (2.0 * count - 1));   /* Odd term  */
            else
                pi = pi - (4.0 / (2.0 * count - 1));   /* Even term */
        }

        printf("The approximate value of pi is %f\n", pi);
    }

Figure 13.17 A program to calculate π

13.4.2 Problem 2: Finding Prime Numbers Less than 100
Our next problem-solving example involves finding all the prime numbers that are less than 100. Recall that a number is prime only if the only numbers that evenly divide it are 1 and itself.
Figure 13.18 Decomposing a problem to compute prime numbers less than 100. The first three steps involve creating a loop that iterates between 2 and 100 [flow diagrams. Step 1: Start, Display all prime numbers less than 100, Stop. Step 2: Start, Initialize, Calculation, Stop. Step 3: Start, Initialize Num = 2; test Num <= 100; if T, CalcPrime, Num = Num + 1, and return to the test; if F, Stop]
Step 0, as with our previous examples, is to select an appropriate data representation for the various data associated with the problem. Since the property of prime numbers only applies to integers, using the integer data type for the main computation seems a good choice.
Next we apply stepwise refinement to the problem to reduce it into a C program. We can approach this problem by first stating it as a single task (step 1). We then refine this single task into two separate sequential subtasks: Initialize and then perform the calculation (step 2).
Performing the Calculation subtask is the brunt of the programming effort. Essentially, the Calculation subtask can be stated as follows: We want to check every integer between 2 and 100 to determine if it is prime. If it is prime, we want to print it out. A counter-controlled loop should work just fine for this purpose. We can further refine the Calculation subtask into smaller subtasks, as shown in Figure 13.18. Notice that the flow diagram has the shape of a for loop.
Already, the problem is starting to resolve into C code. We still need to refine the CalcPrime subtask. In this subtask, we need to determine if the current number is prime or not. Here, we rely on the fact that any number between 2 and 100 that is not prime will have at least one divisor between 2 and 10 that is not itself. We can refine this subtask as shown in Figure 13.19. Basically, we will determine if each number is divisible by an integer between 2 and 10 (being careful to exclude the number itself). If it has no divisors between 2 and 10, except perhaps itself, then the number is prime.

Figure 13.19 Decomposing the CalcPrime subtask [flow diagram: CalcPrime is refined into Divide Num by integers 2 thru 10, followed by the test No divisors?; if T, Num is prime, print it out]
Finally, we need to refine the Divide number by integers 2 through 10 subtask. It involves dividing the current number by all integers between 2 and 10 and determining if any of them evenly divide it. A simple way to do this is to use another counter-controlled loop to cycle through all the integers between 2 and 10. Figure 13.20 shows the decomposition using the iteration construct.
Now, coding this problem into a C program is a small step forward. The program is listed in Figure 13.21. There are two for loops within the program, one of which is nested within the other. The outer loop sequences through all the integers between 2 and 100; it corresponds to the loop created when we decomposed the Calculation subtask. An inner loop determines if the number generated by the outer loop has any divisors; it corresponds to the loop created when we decomposed the Divide number by integers 2 through 10 subtask.
Figure 13.20 Decomposing the Divide numbers by integers 2 through 10 subtask [flow diagram: Initialize divisor = 2; test Divisor <= 10; if T, Calc Num / Div, Divisor = Divisor + 1, and return to the test; if F, continue to the No divisors? test]

    #include <stdio.h>
    #define FALSE 0
    #define TRUE  1

    int main()
    {
        int num;
        int divisor;
        int prime;

        /* Start at 2 and go until 100 */
        for (num = 2; num <= 100; num++) {
            prime = TRUE;    /* Assume the number is prime */

            /* Test if the candidate number is a prime */
            for (divisor = 2; divisor <= 10; divisor++)
                if (((num % divisor) == 0) && num != divisor)
                    prime = FALSE;

            if (prime)
                printf("The number %d is prime\n", num);
        }
    }

Figure 13.21 A program that finds all prime numbers between 2 and 100
One item of note: If a divisor between 2 and 10 is found, then a flag variable called prime is set to false. It is set to true before the inner loop begins. If it remains true, then the number generated by the outer loop has no divisors and is therefore prime. To do this, we are utilizing the C preprocessor's macro substitution facility. We have defined, using #define, two symbolic names, FALSE, which maps to the value 0, and TRUE, which maps to 1. The preprocessor will simply replace each occurrence of the word TRUE in the source file with 1 and each occurrence of FALSE with 0.
13.4.3 Problem 3: Analyzing an E-mail Address
Our final problem in this section involves analyzing an e-mail address typed in at the keyboard to determine if it is of valid format. For this problem, we'll use a simple definition of validity: an e-mail address is a sequence of characters that must contain an at sign, "@", and a period, ".", with the at sign preceding the period.
As before, we start by choosing an appropriate data representation for the underlying data of the problem. Here, we are processing text data entered by the user. The type best suited for text is the ASCII character type, char. Actually, the best representation for input text is an array of characters, or character string, but as we have not yet introduced arrays into our lexicon of primitive elements (and we will in Chapter 16), we instead target our solution to use a single variable of the char type.
Next, we apply stepwise refinement. The entire process is diagrammed in Figure 13.22. We start with a rough flow of the program where we have two tasks (step 1): Process input and Output results. Here, the Output results task is straightforward. We will output either that the input text is a valid e-mail address or that it is invalid. The Process input task requires more refinement.

Figure 13.22 A stepwise refinement of the analyze e-mail address program [flow diagrams. Step 1: Start, Process input, Output result, Stop. Step 2: Process next char, then the test More?; if Y, repeat; if N, Output results, Stop. Step 3: Process next char is refined into Get next char, Check for At, Check for Dot after At]
In decomposing the Process input task (step 2), we need to keep in mind that our choice of data representation (variable of the char type) implies that we will need to read and process the user's input one character at a time. We will keep processing, character by character, until we have reached the end of the e-mail address, implying that we select some form of sentinel-controlled loop. Step 2 of the decomposition divides the Process input task into a sentinel-controlled iteration construct that terminates when the end of an e-mail address is encountered, which we'll say is either a space or a newline character, \n.
The next step (step 3) of the decomposition involves detailing what processing occurs within the loop. Here, we need to check each character within the e-mail address and remember if we have seen an at sign or a period in the proper order. To do this, we will use two variables to record this status. When the loop terminates and we are ready to display the result, we can examine these variables to display the appropriate output message.
    #include <stdio.h>
    #define FALSE 0
    #define TRUE  1

    int main()
    {
        char nextChar;        /* Next character in e-mail address */
        int gotAt = FALSE;    /* Indicates if At @ was found      */
        int gotDot = FALSE;   /* Indicates if Dot . was found     */

        printf("Enter your e-mail address: ");

        do {
            scanf("%c", &nextChar);

            if (nextChar == '@')
                gotAt = TRUE;

            if (nextChar == '.' && gotAt == TRUE)
                gotDot = TRUE;
        }
        while (nextChar != ' ' && nextChar != '\n');

        if (gotAt == TRUE && gotDot == TRUE)
            printf("Your e-mail address appears to be valid.\n");
        else
            printf("Your e-mail address is not valid!\n");
    }

Figure 13.23 A C program to determine if an e-mail address is valid
At this point, we are not far from C code. Notice that the loop structure is very similar to the flow diagram of the do-while statement. The C code for this problem is provided in Figure 13.23.
13.5 Additional C Control Structures
We complete our coverage of the C control structures by examining the switch, break, and continue statements. These three statements provide specialized program control that programmers occasionally find useful for very particular programming situations. We provide them here primarily for completeness; none of the examples in the remainder of the textbook use any of these three constructs.
13.5.1 The switch Statement
Occasionally, we run into programming situations where we want to perform a series of tests on a single value. For example, in the following code, we test the character variable keyPress to see if it equals a series of particular characters.
    char keyPress;

    if (keyPress == 'a')
        /* statement A */
    else if (keyPress == 'b')
        /* statement B */
    else if (keyPress == 'x')
        /* statement C */
    else if (keyPress == 'y')
        /* statement D */
In this code, one (or none) of the statements labeled A, B, C, or D will execute, depending on the value of the variable keyPress. If keyPress is equal to the character a, then statement A is performed, if it is equal to the character b, then statement B is performed, and so forth. If keyPress does not equal a or b or x or y, then none of the statements are executed.
If there are many of these conditions to check, then many tests will be required in order to find the “matching” one. In order to give the compiler an opportunity to better optimize this code by bypassing some of this testing, C provides the switch statement. The following code segment behaves the same as the code in the previous example. It uses a switch statement instead of cascaded if-else statements.
char keyPress;
switch (keyPress) {
case 'a':
  /* statement A */
  break;
case 'b':
  /* statement B */
  break;
case 'x':
  /* statement C */
  break;
case 'y':
  /* statement D */
  break;
}
Notice that the switch statement contains several lines beginning with the keyword case, followed by a label. The program evaluates keyPress first. Then it determines which of the following case labels matches the value of keyPress. If any label matches, then the statements following it are executed.
Let's go through the switch construct piece by piece. The switch keyword precedes the expression on which to base the decision. This expression must be of integral type (for example, an int or a char). If one of the case labels matches the value of the expression, then program control passes to the statement or block associated with (usually, immediately below) that case label. Each case consists of a sequence of zero or more statements similar to a compound statement, but no delimiting braces are required. The place within this compound statement to start executing is determined by which case matches the value of the switch expression. Each case label within a switch statement must be unique; identical labels are not allowed.
Furthermore, each case label must be a constant expression. It cannot be based on a value that changes as the program is executing. The following is not a legal case label (assuming i is a variable):
case i:
In the preceding switch example, each case ends with a break statement. The break exits the switch construct and changes the flow of control directly to the statement after the closing brace of the switch. The break statements are optional. If they are not used, then control will go from the current case to the next. For example, if the break after statement C were omitted, then a match on case 'x' would cause statement C and statement D to be executed. However, in practice, cases almost always end with a break.
We can also include a default case. This case is selected if the switch expression matches none of the case constants. If no default case is given, and the expression matches none of the constants, none of the cases are executed.
A stylistic note: The last case of a switch does not need to end with a break since execution of the switch ends there, anyway. However, including a break for the final case is good programming practice. If another case is ever added to the end of the switch, then you will not have to remember to also add the break to the previous case. It is good, defensive programming.
13.5.2 The break and continue Statements
In the previous section, we saw an example of how the C break statement is used with switch. The break statement, and also the continue statement, are occasionally used with iteration constructs.
The break statement causes the compiler to generate code that will prematurely exit a loop or a switch statement. When used within a loop body, break causes the loop to terminate by causing control to jump out of the innermost loop that contains it. The continue statement, on the other hand, causes the compiler to generate code that will end the current iteration and start the next. These statements can occur within a loop body and apply to the iteration construct immediately enclosing them. Essentially, the break and continue statements cause the compiler to generate an unconditional branch instruction that leaves the loop from somewhere in the loop body. Following are two example code segments that use break and continue.
/* This code segment produces the output: 0 1 2 3 4 */
for (i = 0; i < 10; i++) {
  if (i == 5)
    break;
  printf("%d ", i);
}

/* This code produces the output: 0 1 2 3 4 6 7 8 9 */
for (i = 0; i < 10; i++) {
  if (i == 5)
    continue;
  printf("%d ", i);
}
13.5.3 An Example: Simple Calculator
The program in Figure 13.24 performs a function similar to the calculator example from Chapter 10. The user is prompted for three items: an integer operand, an operation to perform, and another integer operand. The program performs the operation on the two input values and displays the result. The program makes use of a switch to base its computation on the operator the user has selected.
#include <stdio.h>

int main()
{
  int operand1, operand2;      /* Input values             */
  int result = 0;              /* Result of the operation  */
  char operation;              /* operation to perform     */

  /* Get the input values */
  printf("Enter first operand: ");
  scanf("%d", &operand1);
  printf("Enter operation to perform (+, -, *, /): ");
  scanf("\n%c", &operation);
  printf("Enter second operand: ");
  scanf("%d", &operand2);

  /* Perform the calculation */
  switch(operation) {
  case '+':
    result = operand1 + operand2;
    break;

  case '-':
    result = operand1 - operand2;
    break;

  case '*':
    result = operand1 * operand2;
    break;

  case '/':
    if (operand2 != 0)         /* Error-checking code. */
      result = operand1 / operand2;
    else
      printf("Divide by 0 error!\n");
    break;

  default:
    printf("Invalid operation!\n");
    break;
  }

  printf("The answer is %d\n", result);
}
Figure 13.24 Calculator program in C
13.6 Summary
We conclude this chapter by summarizing the key concepts we've covered. The basic objective of this chapter was to enlarge our set of problem-solving primitives by exploring the various control structures supported by the C programming language.
• Decision Construct in C. We covered two basic C decision statements: if and if-else. Both of these statements conditionally execute a statement depending on whether a specified expression is true or false.
• Iteration Constructs in C. C provides three iteration statements: while, for, and do-while. All of these statements execute a statement possibly multiple times until a specified expression becomes false. The while and do-while statements are particularly well-suited for expressing sentinel-controlled loops. The for statement works well for expressing counter-controlled loops.
• Problem Solving Using Control Structures. To our arsenal of primitives for problem solving (which already includes the three basic C types, variables, operators, and I/O using printf and scanf), we added control constructs. We practiced some problem-solving examples that required application of these control constructs.
Exercises
13.1 Recreate the LC-3 compiler's symbol table when it compiles the calculator program listed in Figure 13.24.
13.2 a. What does the following code look like after it is processed by the preprocessor?
#define VERO -2
if (VERO)
  printf("True!");
else
  printf("False!");
b. What is the output produced when this code is run?
c. If we modified the code to the following, does the code behave differently? If so, how?
#define VERO -2
if (VERO)
  printf("True!");
else if (!VERO)
  printf("False!");
13.3 An if-else statement can be used in place of the C conditional operator (see Section 12.6.3). Rewrite the following statement using an if-else rather than the conditional operator.
x = a ? b : c;
13.4 Describe the behavior of the following statements for the case when x equals 0 and when x equals 1.
a. if (x = 0)
     printf("x equals 0\n");
   else
     printf("x does not equal 0\n");
b. if (x == 0)
     printf("x equals 0\n");
   else
     printf("x does not equal 0\n");
c. if (x == 0)
     printf("A\n");
   else if (x != 1)
     printf("B\n");
   else if (x < 1)
     printf("C\n");
   else if (x)
     printf("D\n");
d. int x;
   int y;
   switch (x) {
   case 0:
     y = 3;
   case 1:
     y = 4;
     break;
   default:
     y = 5;
     break;
   }
e. What happens if x is not equal to 0 or 1 for part d?
13.5 Provide the LC-3 code generated by our LC-3 C compiler when it compiles the switch statement in part d of Exercise 13.4.
13.6 Figure 13.12 contains a C program with a nested for loop.
a. Mathematically state the series that this program calculates.
b. Write a program to calculate the following function:
f(n) = f(n - 1) + f(n - 2)
with the following initial conditions:
f(0) = 1, f(1) = 1
13.7 Can the following if-else statement be converted into a switch? If yes, convert it. If no, why not?
if (x == 0)
  y = 3;
else if (x == 1)
  y = 4;
else if (x == 2)
  y = 5;
else if (x == y)
  y = 6;
else
  y = 7;
13.8 At least how many times will the statement called loopBody execute in the following constructs?
a. while (condition)
     loopBody;
b. do
     loopBody;
   while (condition);
c. for (init; condition; reinit)
     loopBody;
d. while (condition1)
     for (init; condition2; reinit)
       loopBody;
e. do
     do
       loopBody;
     while (condition1);
   while (condition2);
13.9 What is the output of each of the following code segments?
a. a = 2;
   while (a > 0) {
     a--;
   }
   printf("%d", a);
b. a = 2;
   do {
     a--;
   } while (a > 0);
   printf("%d", a);
c. b = 0;
   for (a = 3; a < 10; a += 2)
     b = b + 1;
   printf("%d %d", a, b);
13.10 Convert the program in Figure 13.4 into one that uses a switch statement instead of if-else.
13.11 Modify the e-mail address validation program in Figure 13.23 so that it requires that at least one alphabetic character appears prior to the at sign, one appears between the at sign and the period, and one appears after the period in order for an e-mail address to be valid.
13.12 For the following questions, x is an integer with the value 4.
a. What output is generated by the following code segment?
   if (7 > x > 2)
     printf("True.");
   else
     printf("False.");
b. Does the following code cause an infinite loop?
   while (x > 0)
     x++;
c. What is the value of x after the following code has executed?
   for (x = 4; x < 4; x--) {
     if (x < 2)
       break;
     else if (x == 2)
       continue;
     x = -1;
   }
13.13 Change this program so that it uses a do-while loop instead of a for loop.
int main()
{
  int i;
  int sum;

  for (i = 0; i <= 100; i++) {
    if (i % 4 == 0)
      sum = sum + 2;
    else if (i % 4 == 1)
      sum = sum - 6;
    else if (i % 4 == 2)
      sum = sum * 3;
    else if (i % 4 == 3)
      sum = sum / 2;
  }
  printf("%d\n", sum);
}
13.14 Write a C program that accepts as input a single integer k, then writes a pattern consisting of a single 1 on the first line, two 2s on the second line, three 3s on the third line, and so forth, until it writes k occurrences of k on the last line.
For example, if the input is 5, the output should be the following:
1
22
333
4444
55555
13.15 a. Convert the following while loop into a for loop.
   while (condition)
     loopBody;
b. Convert the following for loop into a while loop.
   for (init; condition; reinit)
     loopBody;
13.16 What is the output of the following code?
int r = 0;
int s = 0;
int w = 12;
int sum = 0;

for (r = 1; r <= w; r++)
  for (s = r; s <= w; s++)
    sum = sum + s;
printf("sum = %d\n", sum);
13.17 The following code performs something quite specific. Describe its output.
int i;
scanf("%d", &i);
for (j = 0; j < 16; j++)
  if (i & (1 << j)) {
    count++;
  }
printf("%d\n", count);
13.18 Provide the output of each of the following code segments.
a. int x = 20;
   int y = 10;
   while ((x > 10) && (y & 15)) {
     y = y + 1;
     x = x - 1;
     printf("*");
   }
b. int x;
   for (x = 10; x; x = x - 1)
     printf("*");
c. int x;
   for (x = 0; x < 10; x = x + 1) {
     if (x % 2)
       printf("*");
   }
d. int x = 0;
   int i;
   while (x < 10) {
     for (i = 0; i < x; i = x + 1)
       printf("*");
     x = x + 1;
   }
chapter 14
Functions
14.1 Introduction
Functions are subprograms, and subprograms are the soul of modern programming languages. Functions provide the programmer with a way to enlarge the set of elementary building blocks with which to write programs. That is, they enable the programmer to extend the set of operations and constructs natively supported by the language to include new primitives. Functions are such an important concept that they have been part of languages since the very early days, and support for them is provided directly in all instruction set architectures, including the LC-3.
Why are they so important? Functions (or procedures, or subroutines, or methods, all of which are variations of the same theme) enable abstraction. That is, they increase our ability to separate the "function" of a component from the details of how it accomplishes that "function." Once the component is created and we understand its construction, we can use the component as a building block without giving much thought to its detailed implementation. Without abstraction, our ability to create complex systems such as computers, and the software that runs on them, would be seriously impaired.
Functions are not new to us. We have been using variants of functions ever since we programmed subroutines in LC-3 assembly language; while there are syntactic differences between subroutines in LC-3 assembly and functions in C, the concepts behind them are largely the same.
The C programming language is heavily oriented around functions. A C program is essentially a collection of functions. Every statement belongs to one (and only one) function. All C programs start and finish execution in the function main. The function main might call other functions along the way, and they might, in turn, call more functions. Control eventually returns to the function main, and when main ends, the program ends (provided something did not cause the program to terminate prematurely).
In this chapter, we provide an introduction to functions in C. We begin by examining several short programs in order to get a sense of the C syntax involving functions. Next, we examine how functions are implemented, examining the low-level operations necessary for functions to work in high-level languages. In the last part of the chapter, we apply our problem-solving methodology to some programming problems that benefit from the use of functions.
14.2 Functions in C
Let's start off with a simple example of a C program involving functions. Figure 14.1 is a program that prints a message using a function named PrintBanner. This program begins execution at the function main, which then calls the function PrintBanner. This function prints a line of text consisting of the - character to the output device.
PrintBanner is the simplest form of a function: it requires no input from its caller to do its job, and it provides its caller with no output data (not counting the banner printed to the screen). In other words, no arguments are passed from main to PrintBanner and no value is returned from PrintBanner to main. We refer to the function main as the caller and to PrintBanner as the callee.
14.2.1 A Function with a Parameter
The fact that PrintBanner and main require no exchange of information simplifies their interface. In general, however, we'd like to be able to pass some information between the caller and the callee. The next example demonstrates how this is done in C. The code in Figure 14.2 contains a function Factorial that performs an operation based on an input parameter.
#include <stdio.h>

void PrintBanner();              /* Function declaration */

int main()
{
  PrintBanner();                 /* Function call */
  printf("A simple C program.\n");
  PrintBanner();
}

void PrintBanner()               /* Function definition */
{
  printf("----------------------------\n");
}
Figure 14.1 A C program that uses a function to print a banner message
 1  #include <stdio.h>
 2
 3  int Factorial(int n);               /*! Function Declaration !*/
 4
 5  int main()                          /* Definition for main */
 6  {
 7    int number;                       /* Number from user */
 8    int answer;                       /* Answer of factorial */
 9
10    printf("Input a number: ");       /* Call to printf */
11
12    scanf("%d", &number);             /* Call to scanf */
13
14    answer = Factorial(number);       /*! Call to factorial !*/
15
16    printf("The factorial of %d is %d\n", number, answer);
17  }
18
19  int Factorial(int n)                /*! Function Definition !*/
20  {
21    int i;                            /* Iteration count */
22    int result = 1;                   /* Initialized result */
23
24    for (i = 1; i <= n; i++)          /* Calculate factorial */
25      result = result * i;
26
27    return result;                    /*! Return to caller !*/
28  }
Figure 14.2 A C program to calculate factorial
Factorial performs a multiplication of all integers between 1 and n, where n is the value provided by the caller function (in this case main). The calculation performed by this function can be algebraically stated as:
factorial(n) = n! = 1 x 2 x 3 x ... x n
The value calculated by this function is named result in the C code in Figure 14.2. Its value is returned (using the return statement) to the caller. We say that the function Factorial requires a single integer argument from its caller, and it returns an integer value back to its caller. In this particular example, the variable answer in the caller is assigned the return value from Factorial (line 14).
Let's take a closer look at the syntax involved with functions in C. In the code in Figure 14.2, there are four lines that are of particular interest to us. The declaration for Factorial is at line 3. Its definition starts at line 19. The call to Factorial is at line 14; this statement invokes the function. The return from Factorial back to its caller is at line 27.
The Declaration
In the preceding example, the function declaration for Factorial appears at line 3. What is the purpose of a function's declaration? It informs the compiler about some relevant properties of the function in the same way a variable's declaration informs the compiler about a variable. Sometimes called a function prototype, a function declaration contains the name of the function, the type of value it returns, and a list of input values it expects. The function declaration ends with a semicolon.
The first item appearing in a function's declaration is the type of the value the function returns. The type can be any C data type (e.g., int, char, double). This type describes the type of the single output value that the function produces. Not all functions return values. For example, the function PrintBanner from the previous example did not return a value. If a function does not return a value, then its return type must be declared as void, indicating to the compiler that the function returns nothing.
The next item on the declaration is the function's name. A function's name can be any legal C identifier. Often, programmers choose function names somewhat carefully to reflect the actions performed by the function. Factorial, for example, is a good choice for the function in our example because the mathematical term for the operation it performs is factorial. Also, it is good style to use a naming convention where the names of functions and the names of variables are easily distinguishable. In the examples in this book, we do this by capitalizing the first character of all function names, such as Factorial.
Finally, a function's declaration also describes the type and order of the input parameters required by the function. These are the types of values that the function expects to receive from its callers and the order in which it expects to receive them. We can optionally specify (and often do) the name of each parameter in the declaration. For example, the function Factorial takes one integer value as an input parameter, and it refers to this value internally as n. Some functions may not require any input. The function PrintBanner requires no input parameters; therefore its parameter list is empty.
The Call
Line 14 in our example is the function call that invokes Factorial. In this statement, the function main calls Factorial. Before Factorial can start, however, main must transmit a single integer value to Factorial. Such values within the caller that are transmitted to the callee are called arguments. Arguments can be any legal expression, but they should match the type expected by the callee. These arguments are enclosed in parentheses immediately after the callee's name. In this example, the function main passes the value of the variable number as the argument. The value returned by Factorial is then assigned to the integer variable answer.
The Definition
The code beginning at line 19 is the function definition for Factorial. Notice that the first line of the definition matches the function declaration (however, minus the semicolon). Within the parentheses after the name of the function is the function's formal parameter list. The formal parameter list is a list of variable declarations, where each variable will be initialized with the corresponding argument provided by the caller. In this example, when Factorial is called on line 14, the parameter n will be initialized to the value of number from main. From every place in the program where a function is called, the actual arguments appearing in each call should match the type and ordering of the formal parameter list.
The function's body appears in the braces following the parameter list. A function's body consists of declarations and statements that define the computation the function performs. Any variable declared within these braces is local to the function.
A very important concept to realize about functions in C is that none of the local variables of the caller are explicitly visible by the callee function. And in particular, Factorial cannot modify the variable number. In C, the arguments of the caller are passed as values to the callee.
The Return Value
In line 27, control passes back from Factorial to the caller main. Since Factorial is returning a value, an expression must follow the return keyword, and the type of this expression should match the return type declared for the function. In the case of Factorial, the statement return result; transmits the calculated value stored in result back to the caller. In general, functions that return a value must include at least one return statement in their body. Functions that do not return a value (functions declared as type void) do not require a return statement; the return is optional. For these functions, control passes back to the caller after the last statement has executed.
What about the function main? Its type is int (as required by the ANSI standard), yet it does not contain a return. Strictly speaking, we should include a return 0; at the end of main in the examples we've seen thus far. In C, if control reaches the end of a non-void function without executing a return statement, the value seen by the caller is undefined (C99 and later make a special case of main, which then returns 0). Since main's return value will be ignored by most callers (who are the callers of main?), we've omitted them in the text to make our examples more compact.
Let's summarize these various syntactic components: A function declaration (or prototype) informs the compiler about the function, indicating its name, the number and types of parameters the function expects from a caller, and the type of value the function returns. A function definition is the actual source code for the function. The definition includes a formal parameter list, which indicates the names of the function's parameters and the order in which they will be expected from the caller. A function is invoked via a function call. Input values, or arguments, for the function are listed within the parentheses of the function call. Literally, the value of each argument listed in the function call is assigned to the corresponding parameter in the parameter list, the first argument assigned to the first parameter, the second argument to the second parameter, and so forth. The return value is the output of the function, and it is passed back to the caller function.
14.2.2 Example: Area of a Ring
We further demonstrate C function syntax with a short example in Figure 14.3. This C program calculates the area of a circle that has a smaller circle removed from it. In other words, it calculates the area of a ring with a specified outer and inner radius. In this program, a function is used to calculate the area of a circle with a given radius. The function AreaOfCircle takes a single parameter of type double and returns a double value back to the caller.
The following point is important for us to reiterate: when function AreaOfCircle is active, it can "see" and modify its local variable pi and its parameter radius. It cannot, however, modify any of the variables within the function main, except via the value it returns.
The function AreaOfCircle in this example has a slightly different usage than the functions that we've seen in the previous examples in this chapter. Notice that there are multiple calls to AreaOfCircle from the function main. In this case, AreaOfCircle performs a useful, primitive computation such that encapsulating it into a function is beneficial. On a larger scale, real programs will include functions that are called from hundreds or thousands of different places. By forming AreaOfCircle and similar primitive operations into functions, we potentially save on the amount of code in the program, which is beneficial for code maintenance. The program also takes on a better structure. With AreaOfCircle, the intent of the code is more visibly apparent than if the formula were directly embedded in-line.
 1  #include <stdio.h>
 2
 3  /* Function declarations */
 4  double AreaOfCircle(double radius);
 5
 6  int main()
 7  {
 8    double inner;                 /* Inner radius */
 9    double outer;                 /* Outer radius */
10    double areaOfRing;            /* Area of ring */
11
12    printf("Enter inner radius: ");
13    scanf("%lf", &inner);
14
15    printf("Enter outer radius: ");
16    scanf("%lf", &outer);
17
18    areaOfRing = AreaOfCircle(outer) - AreaOfCircle(inner);
19    printf("The area of the ring is %f\n", areaOfRing);
20  }
21
22  /* Calculate area of circle given a radius */
23  double AreaOfCircle(double radius)
24  {
25    double pi = 3.14159265;
26
27    return pi * radius * radius;
28  }
Figure 14.3 A C program that calculates the area of a ring
Some of you might remember our discussion on constant values from Section 12.6.2, where we argue that the variable pi should be declared as a constant using the const qualifier on line 25 of the code. We omit it here to make the example accessible to those who might have skipped over the Additional Topics section of Chapter 12.
14.3 Implementing Functions in C
Let's now take a closer look at how functions in C are implemented at the machine level. Functions are the C equivalent of subroutines in LC-3 assembly language (which we discussed in Chapter 9), and the core of their operation is the same. In C, making a function call involves three basic steps: (1) the parameters from the caller are passed to the callee and control is transferred to the callee, (2) the callee does its task, (3) a return value is passed back to the caller, and control returns to the caller. An important constraint that we will put on the calling mechanism is that a function should be caller-independent. That is, a function should be callable from any function. In this section we will examine how this is accomplished using the LC-3 to demonstrate.
14.3.1 Run-Time Stack
Before we proceed, we first need to discuss a very important component of functions in C and other modern programming languages. We require a way to “activate” a function when it is called. That is, when a function starts executing, its local variables must be given locations in memory. Let us explain:
Each function has a memory template in which its local variables are stored. Recall from our discussion in Section 12.5.2 that an activation record for a function is a template of the relative positions of its local variables in memory. Each local variable declared in a function will have a position in the activation record. Recall that the frame pointer (R5) indicates the start of the activation record. Question: Where in memory does the activation record of a function reside? Let's consider some options.
Option 1: The compiler could systematically assign spots in memory for each function to place its activation record. Function A might be assigned memory location X to place its activation record, function B might be assigned location Y, and so forth, provided, of course, that the activation records do not overlap. While this seems like the most straightforward way to manage the allocation, a serious limitation arises with this option. What happens if function A calls itself? We call this recursion, and it is a very important programming concept that we will discuss in Chapter 17. If function A calls itself, then the callee version of function A will overwrite the local values of the caller version of function A, and the program will not behave as we expect it to. For the C programming language, which allows recursive functions, option 1 will not work.
Option 2: Every time a function is called, an activation record is allocated for it in memory. And when the function returns to the caller, its activation record is reclaimed to be assigned later to another function. While this option appears to be conceptually more difficult than option 1, it permits functions to be recursive. Each invocation of a function gets its own space in memory for its locals. For example, if function A calls function A, the callee version will be allocated its own activation record for storing local values, and this record will be different than the caller's. There is a factor that reduces the complexity of making option 2 work: The calling pattern of functions (i.e., function A calls B which calls C, etc.) can be easily tracked with a stack data structure (Chapter 10). Let us demonstrate with an example.
The code in Figure 14.4 contains three functions, main, Watt, and Volta. What each function does is not important for this example, so we've omitted some of their details but provided enough so that the calling pattern between them is apparent. The function main calls Watt and Watt calls Volta. Eventually, control returns back to main which then calls Volta.
 1  int main()
 2  {
 3    int a;
 4    int b;
 5
 6    ...
 7    b = Watt(a);               /* main calls both */
 8    b = Volta(a, b);
 9  }
10
11  int Watt(int a)
12  {
13    int w;
14
15    ...
16
17
18    w = Volta(w, 10);          /* Watt calls Volta */
19
20    return w;
21  }
22
23  int Volta(int q, int r)
24  {
25    int k;
26    int m;
27    ...                        /* Volta calls no one */
28    return k;
29  }
Figure 14.4 Code example that demonstrates the stack-like nature of function calls
[Figure 14.5: six snapshots of the run-time stack in memory (drawn from x0000 to xFFFF), each marking R6 and R5: (a) run-time stack when execution starts, (b) when Watt executes, (c) when Volta executes, (d) after Volta completes, (e) after Watt completes, (f) when Volta executes]
Figure 14.5 Several snapshots of the run-time stack while the program outlined in Figure 14.4 executes
Each function has an activation record that consists of its local variables, some bookkeeping information, and the incoming parameters from the caller (we’ll mention more about the parameters and bookkeeping information in the subsequent paragraphs). Whenever a function is called, its activation record will be allocated somewhere in memory, and as we indicated in the previous paragraph, in a stack-like fashion. This is illustrated in the diagrams of Figure 14.5.
Each of the shaded regions represents the activation record of a particular function call. The sequence of figures shows how the run-time stack grows and shrinks as the various functions are called and return to their caller. Keep in mind that, as we push items onto the stack, the top of the stack moves, or “grows,” toward lower-numbered memory locations.
Figure 14.5(a) is a picture of the run-time stack when the program starts execution. Since the execution of a C program starts in main, the activation record for main is the first to be allocated on the stack. Figure 14.5(b) shows the run-time stack immediately after Watt is called by main. Notice that the activation records are allocated in a stack-like fashion. That is, whenever a function is called, its activation record is pushed onto the stack. Whenever the function returns, its activation record is popped off the stack. Figure 14.5 parts (c) through (f) show the state of the run-time stack at various points during the execution of this code. Notice that R5 points to some internal location within the activation record (it points to the base of the local variables). Also notice how R6 always points to the very top of the stack; it is called the stack pointer. Both of these registers have a key role to play in the implementation of the run-time stack and of functions in C in general.
14.3.2 Getting It All to Work
It is clear that there is a lot of work going on at the machine level when a function is called. Parameters must be passed, activation records pushed and popped, control moved from one function to another. Some of this work is accomplished by the caller, some by the callee.
To accomplish all of this, the following steps are required: First, code in the caller function copies its arguments into a region of memory accessible by the callee. Second, the code at the beginning of the callee function pushes its activation record onto the stack and saves some bookkeeping information so that when control returns to the caller, it appears to the caller as if its local variables and registers were untouched. Third, the callee does its thing. Fourth, when the callee function has completed its job, its activation record is popped off the run-time stack and control is returned to the caller. Finally, once control is back in the caller, code is executed to retrieve the callee’s return value.
Now we’ll examine the actual LC-3 code for carrying out these operations. We do so by examining the LC-3 code associated with the following function call: w = Volta(w, 10); from line 18 of the code in Figure 14.4.
The Call
In the statement w = Volta(w, 10);, the function Volta is called with two arguments. The value returned by Volta is then assigned to the local integer variable w. In translating this function call, the compiler generates LC-3 code that does the following:
1. Transmits the value of the two arguments to the function Volta by pushing them directly onto the top of the run-time stack. Recall that R6 points to the top of the run-time stack. That is, it contains the address of the data item currently at the top of the run-time stack. To push an item onto the stack, we first decrement R6 and then store the data value using R6 as a base address. In the LC-3, the arguments of a C function call are pushed onto the stack from right to left in the order they appear in the function call. In the case of Watt, we will first push the value 10 (rightmost argument) and then the value of w.
2. Transfers control to Volta via the JSR instruction.
[Figure 14.6 diagram: memory drawn from x0000 (top) to xFFFF (bottom). R6 points to the value of w, which sits directly above the value 10. Below these is the activation record of Watt, whose local variable w is pointed to by R5.]
Figure 14.6 The run-time stack. Watt pushes the values it wants to pass to Volta
The LC-3 code to perform this function call looks like this:
    AND R0, R0, #0     ; R0 <- 0
    ADD R0, R0, #10    ; R0 <- 10
    ADD R6, R6, #-1    ;
    STR R0, R6, #0     ; Push 10

    LDR R0, R5, #0     ; Load w
    ADD R6, R6, #-1    ;
    STR R0, R6, #0     ; Push w

    JSR Volta
Figure 14.6 illustrates the modifications made to the run-time stack by these instructions. Notice that the argument values are pushed immediately on top of the activation record of the caller (Watt). The activation record for the callee (Volta) will be constructed on the stack directly on top of the record of the caller.
Starting the Callee Function
The instruction executed immediately after the JSR in the function Watt is the first instruction in the callee function Volta.
The code at the beginning of the callee handles some important bookkeeping associated with the call. The very first thing is the allocation of memory for the return value. The callee will push a memory location onto the stack by decrementing the stack pointer. And this location will be written with the return value prior to the return to the caller.
Figure 14.7 summarizes the changes to memory accomplished by the code we have encountered so far. The layout in memory of these two activation records (one for Watt and one for Volta) is apparent. Notice that some entries of the activation record of Volta are written by Watt. In particular, these are the parameter fields of Volta's activation record. Watt writes the value of its local variable w as the first parameter and the value 10 for the second parameter. Keep in mind that these values are pushed from right to left according to their position in the function call. Therefore, the value of w appears on top of the value 10. Once
Next, the callee function saves enough information about the caller so that eventually, when the callee has finished, the caller can correctly regain program control. In particular, we will need to save the caller's return address, which is in R7 (Why is it in R7? Recall how the JSR instruction works.) and the caller's frame pointer, which is in R5. It is important to make a copy of the caller's frame pointer, which we call the dynamic link, so that when control returns to the caller it will be able once again to access its local variables. If either the return address or the dynamic link is destroyed, then we will have trouble restarting the caller correctly when the callee finishes. Therefore it is important that we make copies of both in memory.
Finally, when all of this is done, the callee will allocate enough space on the stack for its local variables by adjusting R6, and it will set R5 to point to the base of its locals.
To recap, here is the list of actions that need to happen at the beginning of a function:
1. The callee saves space on the stack for the return value. The return value is located immediately on top of the parameters for the callee.
2. The callee pushes a copy of the return address in R7 onto the stack.
3. The callee pushes a copy of the dynamic link (caller's frame pointer) in R5 onto the stack.
4. The callee allocates enough space on the stack for its local variables and adjusts R5 to point to the base of the local variables and R6 to point to the top of the stack.
The code to accomplish this for Volta is:

    Volta:
        ADD R6, R6, #-1    ; Allocate spot for the return value

        ADD R6, R6, #-1    ;
        STR R7, R6, #0     ; Push R7 (Return address)

        ADD R6, R6, #-1    ;
        STR R5, R6, #0     ; Push R5 (Caller's frame pointer)
                           ; We call this the dynamic link

        ADD R5, R6, #-1    ; Set new frame pointer
        ADD R6, R6, #-2    ; Allocate memory for Volta's locals
    x0000
            m                          <- R6    (local variables of Volta)
            k                          <- R5
            Watt's frame pointer                (bookkeeping info)
            Return address for Watt
            Return value to Watt
            q (value of w)                      (parameters of Volta)
            r (10)
            w                                   (activation record for Watt begins)
            main's frame pointer
            Return address for main
            Return value to main
            a
    xFFFF

Figure 14.7 The run-time stack after the activation record for Volta is pushed onto the stack
invoked, Volta will refer to these values with the names q and r. Question: What are the initial values of Volta's local variables? Recall from Chapter 11 that local variables such as these are uninitialized. See Exercise 14.10 for an exercise on the initial values of local variables.
Notice that each activation record on the stack has the same structure. Each activation record contains locations for the function's local variables, for the bookkeeping information (consisting of the caller's return address and dynamic link), the return value, and the function's parameters.
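Because every activation record has the same structure, each field sits at a fixed offset from the frame pointer. The following C fragment is only an illustrative model of that idea, not compiler output: memory is represented as an array of int, r5 is an array index standing in for the frame pointer, and the names (OFF_LOCAL_K, FrameField, SetFrameField) are invented for this sketch. The offsets themselves are read off the Volta code above, which has two locals and two parameters.

```c
#include <assert.h>

/* Hypothetical model: offset of each field from the frame pointer (R5)
   for a function with two locals and two parameters, such as Volta. */
enum {
  OFF_LOCAL_M      = -1,  /* second local variable                  */
  OFF_LOCAL_K      =  0,  /* first local variable; R5 points here   */
  OFF_DYNAMIC_LINK =  1,  /* caller's frame pointer                 */
  OFF_RETURN_ADDR  =  2,  /* caller's return address                */
  OFF_RETURN_VALUE =  3,  /* slot reserved for the return value     */
  OFF_PARAM_Q      =  4,  /* first parameter (pushed last)          */
  OFF_PARAM_R      =  5   /* second parameter (pushed first)        */
};

/* Read one field of the activation record whose frame pointer is r5. */
int FrameField(const int memory[], int r5, int offset)
{
  assert(r5 + offset >= 0);
  return memory[r5 + offset];
}

/* Write one field of the activation record whose frame pointer is r5. */
void SetFrameField(int memory[], int r5, int offset, int value)
{
  assert(r5 + offset >= 0);
  memory[r5 + offset] = value;
}
```

With this model, the instruction STR R0, R5, #3 corresponds to SetFrameField(memory, r5, OFF_RETURN_VALUE, value), and LDR R0, R5, #0 corresponds to FrameField(memory, r5, OFF_LOCAL_K).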
Ending the Callee Function
Once the callee function has completed its work, it must perform several tasks prior to returning control to the caller function. Firstly, a function that returns a value needs a mechanism for the return value to be transmitted properly to the caller function. Secondly, the callee must pop the current activation record. To enumerate,
1. If there is a return value, it is written into the return value entry of the activation record.
2. The local variables are popped off the stack.
3. The dynamic link is restored.
4. The return address is restored.
5. The RET instruction returns control to the caller function.
The LC-3 instructions corresponding to this for Volta are

    LDR R0, R5, #0     ; Load local variable k
    STR R0, R5, #3     ; Write it in return value slot

    ADD R6, R5, #1     ; Pop local variables

    LDR R5, R6, #0     ; Pop the dynamic link
    ADD R6, R6, #1     ;

    LDR R7, R6, #0     ; Pop the return address
    ADD R6, R6, #1     ;
    RET
The first two instructions write the return value, which in this case is the local variable k, into the return value entry of Volta's activation record. Next, the local variables are popped by moving the stack pointer to the location immediately below the frame pointer. The dynamic link is restored, then the return address is restored, and finally we return to the caller.
You should keep in mind that even though the activation record for Volta is popped off the stack, the values remain in memory.
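The point that popped values remain in memory can be seen in a small model. The sketch below is an assumption-laden illustration rather than LC-3 code: the Stack type, its eight-word size, and the function names are all made up here, and sp plays the role of R6. Popping changes only the stack pointer; the word that was popped keeps its old contents until a later push overwrites it.

```c
#include <assert.h>

#define STACK_WORDS 8

/* Hypothetical model of a tiny run-time stack that grows toward
   lower-numbered locations, as the LC-3 run-time stack does. */
typedef struct {
  int memory[STACK_WORDS];
  int sp;                  /* plays the role of R6: index of the top item */
} Stack;

void StackInit(Stack *s)
{
  int i;

  for (i = 0; i < STACK_WORDS; i++)
    s->memory[i] = 0;
  s->sp = STACK_WORDS;     /* empty: one past the highest location */
}

/* Push: decrement the stack pointer, then store the value. */
void Push(Stack *s, int value)
{
  assert(s->sp > 0);
  s->sp = s->sp - 1;
  s->memory[s->sp] = value;
}

/* Pop: only the stack pointer moves; memory is not erased. */
void Pop(Stack *s)
{
  assert(s->sp < STACK_WORDS);
  s->sp = s->sp + 1;
}
```

If a value is pushed and then popped, the location it occupied still holds that value. This is also why an uninitialized local variable can start out holding whatever an earlier activation record left behind.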
Returning to the Caller Function
After the callee function executes the RET instruction, control is passed back to the caller function. In some cases, there is no return value (if the callee is declared of type void) and, in some cases, the caller function ignores the return value. Again, from our previous example, the return value is assigned to the variable w in Watt.
In particular, there are two actions that must be performed:
1. The return value (if there is one) is popped off the stack.
2. The arguments are popped off the stack.
The code after the JSR looks like the following:
    JSR Volta
    LDR R0, R6, #0     ; Load the return value at the top of stack
    STR R0, R5, #0     ; w = Volta(w, 10);
    ADD R6, R6, #1     ; Pop return value

    ADD R6, R6, #2     ; Pop arguments
Once this code is done, the call is now complete and the caller function can resume its normal operation. Notice that prior to the return to the caller, the callee restores the environment of the caller. To the caller, it appears as if nothing has changed except that a new value (the return value) has been pushed onto the stack.
Caller Save/Callee Save
Before we complete our discussion of the implementation of functions, we need to cover a topic that we've so far swept under the rug. During the execution of a function, R0 through R3 can contain temporary values that are part of an ongoing computation. Registers R4 through R7 are reserved for other purposes: R4 is the pointer to the global data section, R5 is the frame pointer, R6 is the stack pointer, and R7 is used to hold return addresses. If we make a function call, based on the calling convention we've described, R4 through R7 do not change or change in predetermined ways. But what happens to registers R0, R1, R2, and R3? In the general case, we'd like to make sure that the callee function does not overwrite them. To address this, calling conventions typically adopt one of two strategies: (1) The caller will save these registers by pushing them onto its activation record. This is called the caller-save convention. (We also discussed this in Chapter 9.) When control is returned to the caller, the caller will restore these registers by popping them off the stack. (2) Alternatively, the callee can save these registers by adding four fields in the bookkeeping area of its record. This is called the callee-save convention. When the callee is initiated, it will save R0 through R3 and R5 and R7 into the bookkeeping region and restore these registers prior to the return to the caller.
14.3.3 Tying It All Together
The code for the function call in Watt and the beginning and end of Volta is listed in Figure 14.8. The LC-3 code segments presented in the previous sections are all combined, showing the overall structure of the code. This code is more optimized than the previous individual code segments. We've combined the manipulation of the stack pointer R6 associated with pushing and popping the return value into single instructions.
To summarize, our LC-3 C calling convention involves a series of steps that are performed when a function calls another function. The caller function pushes the value of each parameter onto the stack and performs a Jump To Subroutine (JSR) to the callee. The callee allocates a space for the return value, saves some bookkeeping information about the caller, and then allocates space on the stack for its local variables. The callee then proceeds to carry out its task. When the task is complete, the callee writes the return value into the space reserved for it, pops and restores the bookkeeping information, and returns to the caller. The caller then pops the return value and the parameters it placed on the stack and resumes its execution.
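The whole sequence can be traced in a small C model. This is a sketch under stated assumptions, not the compiler's output: the Machine type, the 32-word memory, the placeholder return address 0x3100, and the body given to the callee (k = q + r) are all invented for illustration, since the steps of the convention do not depend on what Volta actually computes.

```c
#include <assert.h>

#define MEM_WORDS 32

/* Hypothetical machine state: a small memory plus three registers. */
typedef struct {
  int mem[MEM_WORDS];
  int r5;    /* frame pointer  */
  int r6;    /* stack pointer  */
  int r7;    /* return address */
} Machine;

static void Push(Machine *m, int value)
{
  assert(m->r6 > 0);
  m->r6 -= 1;                /* ADD R6, R6, #-1 */
  m->mem[m->r6] = value;     /* STR Rx, R6, #0  */
}

/* Stands in for Volta. The body (k = q + r) is an assumption. */
static void Callee(Machine *m)
{
  /* Beginning of the callee */
  m->r6 -= 1;                /* allocate the return value slot  */
  Push(m, m->r7);            /* save the return address         */
  Push(m, m->r5);            /* save the dynamic link           */
  m->r5 = m->r6 - 1;         /* set the new frame pointer       */
  m->r6 -= 2;                /* allocate two locals, k and m    */

  /* Body: the parameters sit at R5+4 (q) and R5+5 (r) */
  m->mem[m->r5] = m->mem[m->r5 + 4] + m->mem[m->r5 + 5];

  /* End of the callee */
  m->mem[m->r5 + 3] = m->mem[m->r5];   /* write the return value     */
  m->r6 = m->r5 + 1;                   /* pop the local variables    */
  m->r5 = m->mem[m->r6];  m->r6 += 1;  /* restore the dynamic link   */
  m->r7 = m->mem[m->r6];  m->r6 += 1;  /* restore the return address */
}

/* Stands in for Watt executing w = Volta(w, 10). Returns the new w. */
int CallerAssign(Machine *m, int w)
{
  int saved_r5, saved_r6;

  m->r6 = MEM_WORDS - 1;     /* Watt's only local, w, is at the top */
  m->r5 = m->r6;
  m->mem[m->r5] = w;
  saved_r5 = m->r5;
  saved_r6 = m->r6;

  Push(m, 10);               /* rightmost argument first            */
  Push(m, m->mem[m->r5]);    /* then the value of w                 */
  m->r7 = 0x3100;            /* JSR leaves the return address in R7 */
  Callee(m);

  m->mem[m->r5] = m->mem[m->r6];  /* w = return value at top of stack   */
  m->r6 += 3;                     /* pop return value and two arguments */

  /* To the caller, the frame and stack pointers appear untouched. */
  assert(m->r5 == saved_r5 && m->r6 == saved_r6);
  return m->mem[m->r5];
}
```

Calling CallerAssign with w equal to 5 walks through every step in the summary above; the final assert checks the property the convention is designed to guarantee, namely that the caller's R5 and R6 are exactly as they were before the call.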
You might be wondering why we would go through all these steps just to make a function call. That is, is all this code really required and couldn't the calling convention be made simpler? One of the characteristics of real calling conventions is that in the general case, any function should be able to call any other function. To enable this, the calling convention should be organized so that a caller does not need to know anything about a callee except its interface (that is, the type of value the callee returns and the types of values it expects as parameters). Likewise, a callee is written to be independent of the functions that call it. Because of this generality, the calling convention for C functions requires the steps we have outlined here.
    Watt:
        AND R0, R0, #0     ; R0 <- 0
        ADD R0, R0, #10    ; R0 <- 10
        ADD R6, R6, #-1    ;
        STR R0, R6, #0     ; Push 10
        LDR R0, R5, #0     ; Load w
        ADD R6, R6, #-1    ;
        STR R0, R6, #0     ; Push w

        JSR Volta

        LDR R0, R6, #0     ; Load the return value at top of stack
        STR R0, R5, #0     ; w = Volta(w, 10);
        ADD R6, R6, #3     ; Pop return value, arguments

    Volta:
        ADD R6, R6, #-2    ; Push return value
        STR R7, R6, #0     ; Push return address
        ADD R6, R6, #-1    ; Push R5 (Caller's frame pointer)
        STR R5, R6, #0     ; We call this the dynamic link
        ADD R5, R6, #-1    ; Set new base pointer
        ADD R6, R6, #-2    ; Allocate memory for Volta's locals

                           ; Volta performs its work

        LDR R0, R5, #0     ; Load local variable k
        STR R0, R5, #3     ; Write it in return value slot
        ADD R6, R5, #1     ; Pop local variables
        LDR R5, R6, #0     ; Pop the dynamic link
        ADD R6, R6, #1     ;
        LDR R7, R6, #0     ; Pop the return address
        ADD R6, R6, #1     ;
        RET

Figure 14.8 The LC-3 code corresponding to a C function call and return
14.4 Problem Solving Using Functions
For functions to be useful to us, we must somehow integrate them into our programming problem-solving methodology. In this section we will demonstrate the use of functions through two example problems, with each example demonstrating a slightly different application of functions.
Conceptually, functions are a good point of division during the top-down design of an algorithm from a problem. As we decompose a problem, natural "components" will appear in the tasks that are to be performed by the algorithm. And these components are natural candidates for functions. Our first example involves converting text from lowercase into uppercase, and it presents an example of a component function that is naturally apparent during the top-down design process.
Functions are also useful for encapsulating primitive operations that the program requires at various spots in the code. By creating such a function, we are in a sense extending the set of operations of the programming language, tailoring them to the specific problem at hand. In the case of the second problem, which determines Pythagorean Triples, we will develop a primitive function to calculate x² to assist with the calculation.
14.4.1 Problem 1: Case Conversion
In this section, we go through the development of a program that reads input from the keyboard and echoes it back to the screen. We have already seen an example of a program that does just this in Chapter 13 (see Figure 13.8). However, this time, we throw in a slight twist: We want the program to convert lowercase characters into uppercase before echoing them onto the screen.
[Figure 14.9 flowchart: the loop of the echo program with an added component function. The function takes the character to convert as its parameter, tests whether it is lowercase, converts it to uppercase if so or leaves it as is otherwise, and returns the converted character.]
Figure 14.9 The decomposition into smaller subtasks of a program that converts input characters into uppercase
    #include <stdio.h>

    /* Function declaration */
    char ToUpper(char inchar);

    /* Function main:                                        */
    /* Prompt for a line of text, Read one character,        */
    /* convert to uppercase, print it out, then get another  */
    int main()
    {
      char echo = 'A';     /* Initialize input character */
      char upcase;         /* Converted character        */

      while (echo != '\n') {
        scanf("%c", &echo);
        upcase = ToUpper(echo);
        printf("%c", upcase);
      }
    }

    /* Function ToUpper:                         */
    /* If the parameter is lower case return     */
    /* its uppercase ASCII value                 */
    char ToUpper(char inchar)
    {
      char outchar;

      if ('a' <= inchar && inchar <= 'z')
        outchar = inchar - ('a' - 'A');
      else
        outchar = inchar;

      return outchar;
    }

Figure 14.10 A program with a function to convert lowercase letters to uppercase
Our approach to solving this problem is to use the echo program from Figure 13.8 as a starting point. The previous code used a while loop to read an input character from the keyboard and then print it to the output device. To this basic structure, we want to add a component that checks if a character is lowercase and converts it to uppercase if it is. There is a single input and a single output. We could add code to perform this directly into the while loop, but given the self-contained nature of this component, we will create a function to do this job.
The conversion function is called after each character is scanned from the keyboard and before it is displayed to the screen. The function requires a single character as a parameter and returns either the same character (for cases in which the character is already uppercase or is not a character of the alphabet) or an uppercase version of the character. Figure 14.9 shows the flow of this program. The flowchart of the original echo program is shaded. To this original flowchart, we are adding a component function to perform the conversion.
Figure 14.10 shows the complete C program. It takes input from the keyboard, converts each input character into uppercase, and prints out the result. When the input character is the new line character, the program terminates. The conversion process from lowercase to uppercase is done by the function ToUpper. Notice the use of ASCII literals in the function body to perform the actual conversion. Keep in mind that a character in single quotes (e.g., 'A') is evaluated as the ASCII value of that character. The expression 'a' - 'A' is therefore the ASCII value of the character a minus the ASCII value of A.
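As a worked instance of this arithmetic: in ASCII, 'a' is 97 and 'A' is 65, so the expression evaluates to 32, and subtracting 32 from any lowercase letter gives its uppercase form. The two helper functions below are hypothetical names introduced only to show the calculation; they assume an ASCII character set, and LowerToUpper assumes its argument is already known to be a lowercase letter.

```c
/* Distance between a lowercase letter and its uppercase form.
   In ASCII, 'a' is 97 and 'A' is 65, so the result is 32. */
int CaseOffset(void)
{
  return 'a' - 'A';
}

/* Apply the offset to one lowercase letter.
   For example, 'g' (103) becomes 'G' (71). */
char LowerToUpper(char lower)
{
  return lower - CaseOffset();
}
```

Writing the offset as 'a' - 'A' rather than as the literal 32 keeps the intent visible in the code, which is the choice made in ToUpper.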
14.4.2 Problem 2: Pythagorean Triples
Now we’ll attempt a programming problem involving calculating all Pythagorean Triples less than a particular input value. A Pythagorean Triple is a set of three integer values a, b, and c that satisfy the property c² = a² + b². In other words, a and b are the lengths of the sides of a right triangle where c is the hypotenuse. For example, 3, 4, and 5 is a Pythagorean Triple. The problem here is to calculate all Triples a, b, and c where all are less than a limit provided by the user.
For this problem, we will attempt to find all Triples by brute force. That is, if the limit indicated by the user is max, we will check all combinations of three integers less than max to see if they satisfy the Triple property. In order to check all combinations, we will want to vary each sideA, sideB, and sideC from 1 to max. This implies the use of counter-controlled loops. More exactly, we will want to use a for loop to vary sideC, another to vary sideB, and another to vary sideA, each nested within the other. At the core of these loops, we will check to see if the property holds for the three values, and if so, we’ll print them out.
Now, in performing the Triple check, we will need to evaluate the following expression.
(sideC * sideC == (sideA * sideA + sideB * sideB))
Because the square operation is a primitive operation for this problem (meaning it is required in several spots), we will encapsulate it into a function Squared that returns the square of its integer parameter. The preceding expression will be rewritten as follows. Notice that this code gives a clearer indication of what is being calculated.
(Squared(sideC) == Squared(sideA) + Squared(sideB))
The C program for this is provided in Figure 14.11. There are better ways to calculate Triples than with a brute-force technique of checking all combinations (Can you modify the code to run more efficiently?); the brute-force technique suits our purposes of demonstrating the use of functions.
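One possible answer to the question of efficiency, offered as a sketch rather than as the intended solution: since c² = a² + b² forces both sides to be shorter than the hypotenuse, and since swapping a and b gives the same triangle, the inner loops only need to cover sideA <= sideB < sideC. The function name PrintTriples and its return value (a count of the triples found) are assumptions made for this example. Note that this version prints each Triple once, so its output differs from a version that reports both orderings of the two sides.

```c
#include <stdio.h>

/* Calculate the square of a number */
int Squared(int x)
{
  return x * x;
}

/* Print each Pythagorean Triple with sideA <= sideB < sideC <= maxC
   exactly once, and return how many were found. */
int PrintTriples(int maxC)
{
  int sideA;
  int sideB;
  int sideC;
  int count = 0;

  for (sideC = 1; sideC <= maxC; sideC++) {
    for (sideB = 1; sideB < sideC; sideB++) {
      for (sideA = 1; sideA <= sideB; sideA++) {
        if (Squared(sideC) == Squared(sideA) + Squared(sideB)) {
          printf("%d %d %d\n", sideA, sideB, sideC);
          count++;
        }
      }
    }
  }

  return count;
}
```

The loop bounds shrink the number of combinations checked to roughly one-sixth of the full brute-force search, while the test at the core of the loops still uses the Squared function unchanged.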
    #include <stdio.h>

    int Squared(int x);

    int main()
    {
      int sideA;
      int sideB;
      int sideC;
      int maxC;

      printf("Enter the maximum length of hypotenuse: ");
      scanf("%d", &maxC);

      for (sideC = 1; sideC <= maxC; sideC++) {
        for (sideB = 1; sideB <= maxC; sideB++) {
          for (sideA = 1; sideA <= maxC; sideA++) {
            if (Squared(sideC) == Squared(sideA) + Squared(sideB))
              printf("%d %d %d\n", sideA, sideB, sideC);
          }
        }
      }
    }

    /* Calculate the square of a number */
    int Squared(int x)
    {
      return x * x;
    }

Figure 14.11 A C program that calculates Pythagorean Triples
14.5 Summary
In this chapter, we introduced the concept of functions in C. The general notion of subprograms such as functions has been part of programming languages since the earliest languages. Functions are useful because they allow us to create new primitive building blocks that might be useful for a particular programming task (or for a variety of tasks). In a sense, they allow us to extend the native operations and constructs supported by the language.
The key notions that you should take away from this chapter are:
• Syntax of functions in C. To use a function in C, we must declare the function using a function declaration (which we typically do at the beginning of our code) that indicates the function's name, the type of value the function returns, and the types and order of values the function expects as inputs. A function's definition contains the actual code for the function. A function is invoked when a call to it is executed. A function call contains arguments, which are values that are to be passed to the function as parameters.
• Implementation of C functions at the lower level. Part of the complexity associated with implementing functions is that in C, a function can be called from any other function in the source file (and even from functions in other object files). To assist in dealing with this, we adopt a general calling convention for calling one function from another. To assist with the fact that some functions might even call themselves, we base this calling convention on the run-time stack. The calling convention involves the caller passing the value of its arguments by pushing them onto the stack, then calling the callee. The arguments written by the caller become the parameters of the callee's activation record. The callee does its task and then pops its activation record off the stack, leaving behind its return value for the caller.
• Using functions when programming. It is conceivable to write all your programs without ever using functions, but the result would be that your code would be hard to read, maintain, and extend and would probably be buggier than if your code used functions. Functions enable abstraction: we can write a function to perform a particular task, debug it, test it, and then use it within the program wherever it is needed.
Exercises
14.1 What is the significance of the function main? Why must all programs contain this function?
14.2 Refer to the structure of an activation record for these questions.
a. What is the purpose of the dynamic link?
b. What is the purpose of the return address?
c. What is the purpose of the return value?
14.3 Refer to the C syntax of functions for these questions.
a. What is a function declaration? What is its purpose?
b. What is a function prototype?
c. What is a function definition?
d. What are arguments?
e. What are parameters?
14.4 For each of the following items, identify whether the caller function or the callee function performs the action.
a. Writing the parameters into the activation record.
b. Writing the return value.
c. Writing the dynamic link.
d. Modifying the value in R5 to point within the callee function's activation record.
14.5 What is the output of the following program? Explain.

    void MyFunc(int z);

    int main()
    {
      int z = 2;

      MyFunc(z);
      MyFunc(z);
    }

    void MyFunc(int z)
    {
      printf("%d", z);
      z++;
    }

14.6 What is the output of the following program?

    #include <stdio.h>

    int Multiply(int d, int b);
    int d = 3;

    int main()
    {
      int a, b, c;
      int e = 4;

      a = 1;
      b = 2;

      c = Multiply(a, b);
      printf("%d %d %d %d %d\n", a, b, c, d, e);
    }

    int Multiply(int d, int b)
    {
      int a;
      a = 2;
      b = 3;

      return (a * b);
    }
Following is the code for a C function named Bump.
int Bump (int x) {
14.8
What is the output of the following code? Explain why the function Swap behaves the way it does.
14.9
Are the parameters to a function placed on the stack before or after the JSR to that function? Why?
a. b.
c.
int a; a=X+1; return a;
Draw the activation record for Bump.
Write one of the following in each entry of the activation record to indicate what is stored there.
(1) Local variable
(2) Argument
(3) Address of an instruction
(4) Addressofdata
(5) Other
Some of the entries in the activation record for Bump are written by the function that calls Bump; some are written by Bump itself. Identify the entries written by Bump.
int main()
{
int X 1·
‘
inty 2; SwapIx, y);
printf(11x = %d
y %-d\n”
I
x, y);
void Swap(int y, int x) {
int temp
temp= x; X y;
y = temp;
}
14.10 A C program containing the function food has been compiled into LC-3 assembly language. The partial translation of the function into LC-3 is:

    food:
        ADD R6, R6, #-2
        STR R7, R6, #0
        ADD R6, R6, #-1
        STR R5, R6, #0
        ADD R5, R6, #-1
        ADD R6, R6, #-4

a. How many local variables does this function have?
b. Say this function takes two integer parameters x and y. Generate the code to evaluate the expression x + y.
14.11 Following is the code for a C function named Unit.

    int main()
    {
      int a = 1;
      int b = 2;

      a = Init(a);
      b = Unit(b);

      printf("a = %d b = %d\n", a, b);
    }

    int Init(int x)
    {
      int y = 2;

      return y + x;
    }

    int Unit(int x)
    {
      int z;

      return z + x;
    }

a. What is the output of this program?
b. What determines the value of local variable z when function Unit starts execution?
14.12 Modify the example in Figure 14.10 to also convert each character to lowercase. The new program should print out both the lower- and uppercase versions of each input character.
14.13 Write a function to print out an integer value in base 4 (using only the digits 0, 1, 2, 3). Use this function to write a program that reads two integers from the keyboard and displays both numbers and their sum in base 4 on the screen.
14.14 Write a function that returns a 1 if the first integer input parameter is evenly divisible by the second. Using this function, write a program to find the smallest number that is evenly divisible by all integers less than 10.
14.15 The following C program is compiled into LC-3 machine language and loaded into address x3000 before execution. Not counting the JSRs to library routines for I/O, the object code contains three JSRs (one to function f, one to g, and one to h). Suppose the addresses of the three JSR instructions are x3102, x3301, and x3304. And suppose the user provides 4 5 6 as input values. Draw a picture of the run-time stack, providing the contents of locations, if possible, when the program is about to return from function f. Assume the base of the run-time stack is location xEFFF.

    #include <stdio.h>

    int f(int x, int y, int z);
    int g(int arg);
    int h(int arg1, int arg2);

    int main()
    {
      int a, b, c;

      printf("Type three numbers: ");
      scanf("%d %d %d", &a, &b, &c);
      printf("%d", f(a, b, c));
    }

    int f(int x, int y, int z)
    {
      int x1;

      x1 = g(x);
      return h(y, z) * x1;
    }

    int g(int arg)
    {
      return arg * arg;
    }

    int h(int arg1, int arg2)
    {
      return arg1 / arg2;
    }
14.16 Referring once again to the machine-busy example from previous chapters, remember that we represent the busyness of a set of 16 machines with a bit pattern. Recall that a 0 in a particular bit position indicates the corresponding machine is busy and a 1 in that position indicates that machine is idle.
a. Write a function to count the number of busy machines for a given busyness pattern. The input to this function will be a bit pattern (which can be represented by an integer variable), and the output will be an integer corresponding to the number of busy machines.
b. Write a function to take two busyness patterns and determine which machines have changed state, that is, gone from busy to idle, or idle to busy. The output of this function is simply another bit pattern with a 1 in each position corresponding to a machine that has changed its state.
c. Write a program that reads a sequence of 10 busyness patterns from the keyboard and determines the average number of busy machines and the average number of machines that change state from one pattern to the next. The user signals the end of busyness patterns by entering a pattern of all 1s (all machines idle). Use the functions you developed for parts a and b to write your program.
14.17
a. Write a C function that mimics the behavior of a 4-to-1 multiplexor. See Figure 3.13 for a description of a 4-to-1 MUX.
b. Write a C function that mimics the behavior of the LC-3 ALU.
14.18 Notice that on a telephone keypad, the keys labeled 2, 3, 4, …, 9 also have letters associated with them. For example, the key labeled 2 corresponds to the letters A, B, and C. Write a program that will map a seven-digit telephone number into all possible character sequences that the phone number can represent. For this program, use a function that performs the mapping between digits and characters. The digits 1 and 0 map to nothing.
14.19 The following C program uses a combination of global variables and local variables with different scope. What is the output?

    #include <stdio.h>

    int t = 1;   /* Global variable */
    int sub1(int fluff);

    int main()
    {
      int t = 2;
      int z;

      z = t;
      z = z + 1;
      printf("A: The variable z equals %d\n", z);
      {
        z = t;
        t = 3;
        {
          int t = 4;

          z = t;
          z = z + 1;
          printf("B: The variable z equals %d\n", z);
        }
        z = sub1(z);
        z = z + 1;
        printf("C: The variable z equals %d\n", z);
      }
      z = t;
      z = z + 1;
      printf("D: The variable z equals %d\n", z);
    }

    int sub1(int fluff)
    {
      int i;

      i = t;
      return (fluff + i);
    }
Testing and Debugging
15.1 Introduction
In December 1999, NASA mission controllers lost contact with the Mars Polar Lander as it approached the Martian surface. The Mars Polar Lander was on a mission to study the southern polar region of the Red Planet. Contact was never reestablished, and NASA announced that the spacecraft most probably crashed onto the planet’s surface during the landing process. After evaluating the situation, investigators concluded that the likely cause was faulty control software that prematurely caused the on-board engines to shut down when the probe was 40 meters above the surface rather than when the probe had actually landed. The physical complexities of sending probes into space are astounding, and the software systems that control these spacecraft are no less complex. Software is as integral to a system as any mechanical or electrical subsystem, and all the more difficult to make correct because it is “invisible.” It cannot be visually observed as easily as, say, a propulsion system or landing system.
Software is everywhere today. It is in your cell phone and in your automobile; even the text of this book was processed by numerous lines of software before appearing in front of you on good old-fashioned printed pages. Because software plays a vital and critical part in our world, it is important that this software behave correctly according to specification. Designing working programs is not automatic. Programs are not correct by construction. That is, just because a program is written does not mean that it functions correctly. We must test and debug it as thoroughly as possible before we can deem it to be complete.
Programmers often spend more time debugging their programs than they spend writing them. A general observation made by experts is that an experienced programmer spends as much time debugging code as he/she does writing it. Because of this inseparable relation between writing code and testing and debugging it, we introduce you to some basic concepts in testing and debugging in this chapter.
Testing is the process of exposing bugs, and debugging is the process of fixing them. Testing a piece of code involves subjecting it to as many input conditions as possible, in order to stress the software into revealing its bugs. For example, in testing the function Toupper from the previous chapter (recall that this function returns the uppercase version of an alphabetic character passed as a parameter), we might want to pass every possible ASCII value as an input parameter and observe the function’s output in order to determine if the function behaves according to specification. If the function produces incorrect output for a particular input, then we’ve discovered a bug. It is better to find the bug while the code is still in development than to have an unsuspecting user stumble on the bug inadvertently. It would have been better for the NASA software engineers to find the bug in the Mars Polar Lander on the surface of the earth rather than encounter it 40 meters above the surface of Mars.
Using information about a program and its execution, a programmer can apply common sense to deduce where things are going awry. Debugging a program is a bit like solving a puzzle. Like a detective at a crime scene, a programmer must examine the available clues in order to track down the source of the problem. Debugging code is significantly easier if you know how to gather information about the bug (such as the value of key variables during the execution of the program) in a systematic way.
In this chapter, we describe several techniques you can use to find and fix bugs within a program. We first describe some broad categories of errors that can creep into programs. We then describe testing methods for quickly finding these errors. We finally describe some debugging techniques for isolating and repairing these errors, and we provide some defensive programming techniques to minimize the bugs in the code you write.
15.2 Types of Errors
To better understand how to find and fix errors in programs, it is useful to get a sense of the types of errors that can creep into the programs we write. There are three broad categories of errors that you are likely to encounter in your code. Syntactic errors are the easiest to deal with because they are caught by the compiler. The compiler notifies us of such errors when it attempts to translate the source code into machine code, often pointing out exactly in which line the error occurred. Semantic errors, on the other hand, are problems that can often be very difficult to repair. They occur when the program is syntactically correct but does not behave exactly as we expected. Both syntactic and semantic errors are generally typographic errors: these occur when we type something we did not mean to type. Algorithmic errors are errors in which our approach to solving a problem is wrong. They are often hard to detect and, once detected, can be very hard to fix.

  #include <stdio.h>

  int main()
  {
    int i
    int j;

    for (i = 0; i <= 10; i++) {
      j = i * 7;
      printf("%d x 7 = %d\n", i, j);
    }
  }
Figure 15.1 This program contains a syntactic error
15.2.1 Syntactic Errors
In C, syntactic errors (or syntax errors or parse errors) are always caught by the compiler. These occur when we ask the compiler to translate code that does not conform to the C specification. For instance, the code listed in Figure 15.1 contains a syntax error, which the compiler will flag when the code is compiled.
The declaration for the variable i is missing a semicolon. For a novice C programmer, missing semicolons and missing variable declarations account for a good number of the syntax errors encountered. The good news is that these types of errors are easy to find, because the compiler detects them, and are easy to fix, because the compiler indicates where they occur. The real problems start once the syntax errors have been fixed and the harder semantic and algorithmic errors remain.
15.2.2 Semantic Errors
Semantic errors are similar to syntactic errors. They occur for the same reason: Our minds and our fingers are not completely coordinated when typing in a program. Semantic errors do not involve incorrect syntax; therefore, the program gets translated and we are able to execute it. It is not until we analyze the output that we discover that the program is not performing as expected. Figure 15.2 lists an example of the same program as Figure 15.1 with a simple semantic error (the syntax error is fixed). The program should print out a multiplication table for the number 7.
Here, a single execution of the program reveals the problem. Only one entry of the multiplication table is printed. You should be able to deduce, given your knowledge of the C programming language, why this program behaves incorrectly. Why is 11 x 7 = 70 printed out? This program demonstrates something called a control flow error. Here, the program's control flow, or the order in which statements are executed, is different from what we intended.
The code listed in Figure 15.3 contains a common, but tricky semantic error involving local variables. This example is similar to the factorial program we discussed in Section 14.2.
  #include <stdio.h>

  int main()
  {
    int i;
    int j;

    for (i = 0; i <= 10; i++)
      j = i * 7;
      printf("%d x 7 = %d\n", i, j);
  }
Figure 15.2 A program with a semantic error

  #include <stdio.h>

  int AllSum(int n);

  int main()
  {
    int in;                     /* Input value            */
    int sum;                    /* Value of 1+2+3+..+n    */

    printf("Input a number: ");
    scanf("%d", &in);

    sum = AllSum(in);
    printf("The AllSum of %d is %d\n", in, sum);
  }

  int AllSum(int n)
  {
    int result;                 /* Result to be returned  */
    int i;                      /* Iteration count        */

    for (i = 1; i <= n; i++)    /* This calculates sum    */
      result = result + i;

    return result;              /* Return to caller       */
  }
Figure 15.3 A program with a bug involving local variables

This program calculates the sum of all integers less than or equal to the number input from the keyboard (i.e., it calculates 1 + 2 + 3 + ... + n). Try executing this program and you will notice that the output is not what you would expect. Why doesn't it work properly? Hint: Draw out the run-time stack for an execution of this program.
Semantic errors are particularly troubling because they often go undetected by both the compiler and the programmer until a particular set of inputs triggers the error. Refer to the AllSum program in Figure 15.3, but repair the previous semantic error. Notice that AllSum may still return an erroneous result if the value passed to it is less than or equal to 0, or if the value is so large that the sum exceeds the range of the integer variable result. Fix the previous bug, compile the program, and input a number smaller than 1 and you will notice another bug.
Some errors are caught during execution because an illegal action is performed by the program. Almost all computer systems have safeguards that prevent a program from performing actions that might affect other unrelated programs. For instance, it is undesirable for a user's program to modify the memory that stores the operating system or to write a control register that might affect other programs, such as a control register that causes the computer to shut down. When such an illegal action is performed by a program, the operating system terminates its execution and prints out a run-time error message. Modify the scanf statement from the AllSum example to the following:
  scanf("%d", in);
In this case, the ampersand character, &, as we shall see in Chapter 16, is a special operator in the C language. Omitting it here causes a run-time error because the program has attempted to modify a memory location to which it does not have access. We will look at this example and the reasons for the error in more detail in later chapters.
15.2.3 Algorithmic Errors
Algorithmic errors are the result of an incorrect program design. That is, the program itself behaves exactly as we designed, but the design itself was flawed. These types of errors can be hidden; they may not appear until many trials of the program have been run. Even when they are detected and isolated, they can be very hard to repair. The good news is that these types of errors can often be reduced and even eliminated by proper planning during the design phase, before any code is written.
An example of a program with a simple algorithmic flaw is provided in Figure 15.4. This code takes as input the number of a calendar year and determines if that year is a leap year or not.
At first glance, this code appears to be correct. Leap years do occur every four years. However, they are skipped at the turn of every century, except every fourth century (i.e., the year 2000 was a leap year, but 2100, 2200, and 2300 will not be). The code works for almost all years, except those falling into these exceptional cases. We categorize this as an algorithmic error, or design flaw.
Another example of an algorithmic error also involving dates is the infamous Year 2000 computer bug, or Y2K bug. Many computer programs minimize the amount of memory required to store dates. They use enough bits to store only the last two digits of the year, and no more. Thus, the year 2000 is indistinguishable from the year 1900 (or 1800 or 2100 for that matter). This presented a problem during the recent century crossover on December 31, 1999. Say, for example, you had checked out a book from the university library in late 1999 and it was due back sometime in early 2000. If the library's computer system suffered from the Y2K bug, you would have gotten an overdue notice in the mail with some hefty fines listed on it. As a consequence, a lot of money and effort were devoted to tracking down Y2K-related bugs before January 1, 2000 rolled around.
15.3 Testing
There is an adage among seasoned programmers that any line of code that is untested is probably buggy. Good testing techniques are crucial to writing good software. What is testing? With testing, we basically put the software through trials where input patterns are applied (in order to mimic what the software might see during real operation) and the output of the program is checked for correctness. Real-world software might undergo millions of trials before it is released.
In an ideal world, we could test a program by examining its operation under every possible input condition. But for a program that is anything more than trivial, testing for every input combination is impossible. For example, if we wanted to test a program that finds prime numbers between integers A and B, where A and B are 32-bit input values, there are $(2^{32})^2 = 2^{64}$ possible input combinations. Even if we could run 1 million trials in 1 second, it would still take half a million years to completely test the program. Clearly, testing each input combination is not an option. So which input combinations do we test with? We could randomly pick inputs in hopes that some of those random patterns will expose the program’s bugs. Software engineers typically rely on more systematic ways of testing their code. In particular, black-box testing is used to check if a program meets its specifications, and white-box testing targets various facets of the program’s implementation in order to provide some assurance that every line of code is tested.
  #include <stdio.h>

  int main()
  {
    int year;

    printf("Input a year (i.e., 1996): ");
    scanf("%d", &year);

    if (year % 4 == 0)
      printf("This year is a leap year\n");
    else
      printf("This year is not a leap year\n");
  }
Figure 15.4 This program to determine leap years has an algorithmic bug

15.3.1 Black-Box Testing
With black-box testing, we examine if the program meets its input and output specifications, disregarding the internals of the program. That is, with black-box testing, we are concerned with what the program does and not how it does it. For example, a black-box test of the program AllSum in Figure 15.3 might involve running the program, typing an input number, and comparing the resulting output to what you calculated by hand. If the two do not match, then either the program contains a bug or your arithmetic skills are shoddy. We might continue attempting trials until we are reasonably confident that the program is functional.
For testing larger programs, the testing process is automated in order to run more tests per unit time. That is, we construct another program to automatically run the original program, provide some random inputs, check that the output meets specifications, and repeat. With such a process, we can clearly run many more trials than we could if a person performed each trial.
In order to automate the black-box process, however, we need a way to automatically test whether the program’s output was correct or incorrect. Here, we might need to construct a checker program that is different from the original program but performs a similar computation. If the original and checker programs had the same bug, it would go undetected by the black-box testing process. For this reason, black-box testers who write checker programs are often not permitted to see the code within the black box they are testing so that we get a truly independent version of the checker.
15.3.2 White-Box Testing
For larger software systems, black-box testing is not enough. With black-box testing, it is not possible to know which lines of code have been tested and which have not, and therefore, according to the adage stated previously, all are presumed to be buggy. Black-box testing is sometimes difficult when the input or output specification of a program is not concrete. For example, black-box testing of an audio player (such as an MP3 player) might be difficult because of the inexact nature of the output. Also, black-box testing can only start once the software is complete: the software must compile and must meet some part of the specification in order to be tested.
Software engineers supplement black-box testing with white-box tests. White-box tests isolate various internal components of the software, and test whether the components conform to their intended design. For example, testing to see that each function performs correctly according to the design is a white-box test. How we divide a program into functions is part of its implementation and not its specification. We can apply the same type of testing to loops and other constructs within a function.
How might a white-box test be constructed? For many tests, we might need to modify the code itself. For example, in order to see whether a function is working correctly, we might add extra code to call the function a few extra times with different inputs and check the outputs. We might add extra printf statements to the code with which we can observe values of internal variables to see if things are working as expected. Once the code is complete and ready for release, these printf statements can be removed.
A common white-box testing technique is the use of error-detecting code strategically placed within a program. This code might check for conditions that indicate that the program is not working correctly. When an incorrect situation is detected, the code prints out a warning message, displays some relevant information about the situation, or causes the program to prematurely terminate. Since this error-detecting code asserts that certain conditions hold during program execution, we generally call these checks assertions.
For example, assertions can be used to check whether a function returns a value within an expected range. If the return value is out of this range, an error message is displayed. In the following example, we are checking whether the calculation performed by the function IncomeTax is within reasonable bounds. As you can deduce from this code fragment, this function calculates the income tax based on a particular income provided as a parameter to it. We do not pay more tax than we collect in income (fortunately!), and we never pay a negative tax. Here if the calculation within IncomeTax is incorrect, a warning message will be displayed by the assertion code.
  tax = IncomeTax(income);
  if (tax < 0 || tax > income)
    printf("Error in function IncomeTax!\n");
A thorough testing methodology requires the use of both black-box and white- box tests. It is important to realize that white-box tests alone do not cover the complete functionality of the software—even if all white-box tests pass, there might be a portion of the specification that is missing. Similarly, black-box tests alone do not guarantee that every line of code is tested.
15.4 Debugging
Once a bug is found, we start the process of repairing it, which can often be more tricky than finding it. Debugging an error requires full use of our reasoning skills: We observe a symptom of the error, such as bad output, and we might even have some other information, such as the place in the code where the error occurred, and from this limited information, we will need to use deduction to isolate the source of the error. The key to effective debugging is being able to quickly gather relevant information that will lead to identifying the bug, similar to the way a detective might gather evidence at a crime scene or the way a physician might perform a series of tests in order to diagnose a sick patient’s illness.
There are a number of ways you can gather more information in order to diagnose a bug, ranging from ad hoc techniques that are quick and dirty to more systematic techniques that involve the use of software debugging tools.
15.4.1 Ad Hoc Techniques
The simplest thing to do once you realize that there is a problem with your program is to visually inspect the source code. Sometimes the nature of the failure tips you off to the region of the code where the bug is likely to exist. This technique is fine if the region of source code is small and you are very familiar with the code.
Another simple technique is to insert statements within the code to print out information during execution. You might print out, using printf statements, the values of important variables that you think will be useful in finding the bug. You can also add printf statements at various points within your code to see if the control flow of the program is working correctly. For example, if you wanted to quickly determine if a counter-controlled loop is iterating for the correct number of iterations, you could place a printf statement within the loop body. For simple programs, such ad hoc techniques are easy and reasonable to use. Large programs with intricate bugs require the use of more heavy-duty techniques.
15.4.2 Source-Level Debuggers
Often ad hoc techniques cannot provide enough information to uncover the source of a bug. In these cases, programmers often turn to a source-level debugger to isolate a bug. A source-level debugger is a tool that allows a program to be executed in a controlled environment, where all aspects of the execution of the program can be controlled and examined by the programmer. For example, a debugger can allow us to execute the program one statement at a time and examine the values of variables (and memory locations and registers, if we so choose) along the way. Source-level debuggers are similar to the LC-3 debugger that we described in Chapter 6, except that a source-level debugger operates in relation to high-level source code rather than LC-3 machine instructions.
For a source-level debugger to be used on a program, the program must be compiled such that the compiler augments the executable image with enough additional information for the debugger to function properly. Among other things, the debugger will need information from the compilation process in order to map every machine language instruction to its corresponding statement in the high-level source program. The debugger also needs information about variable names and their locations in memory (i.e., the symbol table). This is required so that a programmer can examine the value of any variable within the program using its name in the source code.
There are many source-level debuggers available, each of which has its own user interface. Different debuggers are available for UNIX and Windows, each with its own flavor of operation. For example, gdb is a free source-level debugger available on most UNIX-based platforms. All debuggers support a core set of necessary operations required to probe a program’s execution, many of which are similar to the debugging features of the LC-3 debugger. So rather than describe the user interface for any one particular debugger, in this section we will describe the core set of operations that are universal to any debugger.
The core debugger commands fall into two categories: those that let you control the execution of the program and those that let you examine the value of variables and memory, etc. during the execution.
Breakpoints
Breakpoints allow us to specify points during the execution of a program when the program should be temporarily stopped so that we can examine or modify the state of the program. This is useful because it helps us examine the program’s execution in the region of the code where the bug occurs.
For example, we can add a breakpoint at a particular line in the source code or at a particular function. When execution reaches that line, program execution is frozen in time, and we can examine everything about that program at that particular instance. How a breakpoint is added is specific to the user interface of the debugger. Some allow breakpoints to be added by clicking on a line of code. Others require that the breakpoint be added by specifying the line number through a command prompt.
Sometimes it is useful to stop at a line only if a certain condition is true. Such conditional breakpoints are useful for isolating specific situations in which we suspect buggy behavior. For example, if we suspect that the function PerformCalculation works incorrectly when its input parameter is 16, then we might want to add a breakpoint that stops execution only when x is equal to 16 in the following code:
  for (x = 0; x < 100; x++)
    PerformCalculation(x);
Alternatively, we can set a watchpoint to stop the program at any point where a particular condition is true. For example, we can use a watchpoint to stop execution whenever the variable LastItem is equal to 4. This will cause the debugger to stop execution at any statement that causes LastItem to equal 4. Unlike breakpoints, watchpoints are not associated with any single line of the code but apply to every line.
Single-Stepping
Once the debugger reaches a breakpoint (or watchpoint), it temporarily suspends program execution and awaits our next command. At this point we can examine program state, such as values of variables, or we can continue with execution.
It is often useful to proceed from a breakpoint one statement at a time, a process referred to as single-stepping. The LC-3 debugger has a command that executes a single LC-3 instruction, and similarly a source-level debugger allows execution to proceed one statement at a time. The single-step command executes the current source line and then suspends the program again. Most debuggers will also display the source code in a separate window so we can monitor where the program has currently been suspended. Single-stepping through a program is very useful, particularly when executing the region of a program where the bug is suspected to exist. We can set a breakpoint near the suspected region and then check the values of variables as we single-step through the code.
A common use of single-stepping is to verify that the control flow of the program does what we expect. We can single-step through a loop to verify that it performs the correct number of iterations or we can single-step through an if-else to verify that we have programmed the condition correctly.
Variations of single-stepping exist that allow us to skip over functions, or to skip to the last iteration of a loop. These variations are useful for skipping over code that we do not suspect to contain errors but that lies in the execution path between a breakpoint and the error itself.
Displaying Values
The art of debugging is about gathering the information required to logically deduce the source of the error. The debugger is the tool of choice for gathering information when debugging large programs. While execution is suspended at a breakpoint, we can gather information about the bug by examining the values of variables related to the suspected bug. Generally speaking, we can examine all execution states of the program at the breakpoint. We can examine the values of variables, memory, the stack, and even the registers. How this is done is debugger specific. Some debuggers allow you to use the mouse to point to a variable in the source code window, causing a pop-up window to display the variable's current value. Some debuggers require you to type in a command indicating the name of the variable you want to examine.
We encourage you to familiarize yourself with a source-level debugger. At the end of this chapter, we provide several problems that you can use to gain some experience with this useful debugging tool.
15.5 Programming for Correctness
Knowing how to test and debug your code is a prerequisite for being a good programmer. Great programmers know how to avoid many error-causing situations in the first place. Poor programming practices cause bugs. Being aware of some defensive programming techniques can help reduce the amount of time required to get a piece of code up and running. The battle against bugs starts before any line of code is written. Here, we provide three general methods for catching errors even before they become errors.
15.5.1 Nailing Down the Specification
Many bugs arise from poor or incomplete program specifications. Specifications sometimes do not cover all possible operating scenarios, and thus they leave some conditions open for interpretation by the programmer. For example, recall the factorial example from Chapter 14: Figure 14.2 is a program that calculates the factorial of a number typed at the keyboard. You can imagine that the specification for the program might have been "Write a program to take an integer value from the keyboard and calculate its factorial." As such, the specification is incomplete. What if the user enters a negative number? Or zero? What if the user enters a number that is too large and results in an overflow? In these cases, the code as written will not perform correctly, and it is therefore buggy. To fix this, we need to modify the specification of the program to allow the program to indicate an error if the input is less than or equal to zero, or if the input is so large that $n!$ exceeds $2^{31} - 1$, the largest value a 32-bit int can hold. Since 12! = 479,001,600 fits and 13! = 6,227,020,800 does not, n must be less than or equal to 12. In the code that follows we have added an input range check to the Factorial function from Chapter 14. Now the function prints a warning message and returns a -1 if its input parameter is out of the correct operating range.

  int Factorial(int n)
  {
    int i;                        /* Iteration count        */
    int result = 1;               /* Initialized result     */

    /* Check for legal parameter values (assumes 32-bit int) */
    if (n < 1 || n > 12) {
      printf("Bad input. Input must be >= 1 and <= 12.\n");
      return -1;
    }

    for (i = 1; i <= n; i++)      /* Calculates factorial   */
      result = result * i;

    return result;                /* Return to caller       */
  }

15.5.2 Modular Design
Functions are useful for extending the functionality of the programming language. With functions we can add new operations and constructs that are helpful for a particular programming task. In this manner, functions enable us to write programs in a modular fashion.
Once a function is complete, we can test it independently in isolation (i.e., as a white-box test) and determine that it is working as we expect. Since a typical function performs a smaller task than the complete program, it is easier to test than the entire program. Once we have tested and debugged each function in isolation, we will have an easier time getting the program to work when everything is integrated.
This modular design concept of building a program out of simple, pretested, working components is a fundamental concept in systems design. In subsequent chapters we will introduce the concept of a library. A library is a collection of pretested components that all programmers can use in writing their code. Modern programming practices are heavily oriented around the use of libraries because of the benefits inherent to modular design. We design not only software, but circuits, hardware, and various other layers of the computing system using a similar modular design philosophy.
15.5.3 Defensive Programming
All seasoned programmers have techniques to prevent bugs from creeping into their code. They construct their code in such a way that those errors that they suspect might affect the program are eliminated by design. That is, they program defensively. We provide a short list of general defensive programming techniques that you should adopt to avoid problems with the programs you write.
• Comment your code. Writing comments makes you think about the code you've written. Code documentation is not only a way to inform others about how your code works, but also is a process that makes you reflect on and reconsider your code. During this process you might discover that you forgot a special case or operating condition that will ultimately break your code.
• Adopt a consistent coding style. For instance, aligning opening and closing braces will let you identify simple semantic errors associated with missing braces. Along these lines, also be consistent in variable naming. The name of a variable should convey some meaningful information about the value the variable contains.
• Avoid assumptions. It is tempting to make simple, innocent assumptions when writing code, but these can ultimately lead to broken code. For example, in writing a function, we might assume that the input parameter will always be within a certain range. If this assumption is not grounded in the program's specification, then the possibility for an error has been introduced. Write code that is free of such assumptions, or at least use assertions and spot checks to indicate when the assumptions do not hold.
• Avoid global variables. While some experienced programmers rely heavily on global variables, many software engineers advocate avoiding them whenever possible. Global variables can make some programming tasks easier. However, they often make code more difficult to understand and extend and, when a bug is detected, harder to analyze.
• Rely on the compiler. Most good compilers have an option to carefully check your program for suspicious code (for example, an uninitialized variable) or commonly misapplied code constructs (for example, using the assignment operator = instead of the equality operator ==). While these checks are not thorough, they do help identify some commonly made programming mistakes. If you use the gcc compiler, use gcc -Wall to enable its most commonly useful warning messages.
The defensive techniques mentioned here arc particular to the programming concepts we've already discussed. In subsequent chapters, after we introduce new programming concepts, we also discuss how to use defensive techniques when writing programs that use them.
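To make the "avoid assumptions" advice concrete, here is a minimal sketch of a spot check and an assertion guarding a range assumption. The function names IsValidScore and LetterGrade, the 0 to 100 range, and the grade cutoffs are all hypothetical choices made for illustration; they do not come from any program in this chapter.

```c
#include <assert.h>

#define MIN_SCORE 0     /* assumed lower bound, for illustration only */
#define MAX_SCORE 100   /* assumed upper bound, for illustration only */

/* Spot check: returns 1 if score lies in the assumed range, 0 otherwise */
int IsValidScore(int score)
{
   return (score >= MIN_SCORE && score <= MAX_SCORE);
}

/* Converts a score to a letter grade. The assertion documents the   */
/* range assumption and stops the program if a caller violates it.   */
char LetterGrade(int score)
{
   assert(IsValidScore(score));

   if (score >= 90)
      return 'A';
   else if (score >= 80)
      return 'B';
   else if (score >= 70)
      return 'C';
   else
      return 'F';
}
```

A caller that cannot guarantee the range can call IsValidScore first and handle the bad value itself. Note that assert checks are removed when the program is compiled with NDEBUG defined, so they are a development aid rather than a substitute for input validation.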
15.6 Summary
In this chapter, we presented methodologies for finding and fixing bugs within your code. Modern systems are increasingly reliant on software, and modern software is often very complex. In order to prevent software bugs from often rendering our cell phones unusable or from occasionally causing airplanes to
crash, it is important that software tightly conform to its specifications. The key concepts that we covered in this chapter are:
• Testing. Finding bugs in code is not easy, particularly when the program is large. Software engineers use systematic testing to find errors in software. Black-box testing is done to validate that the behavior of a program conforms to specification. White-box testing targets the structure of a program and provides some assurance that every line of code has undergone some level of testing.
• Debugging. Debugging an error requires the ability to take the available information and deduce the source of the error. While ad hoc techniques can provide us with a little additional information about the bug, the source-level debugger is the software engineering tool of choice for most debugging tasks. Source-level debuggers allow a programmer to execute a program in a controlled environment and examine various values and states within the program during execution.
• Programming for correctness. Experienced programmers try to avoid bugs even before the first line of code is written. Often, the specification of the program is the source of bugs, and nailing down loose ends will help eliminate bugs after the code has been written. Modular design involves writing a larger program out of simple pretested functions and helps reduce the difficulty in testing a large program. Following a defensive programming style helps reduce situations that lead to buggy code.
Exercises
15.1 The following programs each have a single error that prevents them from operating as specified. With as few changes as possible, correct the programs. They all should output the sum of the integers from 1 to 10, inclusive.
a. #include <stdio.h>
   int main()
   {
      int i = 1;
      int sum = 0;

      while (i < 11)
         sum = sum + i;
         ++i;

      printf("%d\n", sum);
   }
b. #include <stdio.h>
   int main()
   {
      int i;
      int sum = 0;

      for (i = 0; i >= 10; ++i)
         sum = sum + i;

      printf("%d\n", sum);
   }
c. #include <stdio.h>
   int main()
   {
      int i = 0;
      int sum = 0;

      while (i <= 11)
         sum = sum + i++;

      printf("%d\n", sum);
   }
d. #include <stdio.h>
   int main()
   {
      int i = 0;
      int sum = 0;

      for (i = 0; i <= 10;)
         sum = sum + ++i;

      printf("%d\n", sum);
   }
15.2 The following program fragments have syntax errors and therefore will not compile. Assume that all variables have been properly declared. Fix the errors so that the fragments will not cause compiler errors.
a. i = 0;
   j = 0;
   while (i < 5)
   {
      j = j + 1;
      i = j >> 1
   }
b. if (cont == 0)
      a = 2;
      b = 3;
   else
      a = -2;
      b = -3;
c. #define LIMIT 5;

   if (LIMIT)
      printf("True");
   else
      printf("False");
15.3 The following C code was written to find the minimum of a set of positive integers that a user enters from the keyboard. The user signifies the end of the set by entering the value -1. Once all the numbers have been entered and processed, the program outputs the minimum. However, the code contains an error. Identify and suggest ways to fix the error. Use a source-level debugger, if needed, to find it.
#include <stdio.h>
int main()
{
   int smallestNumber = 0;
   int nextInput;

   /* Get the first input number */
   scanf("%d", &nextInput);

   /* Keep reading inputs until user enters -1 */
   while (nextInput != -1) {
      if (nextInput < smallestNumber)
         smallestNumber = nextInput;
      scanf("%d", &nextInput);
   }

   printf("The smallest number is %d\n", smallestNumber);
}
15.4 The following program reads in a line of characters from the keyboard and echoes only the alphabetic, numeric, and space characters. For example, if the input were "Let's meet at 6:00pm.", the output should be: "Lets meet at 600pm". The program does not work as specified. Fix it.
#include <stdio.h>
int main()
{
   char echo = '0';

   while (echo != '\n') {
      scanf("%c", &echo);
      if ((echo > 'a' || echo < 'z') &&
          (echo > 'A' || echo < 'Z'))
         printf("%c", echo);
   }
}
15.5 Use a source-level debugger to monitor the execution of the following code:
#include <stdio.h>
int IsDivisibleBy(int dividend, int divisor);

int main()
{
   int i;   /* Iteration variable */
   int j;   /* Iteration variable */
   int f;   /* The number of factors of a number */

   for (i = 2; i < 1000; i++) {
      f = 0;
      for (j = 2; j < i; j++) {
         if (IsDivisibleBy(i, j))
            f++;
      }
      printf("The number %d has %d factors\n", i, f);
   }
}

int IsDivisibleBy(int dividend, int divisor)
{
   if (dividend % divisor == 0)
      return 1;
   else
      return 0;
}
a. Set a breakpoint at the beginning of function IsDivisibleBy and examine the parameter values for the first 10 calls. What are they?
b. What is the value of f after the inner for loop ends and the value of i equals 660?
c. Can this program be written more efficiently? Hint: Monitor the value of the arguments when the return value of IsDivisibleBy is 1.
15.6 Using a source-level debugger, determine for what values of parameters the function Mystery returns a zero.
#include <stdio.h>
int Mystery(int a, int b, int c);

int main()
{
   int i;         /* Iteration variable */
   int j;         /* Iteration variable */
   int k;         /* Iteration variable */
   int sum = 0;   /* running sum of Mystery */

   for (i = 100; i > 0; i--) {
      for (j = 1; j < i; j++) {
         for (k = j; k < 100; k++)
            sum = sum + Mystery(i, j, k);
      }
   }
}

int Mystery(int a, int b, int c)
{
   int out;

   out = 3*a*a + 7*a - 5*b*b + 4*b + 5*c;
   return out;
}
15.7 The following program manages flight reservations for a small airline that has only one plane that has SEATS number of seats for passengers. This program processes ticket requests from the airline's website. The command R requests a reservation. If there is a seat available, the reservation is approved. If there are no seats, the reservation is denied. Subsequently, a passenger with a reservation can purchase a ticket using the P command. This means that for every P command, there must be a preceding R command; however, not every R will materialize into a purchased ticket. The program ends when the X command is entered. Following is the program, but it contains serious design errors. Identify the errors. Propose and implement a correct solution.
#include <stdio.h>
int main()
{
   int seatsAvailable = SEATS;
   char request = '0';

   while (request != 'X') {
      scanf("%c", &request);

      if (request == 'R') {
         if (seatsAvailable)
            printf("Reservation Approved!\n");
         else
            printf("Sorry, flight fully booked.\n");
      }

      if (request == 'P') {
         seatsAvailable--;
         printf("Ticket purchased!\n");
      }
   }

   printf("Done! %d seats not sold\n", seatsAvailable);
}
chapter
16 Pointers and Arrays
16.1 Introduction
In this chapter, we introduce (actually, reintroduce) two simple but powerful programming constructs: pointers and arrays. We used pointers and arrays when writing LC-3 assembly code. Now, we examine them in the context of C.
A pointer is simply the address of a memory object, such as a variable. With pointers, we can indirectly access these objects, which provides some very useful capabilities. For example, with pointers, we can create functions that modify the arguments passed by the caller. With pointers, we can create sophisticated data organizations that grow and shrink (like the run-time stack) during a program’s execution.
An array is a list of data arranged sequentially in memory. For example, in a few of the LC-3 examples from the first half of the book, we represented a file of characters as a sequence of characters arranged sequentially in memory. This sequential arrangement of characters is known as an array of characters. To access a particular item in an array, we need to specify which element we want. As we'll see, an expression like a[4] will access the fifth element in the array named a; it is the fifth element because we start numbering the array at element 0. Arrays are useful because they allow us to conveniently process groups of data such as vectors, matrices, lists, and character strings, which are naturally representative of certain objects in the real world.
16.2 Pointers
We begin our discussion of pointers with a classic example of their utility. In the C program in Figure 16.1, the function Swap is designed to switch the value of its two arguments. The function Swap is called from main with the arguments valueA, which in this case equals 3, and valueB, which equals 4. Once Swap returns control to main, we expect valueA and valueB to have their values swapped. However, compile and execute the code and you will notice that the arguments passed to Swap remain the same.
Let's examine the run-time stack during the execution of Swap to analyze why. Figure 16.2 shows the state of the run-time stack just prior to the completion of the function, just after the statement on line 25 has executed but before control returns to function main. Notice that the function Swap has modified the local copies of the parameters firstVal and secondVal within its own activation record. When Swap finishes and control returns to main, these modified values are lost when the activation record for Swap is popped off the stack. The values from main's perspective have not been swapped. We have a buggy program.
In C, arguments are always passed from the caller function to the callee by value. C evaluates each argument that appears in a function call as an expression and pushes the value of the expression onto the run-time stack in order to pass them to the function being called. For Swap to modify the arguments that the caller passes to it, it must have access to the caller function's activation record; that is, it must access the locations at which the arguments are stored in order to modify their values. The function Swap needs the addresses of valueA and valueB in main in order to change their values. As we shall see in the next few sections, pointers and their associated operators enable this to happen.
 1  #include <stdio.h>
 2
 3  void Swap(int firstVal, int secondVal);
 4
 5  int main()
 6  {
 7     int valueA = 3;
 8     int valueB = 4;
 9
10     printf("Before Swap ");
11     printf("valueA = %d and valueB = %d\n", valueA, valueB);
12
13     Swap(valueA, valueB);
14
15     printf("After Swap ");
16     printf("valueA = %d and valueB = %d\n", valueA, valueB);
17  }
18
19  void Swap(int firstVal, int secondVal)
20  {
21     int tempVal;   /* Holds firstVal when swapping */
22
23     tempVal = firstVal;
24     firstVal = secondVal;
25     secondVal = tempVal;
26  }
Figure 16.1 The function Swap attempts to swap the values of its two parameters
[Figure 16.2: diagram of the run-time stack. The activation record for Swap holds tempVal = 3, firstVal = 4, and secondVal = 3, followed by main's frame pointer, the return address in main, and the return value to main. The activation record for main holds valueB = 4 and valueA = 3.]
Figure 16.2 A snapshot of the run-time stack when the function Swap is about to return control to main
16.2.1 Declaring Pointer Variables
A pointer variable contains the address of a memory object, such as a variable. A pointer is said to point to the variable whose address it contains. Associated with a pointer variable is the type of object to which it points. So, for instance, an integer pointer variable points to an integer variable. To declare a pointer variable in C, we use the following syntax:
int *ptr;
Here we have declared a variable named ptr that points to an integer. The asterisk (*) indicates that the identifier that follows is a pointer variable. C programmers will often say that ptr is of type int star. Similarly, we can declare
char *cp;
double *dp;
The variable cp points to a character and dp points to a double-precision floating point number. Pointer variables are initialized in a manner similar to all other variables. If a pointer variable is declared as a local variable, it will not be initialized automatically.
The syntax of declaring a pointer variable using * may seem a bit odd at first, but once we have gone through the pointer operators, the rationale behind the syntax will be more clear.
16.2.2 Pointer Operators
C has two operators for pointer-related manipulations, the address operator & and the indirection operator *.
The Address Operator &
The address operator, whose symbol is an ampersand, &, generates the memory address of its operand, which must be a memory object such as a variable. In the following code sequence, the pointer variable ptr will point to the integer variable object. The expression on the right-hand side of the second assignment statement generates the memory address of object.
int object;
int *ptr;

object = 4;
ptr = &object;
Let's examine the LC-3 code for this sequence. Both declared variables are locals and are allocated on the stack. Recall that R5, the base pointer, points to the first declared local variable, or object in this case.
AND R0, R0, #0    ; Clear R0
ADD R0, R0, #4    ; R0 = 4
STR R0, R5, #0    ; object = 4;

ADD R0, R5, #0    ; Generate memory address of object
STR R0, R5, #-1   ; ptr = &object;
Figure 16.3 shows the activation record of the function containing this code after the statement ptr = &object; has executed. In order to make things more concrete, each memory location is labeled with an address, which we've arbitrarily selected to be in the xEFF0 range. The base pointer R5 currently points to xEFF2. Notice that object contains the integer value 4 and ptr contains the memory address of object.
[Figure 16.3: diagram of memory locations xEFF0 through xEFF5. Location xEFF1 holds ptr, which contains xEFF2; location xEFF2 holds object, which contains 4; R5 points to xEFF2.]
Figure 16.3 The run-time stack frame containing object and ptr after the statement ptr = &object has executed
The Indirection Operator *
The second pointer operator is called the indirection, or dereference, operator, and its symbol is the asterisk, * (pronounced star in this context). This operator allows us to indirectly manipulate the value of a memory object. For example, the expression *ptr refers to the value pointed to by the pointer variable ptr. Recall the previous example: *ptr refers to the value stored in variable object. Here, *ptr and object can be used interchangeably. Adding to the previous C code example,
int object;
int *ptr;

object = 4;
ptr = &object;
*ptr = *ptr + 1;
Essentially, *ptr = *ptr + 1; is another way of saying object = object + 1;. Just as with other types of variables we have seen, the *ptr means different things depending on which side of the assignment operator it appears on. On the right-hand side of the assignment operator, it refers to the value that appears at that location (in this case the value 4). On the left-hand side, it specifies the location that gets modified (in this case, the address of object). Let's examine the LC-3 code for the last statement in the preceding code.
LDR R0, R5, #-1   ; R0 contains the value of ptr
LDR R1, R0, #0    ; R1 <- *ptr
ADD R1, R1, #1    ; *ptr + 1
STR R1, R0, #0    ; *ptr = *ptr + 1;
Notice that this code is different from what would get generated if the final C statement had been object = object + 1;. With the pointer dereference, the compiler generates two LDR instructions for the indirection operator on the right-hand side, one to load the memory address contained in ptr and another to get the value stored at that address. With the dereference on the left-hand side, the compiler generates a STR R1, R0, #0. Had the statement been object = *ptr + 1;, the compiler would have generated STR R1, R5, #0.
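The two pointer operators can be seen working together in the short sketch below, which wraps the statements discussed above in a function so that the result can be inspected. The function name IncrementViaPointer is hypothetical and is introduced only for this illustration.

```c
#include <stdio.h>

/* Adds 1 to object through a pointer and returns the new value of object */
int IncrementViaPointer(void)
{
   int object;
   int *ptr;

   object = 4;
   ptr = &object;      /* ptr now holds the address of object */
   *ptr = *ptr + 1;    /* Same effect as object = object + 1; */

   /* Both names refer to the same memory location */
   printf("object = %d, *ptr = %d\n", object, *ptr);

   return object;
}
```

Because ptr holds the address of object, the assignment through *ptr changes object itself, so the function returns 5 and both values printed are 5.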
16.2.3 Passing a Reference Using Pointers
Using the address and indirection operators, we can repair the Swap function from Figure 16.1 that did not quite accomplish the swap of its two input parameters. Figure 16.4 lists the same program with a revised version of Swap called NewSwap.
The first modification we've made is that the parameters of NewSwap are no longer integers but are now pointers to integers (int *). These two parameters are the memory addresses of the two variables that are to be swapped. Within the function body of NewSwap, we use the indirection operator * to obtain the values that these pointers point to.
Now when we call NewSwap from main, we need to supply the memory addresses for the two variables we want swapped, rather than the values of the variables as we did in the previous version of the code. For this, the & operator does the trick. Figure 16.5 shows the run-time stack when various statements of the function NewSwap are executed. The three subfigures (a through c) correspond to the run-time stack after lines 23, 24, and 25 execute.
 1  #include <stdio.h>
 2
 3  void NewSwap(int *firstVal, int *secondVal);
 4
 5  int main()
 6  {
 7     int valueA = 3;
 8     int valueB = 4;
 9
10     printf("Before Swap ");
11     printf("valueA = %d and valueB = %d\n", valueA, valueB);
12
13     NewSwap(&valueA, &valueB);
14
15     printf("After Swap ");
16     printf("valueA = %d and valueB = %d\n", valueA, valueB);
17  }
18
19  void NewSwap(int *firstVal, int *secondVal)
20  {
21     int tempVal;   /* Holds firstVal when swapping */
22
23     tempVal = *firstVal;
24     *firstVal = *secondVal;
25     *secondVal = tempVal;
26  }
Figure 16.4 The function NewSwap swaps the values of its two parameters
[Figure 16.5: three diagrams of the run-time stack, locations xEFF3 through xEFFA. In all three, tempVal (xEFF3) = 3, firstVal = xEFFA, and secondVal = xEFF9. (a) valueB (xEFF9) = 4, valueA (xEFFA) = 3. (b) valueB = 4, valueA = 4. (c) valueB = 3, valueA = 4.]
Figure 16.5 Snapshots of the run-time stack when the function NewSwap executes the statements in (a) line 23, (b) line 24, (c) line 25.
By design, C passes information from the caller function to the callee by value: that is, each argument expression in the call statement is evaluated, and the resulting value is passed to the callee via the run-time stack. However, in NewSwap we created a call by reference for the two arguments by using the address operator &. When an argument is passed as a reference, its address is passed to the callee function; for this to be valid, the argument must be a variable or other memory object (i.e., it must have an address). The callee function then can use the indirection operator * to access (and modify) the original value of the object.
16.2.4 Null Pointers
Sometimes it is convenient for us to say that a pointer points to nothing. Why such a concept is useful will be eminently clear to you when we discuss dynamic data structures such as linked lists in Chapter 19. For now, let us say that a pointer that points to nothing is a null pointer. In C, we make this designation with the following assignment:
int *ptr;
ptr = NULL;
Here, we are assigning the value of NULL to the pointer variable ptr. In C, NULL is a specially defined preprocessor macro that contains a value that no pointer should ever hold unless it is null. For example, NULL might equal 0 on a particular system because no valid memory object can exist at location 0.
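One common use of a null pointer is as a signal that there is nothing to dereference. The sketch below shows a function that tests for NULL before applying the indirection operator. The function name ValueOrDefault and its fallback parameter are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>   /* One of the standard headers that defines NULL */

/* Returns the value pointed to by ptr, or fallback if ptr is null */
int ValueOrDefault(int *ptr, int fallback)
{
   if (ptr == NULL)
      return fallback;   /* Nothing to dereference */

   return *ptr;
}
```

Checking a pointer against NULL before dereferencing it is another defensive programming habit: dereferencing a null pointer is an error, and on many systems it causes the program to terminate.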
16.2.5 Demystifying the Syntax
It is now time to revisit some notation that we introduced in Chapter 11. Now that we know how to pass a reference, let's reexamine the I/O library function scanf:
scanf("%d", &input);
Since function scanf needs to update the variable input with the decimal value read from the keyboard, scanf needs the address of input and not its value. Thus, the address operator & is required. If we omit the address operator, the program terminates with an error. Can you come up with a plausible reason why this happens? Why is it not possible for scanf to work correctly without the use of a reference?
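The same convention applies to the other members of the scanf family. As a sketch, the function below uses the standard library function sscanf, which behaves like scanf but reads from a character string instead of the keyboard; we substitute it here only so that the example does not depend on keyboard input. The function name ParseInteger is hypothetical.

```c
#include <stdio.h>

/* Reads one decimal integer from text into the location dest points to. */
/* Returns 1 on success and 0 if no integer could be read.               */
int ParseInteger(const char *text, int *dest)
{
   /* dest is already an address, so no & is needed here */
   return sscanf(text, "%d", dest) == 1;
}
```

A caller with a variable declared as int input; would write ParseInteger("42", &input), supplying the address of input just as it would for scanf. The return value of sscanf (and of scanf) is the number of items successfully assigned, which gives the caller a way to detect bad input.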
Before we complete our introduction to pointers, let’s attempt to make sense of the pointer declaration syntax. To declare a pointer variable, we use a declaration of the following form:
type *ptr;
where type can be any of the predefined (or programmer-defined) types such as int, char, double, and so forth. The name ptr is simply any legal variable identifier. With this declaration, we are declaring a variable that, when the * (dereference) operator is applied to it, generates a variable of type type. That is, *ptr is of type type.
We can also declare functions to return a pointer type (why we would want to do so will be more apparent in later chapters). For example, we can declare a function using a declaration of the form int *MaxSwap();.
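As a small sketch of a function that returns a pointer type, consider the following. The function LargerOf is hypothetical (it is not the MaxSwap function named above, whose body is not given here); it simply returns the address of whichever of its two arguments points to the larger value.

```c
/* Returns a pointer to whichever of the two integers holds the larger value */
int *LargerOf(int *first, int *second)
{
   if (*first >= *second)
      return first;
   else
      return second;
}
```

Because the return value is an address, the caller can apply the indirection operator to it directly: with int a = 3; and int b = 4;, the statement *LargerOf(&a, &b) = 10; stores 10 into b. Note that the function returns one of the addresses it was given; a function should not return the address of one of its own local variables, since that storage is popped off the run-time stack when the function returns.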
As with all other operators, the address and indirection operators are evaluated according to the C precedence and associativity rules. The precedence and associativity of these and all other operators are listed in Table 12.5. Notice that both of the pointer operators have very high precedence.
16.2.6 An Example Problem Involving Pointers
Let's examine an example problem involving pointers. Say we want to develop a program that calculates the quotient and remainder given an integer dividend and integer divisor. That is, the program will calculate dividend / divisor and dividend % divisor where both values are integers. The structure of this program is very simple and requires only sequential constructs; that is, iteration is not required. The twist, however, is that we want the calculation of quotient and remainder to be performed by a single C function.
We can easily construct a function to generate a single output value (say, quotient) that we can pass back to the caller using the return value mechanism. A function that calculates only the quotient, for example, could consist of the single statement return dividend / divisor;. To provide the caller with multiple values, however, we will make use of the call by reference mechanism using pointer variables.
The code in Figure 16.6 contains a function that does just so. The function IntDivide takes four parameters, two of which are integers and two of which are pointers to integers. The function divides the first parameter x by the second parameter y. The integer portion of the result is assigned to the memory location pointed to by quoPtr, and the integer remainder is assigned to the memory location pointed to by remPtr.
#include <stdio.h>

int IntDivide(int x, int y, int *quoPtr, int *remPtr);

int main()
{
   int dividend;    /* The number to be divided       */
   int divisor;     /* The number to divide by        */
   int quotient;    /* Integer result of division     */
   int remainder;   /* Integer remainder of division  */
   int error;       /* Did something go wrong?        */

   printf("Input dividend: ");
   scanf("%d", &dividend);
   printf("Input divisor: ");
   scanf("%d", &divisor);

   error = IntDivide(dividend, divisor, &quotient, &remainder);

   if (!error)      /* !error indicates no error */
      printf("Answer: %d remainder %d\n", quotient, remainder);
   else
      printf("IntDivide failed.\n");
}

int IntDivide(int x, int y, int *quoPtr, int *remPtr)
{
   if (y != 0) {
      *quoPtr = x / y;   /* Modify *quoPtr */
      *remPtr = x % y;   /* Modify *remPtr */
      return 0;
   }
   else
      return -1;
}
Figure 16.6 The function IntDivide calculates the integer portion and remainder of an integer divide; it returns a -1 if the divisor is 0
Notice that the function IntDivide also returns a value to indicate its status: It returns a -1 if the divisor is zero, indicating to the caller that an error has occurred. It returns a zero otherwise, indicating to the caller that the computation proceeded without a hitch. The function main, upon return, checks the return value to determine if the values in quotient and remainder are correct. Using the return value to signal a problem during a function call between caller and callee is an excellent defensive programming practice for conveying error conditions across a call.
16.3 Arrays
Consider a program that keeps track of the final exam scores for each of the 50 students in a computer engineering course. The most convenient way to store this data would be to declare a single object, say examScore, in which we can store 50 different integer values. We can access a particular exam score within this object using an index that is an offset from the beginning of the object. For example, examScore[32] provides the exam score for the 33rd student (the very first student's score is stored in examScore[0]). The object examScore in this example is an array of integers. An array is a collection of similar data items that are stored sequentially in memory. Specifically, all the elements in the array are of the same type (e.g., int, char, etc.).
Arrays are most useful when the data upon which the program operates is naturally expressed as a contiguous sequence of values. Because a lot of real- world data falls into this category (such as exam scores for students in a course), arrays are incredibly useful data structures. For instance, if we wanted to write a program to take a sequence of 100 numbers entered from the keyboard and sort them into ascending order, then an array would be the natural choice for storing these numbers in memory. The program would be almost impossible to write using the simple variables we have been using thus far.
16.3.1 Declaring and Using Arrays
First, let’s examine how to declare an array in a C program. Like all other variables, arrays must have a type associated with them. The type indicates the properties of the values stored in the array. Following is a declaration for an array of 10 integers:
int grid[10];
The keyword int indicates that we are declaring something of type integer. The name of the array is grid. The brackets indicate we are declaring an array and the 10 indicates that the array is to contain 10 integers, all of which will be sequentially located in memory. Figure 16.7 shows a pictorial representation of how grid is allocated. The first element, grid[0], is allocated in the lowest memory address and the last element, grid[9], in the highest address. If the array grid were a local variable, then its memory space would be allocated on the run-time stack.
[Figure 16.7: diagram of memory showing the ten elements grid[0] through grid[9] in consecutive locations, with grid[0] at the lowest address.]
Figure 16.7 The array grid allocated in memory
Let's examine how to access different values in this array. Notice in Figure 16.7 that the array's first element is actually numbered 0, which means the last element is numbered 9. To access a particular element, we provide an index within brackets. For example,
grid[6] = grid[3] + 1;
The statement reads the value stored in the fourth (remember, we start numbering with 0) element of grid, adds 1 to it, and stores the result into the seventh element of grid. Let's look at the LC-3 code for this example. Let's say that grid is the only local variable allocated on the run-time stack. This means that the base pointer R5 will point to grid[9].
ADD R0, R5, #-9   ; Put the base address of grid into R0
LDR R1, R0, #3    ; R1 <-- grid[3]
ADD R1, R1, #1    ; R1 <-- grid[3] + 1
STR R1, R0, #6    ; grid[6] = grid[3] + 1;
Notice that the first instruction calculates the base address of the array, which is the address of grid[0], and puts it into R0. The base address of an array in general is the address of the first element of the array. We can access any element in the array by adding the index of the desired element to the base address.
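The base-address-plus-index relationship can also be checked from C itself. The sketch below compares the address of each element, obtained with the & operator, against the address of grid[0] plus the index. The function name AddressesMatch and the macro GRID_SIZE are hypothetical, introduced only for this illustration. Adding an integer to an address in C counts in elements rather than in memory locations; on the LC-3, where an int occupies one memory location, the two are the same.

```c
#define GRID_SIZE 10   /* Matches the 10-element grid used in the text */

/* Returns 1 if, for every index, the address of grid[index] equals */
/* the base address of grid plus index elements; returns 0 if not.  */
int AddressesMatch(void)
{
   int grid[GRID_SIZE];
   int index;

   for (index = 0; index < GRID_SIZE; index++) {
      if (&grid[index] != &grid[0] + index)
         return 0;
   }

   return 1;
}
```

The function only takes addresses and never reads the elements, so it does not matter that grid is uninitialized. It returns 1, reflecting the fact that array elements are laid out sequentially starting at the base address.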
The power of arrays comes from the fact that an array's index can be any legal C expression of integer type. The following example demonstrates:
grid[x+1] = grid[x] + 2;
Let's look at the LC-3 code for this statement. Assume x is another local variable allocated on the run-time stack directly on top of the array grid.
LDR R0, R5, #-10  ; Load the value of x
ADD R1, R5, #-9   ; Put the base address of grid into R1
ADD R1, R0, R1    ; Calculate address of grid[x]
LDR R2, R1, #0    ; R2 <-- grid[x]
ADD R2, R2, #2    ; R2 <-- grid[x] + 2

LDR R0, R5, #-10  ; Load the value of x
ADD R0, R0, #1    ; R0 <-- x + 1
ADD R1, R5, #-9   ; Put the base address of grid into R1
ADD R1, R0, R1    ; Calculate address of grid[x+1]
STR R2, R1, #0    ; grid[x+1] = grid[x] + 2;
16.3.2 Examples Using Arrays
We start off with a simple C program that adds two arrays together by adding the corresponding elements from each array to form the sum. Each array represents a list of exam scores for students in a course. Each array contains an element for each student's score. To generate the cumulative points for each student, we effectively want to perform Total[i] = Exam1[i] + Exam2[i]. Figure 16.8 contains the C code to read in two 10-element integer arrays, add them together into another 10-element array, and print out the sum.
#include <stdio.h>
#define NUM_STUDENTS 10

int main()
{
   int i;
   int Exam1[NUM_STUDENTS];
   int Exam2[NUM_STUDENTS];
   int Total[NUM_STUDENTS];

   /* Input Exam 1 scores */
   for (i = 0; i < NUM_STUDENTS; i++) {
      printf("Input Exam 1 score for student %d : ", i);
      scanf("%d", &Exam1[i]);
   }
   printf("\n");

   /* Input Exam 2 scores */
   for (i = 0; i < NUM_STUDENTS; i++) {
      printf("Input Exam 2 score for student %d : ", i);
      scanf("%d", &Exam2[i]);
   }
   printf("\n");

   /* Calculate Total Points */
   for (i = 0; i < NUM_STUDENTS; i++) {
      Total[i] = Exam1[i] + Exam2[i];
   }

   /* Output the Total Points */
   for (i = 0; i < NUM_STUDENTS; i++)
      printf("Total for Student %d = %d\n", i, Total[i]);
}
Figure 16.8 A C program that calculates the sum of two 10-element arrays
A style note: Notice the use of the preprocessor macro NUM_STUDENTS to represent a constant value of the size of the input set. This is a common use for preprocessor macros, which are usually found at the beginning of the source file (or within C header files). Now, if we want to increase the size of the array, for example if the student enrollment changes, we simply change the definition of the macro (one change) and recompile the program. If we did not use the macro, changing the array size would require changes to the code in multiple places. The changes could be potentially difficult to track down, and forgetting to do one would likely result in a program that did not work correctly. Using preprocessor macros for the size of an array is good programming practice.
Now onto a slightly more complex example involving arrays. Figure 16.9 lists a C program that reads in a sequence of decimal numbers (in total MAX_NUMS of them) from the keyboard and determines the number of times each input number is repeated within the sequence. The program then prints out each number, along with the number of times it repeats.
In this program, we use two arrays, numbers and repeats. Both are declared to contain MAX_NUMS integer values. The array n u m b e r s stores the input sequence. The array r e p e a t s is calculated by the program to contain the number o f times the corresponding element in numbers is repeated in the input sequence. For example,ifnumbers [3J equals 115,andthereareatotaloffour 115sintheinput
#include <stdio.h>
#define MAX_NUMS 10

int main()
{
  int index;               /* Loop iteration variable    */
  int repIndex;            /* Loop variable for rep loop */
  int numbers[MAX_NUMS];   /* Original input numbers     */
  int repeats[MAX_NUMS];   /* Number of repeats          */

  /* Get input */
  printf("Enter %d numbers.\n", MAX_NUMS);
  for (index = 0; index < MAX_NUMS; index++) {
    printf("Input number %d : ", index);
    scanf("%d", &numbers[index]);
  }

  /* Scan through entire array, counting number of */
  /* repeats per element within the original array */
  for (index = 0; index < MAX_NUMS; index++) {
    repeats[index] = 0;
    for (repIndex = 0; repIndex < MAX_NUMS; repIndex++) {
      if (numbers[repIndex] == numbers[index])
        repeats[index]++;
    }
  }

  /* Print the results */
  for (index = 0; index < MAX_NUMS; index++)
    printf("Original number %d. Number of repeats %d\n",
           numbers[index], repeats[index]);
}
Figure 16.9 A C program that determines the number of repeated values in an array
sequence (i.e., there are four 115s in the array numbers), then repeats[3] will equal 4.
This program consists of three outer loops, of which the middle loop is actually a nested loop (see Section 13.3.2) consisting of two loops. The first and last for loops are simple loops that get keyboard input and produce program output.
The middle for loop contains the nested loop. This body of code determines how many copies of each element exist within the entire array. The outer loop iterates the variable index from 0 through MAX_NUMS - 1; we use index to scan through the array from the first element numbers[0] through the last element numbers[MAX_NUMS - 1]. The inner loop also iterates from 0 through MAX_NUMS - 1; we use this loop to scan through the array again, this time determining how many of the elements match the element selected by the outer loop (i.e., numbers[index]). Each time a copy is detected (i.e., numbers[repIndex] == numbers[index]), the corresponding element in the repeats array is incremented (i.e., repeats[index]++).
16.3.3 Arrays as Parameters
Passing arrays between functions is a useful thing because it allows us to create functions that operate on arrays. Say we want to create a set of functions that calculates the mean and median on an array of integers. We would need either (1) to pass the entire array of values from one function to another or (2) to pass a reference to the array. If the array contains a large number of elements, copying each element from one activation record onto another could be very costly in execution time. Fortunately, C naturally passes arrays by reference. Figure 16.10 is a C program that contains a function Average whose single parameter is an array of integers.
When calling the function Average from main, we pass to it the value associated with the array identifier numbers. Notice that here we are not using the standard notation involving brackets [ ] that we normally use for arrays. In C, an array's name refers to the address of the base element of the array. The name numbers is equivalent to &numbers[0]. The type of numbers is similar to int *. It is the address of a memory location containing an integer.
In using numbers as the argument to the function Average, we are causing the address of the array numbers to be pushed onto the stack and passed to the function Average. Within the function Average, the parameter inputValues is assigned the address of the array. Within Average we can access the elements of the original array using standard array notation. Figure 16.11 shows the run-time stack just prior to the execution of the return from Average (line 34 of the program).
Notice how the input parameter inputValues is specified in the declaration of the function Average. The brackets [ ] indicate to the compiler that the corresponding parameter will be the base address of an array of the specified type, in this case an array of integers.
Since arrays are passed by reference in C, any modifications to the array values made by the called function will be visible to the caller once control returns to it. How would we go about passing only a single element of an array by value? How about by reference?
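One possible answer to the two questions above is sketched here. The function names SquareByValue and DoubleInPlace are hypothetical and do not appear in the text's programs; they only illustrate the two calling styles for a single array element.

```c
/* Hypothetical helper: receives a COPY of one element (pass by value). */
int SquareByValue(int element)
{
  element = element * element;   /* changes only the local copy */
  return element;
}

/* Hypothetical helper: receives the ADDRESS of one element
   (pass by reference). */
void DoubleInPlace(int *element)
{
  *element = *element * 2;       /* changes the caller's array element */
}
```

With an array such as numbers, the call SquareByValue(numbers[3]) leaves numbers[3] untouched, whereas DoubleInPlace(&numbers[3]) modifies it, because only the second call hands the function the element's address.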
 1  #include <stdio.h>
 2  #define MAX_NUMS 10
 3
 4  int Average(int input_values[]);
 5
 6  int main()
 7  {
 8    int index;               /* Loop iteration variable */
 9    int mean;                /* Average of numbers      */
10    int numbers[MAX_NUMS];   /* Original input numbers  */
11
12    /* Get input */
13    printf("Enter %d numbers.\n", MAX_NUMS);
14    for (index = 0; index < MAX_NUMS; index++) {
15      printf("Input number %d : ", index);
16      scanf("%d", &numbers[index]);
17    }
18
19    mean = Average(numbers);
20
21    printf("The average of these numbers is %d\n",
22           mean);
23  }
24
25  int Average(int inputValues[])
26  {
27    int index;
28    int sum = 0;
29
30    for (index = 0; index < MAX_NUMS; index++) {
31      sum = sum + inputValues[index];
32    }
33
34    return (sum / MAX_NUMS);
35  }

Figure 16.10 An example of an array as a parameter to a function

          Address   Contents                  Name
  Activation record for Average
  R6 ->             489                       sum
  R5 ->             10                        index
          xEFEB     main's frame pointer
          xEFEC     Return address in main
          xEFED     Return value to main
          xEFEE     xEFEF                     inputValues
  Activation record for main
          xEFEF     9                         numbers[0]
          xEFF0     15                        numbers[1]
          xEFF1     14                        numbers[2]
          xEFF2     236                       numbers[3]
          xEFF3     3                         numbers[4]
          xEFF4     67                        numbers[5]
          xEFF5     48                        numbers[6]
          xEFF6     18                        numbers[7]
          xEFF7     23                        numbers[8]
          xEFF8     56                        numbers[9]
          xEFF9     ??                        mean
          xEFFA     10                        index

Figure 16.11 The run-time stack prior to the execution of the return from Average

16.3.4 Strings in C
A very common use for arrays in C is for strings. Strings are sequences of characters that represent text. Strings are simply character arrays, with each subsequent element containing the next character of the string. For example,

    char word[10];

declares an array that can store a string of up to 10 characters. Longer strings require a larger array. What if the string is shorter than 10 characters? In C and many other modern programming languages, the end of a string is denoted by the null character whose ASCII value is 0. It is a sentinel that identifies the end of the string. Such strings are also called null-terminated strings. '\0' is the special sequence that corresponds to the null character. Continuing with our previous declaration,

    char word[10];

    word[0] = 'H';
    word[1] = 'e';
    word[2] = 'l';
    word[3] = 'l';
    word[4] = 'o';
    word[5] = '\0';

    printf("%s", word);
Here, we are assigning each element of the array individually. The array will contain the string "Hello." Notice that the end-of-string character itself is a character that occupies an element of the array. Even though the array is declared for 10 elements, we must reserve one element for the null character, and therefore strings that are longer than nine characters cannot be stored in this array.
We have also used a new printf format specification %s in this example. This specification prints out a string of characters, starting with the character pointed to by the corresponding parameter and ending at the end-of-string character '\0'.
ANSI C compilers also allow strings to be initialized within their declarations. For instance, the preceding example can be rewritten to the following.
    char word[10] = "Hello";
    printf("%s", word);

Make note of two things here: First, character strings are distinguished from single characters with double quotes, " ". Single quotes are used for single characters, such as 'A'. Second, notice that the compiler automatically adds the null character to the end of the string.
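The two forms of initialization can be compared directly. The following is a minimal sketch; the function name SameHello is hypothetical, and strcmp comes from the standard header <string.h>, which the text has not yet introduced at this point.

```c
#include <string.h>

/* Hypothetical check: returns 1 if the initializer form and the
   element-by-element form produce the same null-terminated string. */
int SameHello(void)
{
  char word[10] = "Hello";   /* compiler appends '\0' after 'o' */
  char byHand[10];

  byHand[0] = 'H';
  byHand[1] = 'e';
  byHand[2] = 'l';
  byHand[3] = 'l';
  byHand[4] = 'o';
  byHand[5] = '\0';          /* here we add the terminator ourselves */

  return word[5] == '\0' && strcmp(word, byHand) == 0;
}
```

The check on word[5] is the interesting part: the sixth element holds the null character even though the initializer "Hello" spells out only five characters.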
Examples of Strings
Figure 16.12 contains a program that performs a very simple and useful primitive operation on strings: it calculates the length of a string. Since the size of the array that contains the string does not indicate the actual length of the string (it does, however, tell us the maximum length of the string), we need to examine the string itself to calculate its length.
The algorithm for determining string length is easy. Starting with the first element, we count the number of characters before we encounter the null character. The function StringLength in the code in Figure 16.12 performs this calculation.
Notice that we are using the format specification %s in the scanf statement. This specification causes scanf to read in a string of characters from the keyboard until the first white space character. In C, any space, tab, new line, carriage return, vertical tab, or form-feed character is considered white space. So if the user types

    Not like the brazen giant of Greek fame,
    With conquering limbs astride from land to land;
    (from The New Colossus, by Emma Lazarus)

only the word Not is stored in the array input. The remainder of the text line is reserved for subsequent scanf calls to read. So if we performed another scanf("%s", input), the word like will be stored in the array input. Notice that the white space is automatically discarded by this %s specification. We examine this I/O behavior more closely in Chapter 18 when we take a deeper look into I/O in C.
Notice that the maximum word size is 20 characters. What happens if the first word is longer? The scanf function has no information on the size of the array input and will keep storing characters to the array address it was provided until white space is encountered. So what then happens if the first word is longer
#include <stdio.h>
#define MAX_STRING 20

int StringLength(char string[]);

int main()
{
  char input[MAX_STRING];   /* Input string */
  int length = 0;

  printf("Input a word (less than 20 characters): ");
  scanf("%s", input);

  length = StringLength(input);
  printf("The word contains %d characters\n", length);
}

int StringLength(char string[])
{
  int index = 0;

  while (string[index] != '\0')
    index = index + 1;

  return index;
}

Figure 16.12 A program that calculates the length of a string
than 20 characters? Any local variables that are allocated after the array input in the function main will be overwritten. Draw out the activation record before and after the call to scanf to see why. In the exercises at the end of this chapter, we provide a problem where you need to modify this program in order to catch the scenario where the user enters a word longer than what fits into the input array.
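One common defensive technique, which Figure 16.12 does not use, is to give %s a maximum field width so that the library stops storing characters before the array is full. The sketch below is an illustration under assumptions: the helper name ReadFirstWord is hypothetical, and it uses sscanf (which reads from a string rather than the keyboard) only so that the behavior is easy to try. The same "%19s" format can be passed to scanf.

```c
#include <stdio.h>

#define MAX_STRING 20

/* Hypothetical helper: copies the first word of line into input,
   storing at most MAX_STRING - 1 characters plus the terminating '\0'.
   Returns 1 if a word was read, 0 otherwise. */
int ReadFirstWord(const char *line, char input[MAX_STRING])
{
  /* The width must be written as a literal: 19 is MAX_STRING - 1. */
  return sscanf(line, "%19s", input) == 1;
}
```

With this format, a 26-character word is cut off after its first 19 characters, and the variables allocated near input are left alone.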
Let's examine a slightly more complex example that uses the StringLength function from the previous code example. In this example, listed in Figure 16.13, we read an input string from the keyboard using scanf, then call a function to reverse the string. The reversed string is then displayed on the output device.
The function Reverse performs two tasks in order to reverse the string properly. First it determines the length of the string to reverse using the StringLength function from the previous code example. Then it performs the reversal by swapping the first character with the last, the second character with the second to last, the third character with the third to last, and so on.
To perform the swap, it uses a modified version of the NewSwap function from Figure 16.4. The reversal loop calls the function CharSwap on pairs of characters within the string. First, CharSwap is called on the first and last character, then on the second and second to last character, and so forth.
#include <stdio.h>
#define MAX_STRING 20

int StringLength(char string[]);
void CharSwap(char *firstVal, char *secondVal);
void Reverse(char string[]);

int main()
{
  char input[MAX_STRING];   /* Input string */

  printf("Input a word (less than 20 characters): ");
  scanf("%s", input);

  Reverse(input);
  printf("The word reversed is %s.\n", input);
}

int StringLength(char string[])
{
  int index = 0;

  while (string[index] != '\0')
    index = index + 1;

  return index;
}

void CharSwap(char *firstVal, char *secondVal)
{
  char tempVal;   /* Temporary location for swapping */

  tempVal = *firstVal;
  *firstVal = *secondVal;
  *secondVal = tempVal;
}

void Reverse(char string[])
{
  int index;
  int length;

  length = StringLength(string);

  for (index = 0; index < (length / 2); index++)
    CharSwap(&string[index], &string[length - (index + 1)]);
}

Figure 16.13 A program that reverses a string

Table 16.1 Relationship Between Pointers and Arrays

  Pointer notation   Array name     Array notation
  cptr               word           &word[0]
  (cptr + n)         word + n       &word[n]
  *cptr              *word          word[0]
  *(cptr + n)        *(word + n)    word[n]

The C standard library provides many prewritten functions for strings. For example, functions to copy strings, merge strings together, compare them, or calculate their length can be found in the C standard library, and the declarations for these functions can be included via the <string.h> header file. More information on some of these string functions can be found in Appendix D.9.2.
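As a brief illustration of these library routines, the following sketch uses strcpy, strcat, and strlen from <string.h>. The helper name JoinWords is hypothetical; it assumes the caller supplies a result array large enough for both words, one space, and the terminating null character.

```c
#include <string.h>

/* Hypothetical helper: builds "<first> <second>" in result and returns
   the length of the combined string. The caller must make result big
   enough; this sketch performs no bounds checking of its own. */
int JoinWords(char result[], const char first[], const char second[])
{
  strcpy(result, first);     /* copy first into result        */
  strcat(result, " ");       /* append a single space         */
  strcat(result, second);    /* append second                 */
  return (int) strlen(result);
}
```

Calling JoinWords(result, "Hello", "World") with a 20-element result array leaves the string "Hello World" in result; strcmp can then be used to compare it against another string.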
16.3.5 The Relationship Between Arrays and Pointers in C
You might have noticed that there is a similarity between an array’s name and a pointer variable to an element of the same type as the array. For instance,
    char word[10];
    char *cptr;

    cptr = word;

is a legal, and sometimes useful, sequence of code. Here, we have assigned the pointer variable cptr to point to the base address of the array word. Because they are both pointers to characters, cptr and word can be used interchangeably. For example, we can access the fourth character within the string either by using word[3] or *(cptr + 3).
One difference between the two, though, is that cptr is a variable and can be reassigned. The array identifier word, on the other hand, cannot be. For example, the following statement is illegal: word = newArray. The identifier always points to a fixed spot in memory where the compiler has placed the array. Once it has been allocated, it cannot be moved.
Table 16.1 shows the equivalence of several expressions involving pointer and array notation. Rows in the table are expressions with the same meaning.
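The rows of the table can be checked mechanically. The sketch below is only an illustration: the function name RowsAgree is hypothetical, and the array contents are chosen arbitrarily.

```c
/* Hypothetical check: returns 1 if every row of Table 16.1 holds
   for the index n, where 0 <= n < 10. */
int RowsAgree(int n)
{
  char word[10] = "Hello";
  char *cptr;

  cptr = word;   /* cptr now holds the base address of word */

  return (cptr == word)               && (word == &word[0])       &&
         ((cptr + n) == (word + n))   && ((word + n) == &word[n]) &&
         (*cptr == *word)             && (*word == word[0])       &&
         (*(cptr + n) == *(word + n)) && (*(word + n) == word[n]);
}
```

The first two lines of the return expression compare addresses, and the last two compare the characters stored at those addresses.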
16.3.6 Problem Solving: Insertion Sort
With this initial exposure to arrays under our belt, we can now attempt an interesting and sizeable (and useful!) problem: we will write C code to sort an array of integers into ascending order. That is, the code arranges the array a[] such that a[0] <= a[1] <= a[2] <= ....
To accomplish this, we will use an algorithm for sorting called Insertion Sort. Sorting is an important primitive operation, and people in computing have devoted considerable time to understanding, analyzing, and refining the sorting process. As a result, there are many algorithms for sorting, and you will gain exposure to some basic techniques in subsequent computing courses. We use insertion sort
here because it parallels how we might sort items in the real world. It is quite straightforward.
Insertion sort is best described by an example. Say you want to sort your compact disc collection into alphabetical order by artist. If you were sorting your compact discs using insertion sort, you would split the CDs into two groups, the sorted group and the unsorted group. Initially, the sorted group would be empty as all your CDs would be yet unsorted. The sorting process proceeds by taking a CD from the unsorted group and inserting it into the proper position among the sorted CDs. For example, if the sorted group contained three CDs, one by John Coltrane, one by Charles Mingus, and one by Thelonious Monk, then inserting the Miles Davis CD would mean inserting it between the Coltrane CD and the Mingus CD. You keep doing this until all CDs in the unsorted group have been inserted into the sorted group. This is insertion sort.
How would we go about applying this same technique to sort an array of integers? Applying systematic decomposition to the preceding algorithm, we see that the core of the program involves iterating through the elements of the array, inserting each element into the proper spot in a new array where all items are in ascending order. This process continues until all elements of the original array have been inserted into the new array. Once done, the new array will contain the same elements as the first array, except in sorted order.
For this technique we basically need to represent two groups of items, the original unsorted elements and the sorted elements. And for this we could use two separate arrays. It turns out, however, that we can represent both groups of elements within the original array. Doing so results in code that requires less memory and is more compact, though slightly more complex upon first glance. The initial part of the array contains the sorted elements and the remainder of the array contains the unsorted elements. We pick the next unsorted item and insert it into the sorted part at the correct point. We keep doing this until we have gone through the entire array.
The actual InsertionSort routine (shown in Figure 16.14) contains a nested loop. The outer loop scans through all the unsorted items (analogous to going through the unsorted CDs, one by one). The inner loop scans through the already sorted items, scanning for the place at which to insert the new item. Once we detect an already sorted element that is larger than the one we are inserting, we insert the new element between the larger and the one before it.
Let's take a closer look by examining what happens during a pass of the insertion sort. Say we examine the insertion sort process (lines 33-43) when the variable unsorted is equal to 4. The array list contains the following 10 elements:
2 16 69 92 15 37 92 38 82 19
During this pass, the code inserts list[4], or 15, into the already sorted portion of the array, elements list[0] through list[3].
The inner loop iterates the variable sorted through the list of already sorted elements. It does this from the highest numbered element down to 0 (i.e., starting at 3 down to 0). Notice that the condition on the for loop terminates the loop once a list item that is not larger than the current item, 15, is found.
 1  #include <stdio.h>
 2  #define MAX_NUMS 10
 3
 4  void InsertionSort(int list[]);
 5
 6  int main()
 7  {
 8    int index;               /* Iteration variable            */
 9    int numbers[MAX_NUMS];   /* List of numbers to be sorted  */
10
11    /* Get input */
12    printf("Enter %d numbers.\n", MAX_NUMS);
13    for (index = 0; index < MAX_NUMS; index++) {
14      printf("Input number %d : ", index);
15      scanf("%d", &numbers[index]);
16    }
17
18    InsertionSort(numbers);  /* Call sorting routine          */
19
20    /* Print sorted list */
21    printf("\nThe input set, in ascending order:\n");
22    for (index = 0; index < MAX_NUMS; index++)
23      printf("%d\n", numbers[index]);
24  }
25
26  void InsertionSort(int list[])
27  {
28    int unsorted;            /* Index for unsorted list items */
29    int sorted;              /* Index for sorted items        */
30    int unsortedItem;        /* Current item to be sorted     */
31
32    /* This loop iterates from 1 thru MAX_NUMS */
33    for (unsorted = 1; unsorted < MAX_NUMS; unsorted++) {
34      unsortedItem = list[unsorted];
35
36      /* This loop iterates from unsorted thru 0, unless
37         we hit an element smaller than current item */
38      for (sorted = unsorted - 1;
39           (sorted >= 0) && (list[sorted] > unsortedItem);
40           sorted--)
41        list[sorted + 1] = list[sorted];
42
43      list[sorted + 1] = unsortedItem;  /* Insert item */
44    }
45  }

Figure 16.14 Insertion sort program
In each iteration of this inner loop (lines 38-41), an element in the sorted part of the array is copied to the next position in the array. In the first iteration, list[3] is copied to list[4]. So after the first iteration of the inner loop, the array list contains
2 16 69 92 92 37 92 38 82 19
Notice that we have overwritten 15 (list[4]). This is OK because we have a copy of its value in the variable unsortedItem (from line 34). The second iteration performs the same operation on list[2]. After the second iteration, list contains
2 16 69 69 92 37 92 38 82 19
After the third iteration, list contains
2 16 16 69 92 37 92 38 82 19
Now the for loop terminates because the evaluation condition is no longer true. More specifically, list[sorted] > unsortedItem is not true. The current sorted list item list[0], which is 2, is not larger than the current unsorted item unsortedItem, which is 15. The statement following the inner loop, list[sorted + 1] = unsortedItem; then executes. Now list contains the following, and the sorted part of the array contains one more element.
2 15 16 69 92 37 92 38 82 19
This process continues until all items have been sorted, meaning the outer loop has iterated through all elements of the array list.
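The single pass traced above can be isolated into its own function, which makes the walk-through easy to reproduce. The sketch below is a variation under assumptions, not the program of Figure 16.14: the names InsertOne and InsertionSortN are hypothetical, and the array length is passed as a parameter instead of using the MAX_NUMS macro.

```c
/* Hypothetical helper: performs one pass of insertion sort, inserting
   list[unsorted] into the already sorted portion list[0..unsorted-1]. */
void InsertOne(int list[], int unsorted)
{
  int sorted;                          /* Index for sorted items    */
  int unsortedItem = list[unsorted];   /* Current item to be sorted */

  /* Shift every larger sorted item one position to the right */
  for (sorted = unsorted - 1;
       (sorted >= 0) && (list[sorted] > unsortedItem);
       sorted--)
    list[sorted + 1] = list[sorted];

  list[sorted + 1] = unsortedItem;     /* Insert item */
}

/* Hypothetical variation: sorts the first length elements of list. */
void InsertionSortN(int list[], int length)
{
  int unsorted;                        /* Index for unsorted items  */

  for (unsorted = 1; unsorted < length; unsorted++)
    InsertOne(list, unsorted);
}
```

Calling InsertOne(list, 4) on the 10-element array from the walk-through produces the array shown after the pass, beginning 2 15 16 69 92.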
16.3.7 Common Pitfalls with Arrays in C
Unlike some other modern programming languages, C does not provide protection against exceeding the size (or bounds) of an array. It is a common error made with arrays in C programming. C provides no support for ensuring that an array index is actually within an array. The compiler blindly generates code for the expression a[i], even if the index i accesses a memory location beyond the end of the array. To demonstrate, the code in Figure 16.15 lists an example of how exceeding the array bounds can lead to a serious debugging effort. Enter a number larger than the array size and this program exhibits some peculiar behavior (see footnote 1).
Analyze this program by drawing out the run-time stack and you will see more clearly why this bug causes the behavior it does.
1. Depending on the compiler you are using, you might need to enter a number larger than 16, or you might need to declare index after array in order to observe the problem.

#include <stdio.h>
#define MAX_SIZE 10

int main()
{
  int index;
  int array[MAX_SIZE];
  int limit;

  printf("Enter limit (integer): ");
  scanf("%d", &limit);

  for (index = 0; index < limit; index++) {
    array[index] = 0;
    printf("array[%d] is set to 0\n", index);
  }
}

Figure 16.15 This C program has peculiar behavior if the user enters a number that is too large

C does not perform bounds checking on array accesses. C code tends to be faster because array accesses incur less overhead. This is yet another manner in which C provides more control to the programmer than other languages. If you are not careful in your coding, this bare-bones philosophy can, however, lead to undue debugging effort. To counter this, experienced C programmers often use some specific defensive programming techniques when it comes to arrays.
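One such defensive technique is to check a user-supplied limit against the declared size before using it as a loop bound. The following is a minimal sketch, assuming the same MAX_SIZE as Figure 16.15; the helper name ClearUpTo and its clamping policy are hypothetical choices, and a real program might instead print a warning and ask again.

```c
#define MAX_SIZE 10

/* Hypothetical helper: zeroes the first limit elements of array, but
   never more than MAX_SIZE of them. Returns the number of elements
   that were actually set. */
int ClearUpTo(int array[MAX_SIZE], int limit)
{
  int index;

  if (limit < 0)
    limit = 0;
  if (limit > MAX_SIZE)
    limit = MAX_SIZE;        /* clamp instead of writing past the end */

  for (index = 0; index < limit; index++)
    array[index] = 0;

  return limit;
}
```

Because the bound is corrected before the loop starts, no value of limit entered by the user can cause a write outside the array.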
Another common pitfall with arrays in C revolves around the fact that arrays (in particular, statically declared arrays such as the ones we've seen) must be of a fixed size. We must know the size of the array when we compile the program. ANSI C does not support array declarations with variable expressions (variable-length arrays were added later, in the C99 standard). The following code is illegal in ANSI C. The size of array temp must be known when the compiler analyzes the source code.

    void SomeFunction(int num_elements)
    {
      int temp[num_elements];   /* Generates a syntax error */
    }

To deal with this limitation, experienced C programmers carefully analyze the situations in which their code will be used and then allocate arrays with ample space. To supplement this built-in assumption in their code, bounds checks are added to warn if the size of the array is not sufficient. Another option is to use dynamic memory allocation to allocate the array at run-time. More on this in Chapter 19.
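As a preview of the dynamic allocation option, the sketch below requests the array from the standard library routine malloc at run-time. It is an illustration only: the helper name MakeTemp is hypothetical, and the details of dynamic memory allocation are left to the later chapter.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated array of numElements
   integers, all set to 0, or NULL if the request cannot be satisfied.
   The caller is responsible for calling free() on the result. */
int *MakeTemp(int numElements)
{
  int *temp;
  int index;

  if (numElements <= 0)
    return NULL;

  temp = malloc((size_t) numElements * sizeof(int));
  if (temp == NULL)
    return NULL;             /* allocation failed */

  for (index = 0; index < numElements; index++)
    temp[index] = 0;

  return temp;
}
```

The returned pointer can be indexed with the usual bracket notation, as in temp[4] = 9, for the reasons discussed in Section 16.3.5.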
16.4 Summary
In this chapter we covered two important high-level programming constructs: pointers and arrays. Both constructs enable us to access memory indirectly. The key notions we covered in this chapter are:
• Pointers. Pointers are variables that contain addresses of other memory objects (such as other variables). With pointers we can indirectly access and manipulate these other objects. A very simple application of pointers is to use them to pass parameters by reference. Pointers have more substantial applications, and we will see them in subsequent chapters.
• Arrays. An array is a collection of elements of the same type arranged sequentially in memory. We can access a particular element within an array by providing an index to the element that is its offset from the beginning of the array. Many real-world objects are best represented within a computer program as an array of items, thus making the array a significant structure for organizing data. With arrays, we can represent character strings that hold text data, for example. We examined several important array operations, including the sorting operation via insertion sort.
Exercises

16.1 Write a C function that takes as a parameter a character string of unknown length, containing a single word. Your function should translate this string from English into Pig Latin. This translation is performed by removing the first letter of the string, appending it onto the end, and concatenating the letters ay. You can assume that the array contains enough space for you to add the extra characters.
For example, if your function is passed the string "Hello," after your function returns, the string should have the value "elloHay." The first character of the string should be "e."
16.2 Write a C program that accepts a list of numbers from the user until a number is repeated (i.e., is the same as the number preceding it). The program then prints out the number of numbers entered (excluding the last) and their sum. When the program is run, the prompts and responses will look like the following:

    Number: 5
    Number: -6
    Number: 0
    Number: 45
    Number: 45
    4 numbers were entered and their sum is 44
16.3 What is the output when the following code is compiled and run?

    int x;

    int main()
    {
      int *px = &x;
      int x = 7;

      *px = 4;
      printf("x = %d\n", x);
    }
16.4 Create a string function that takes two input strings, stringA and stringB, and returns a 0 if both strings are the same, a 1 if stringA appears before stringB in the sorted order of a dictionary, or a 2 if stringB appears before stringA.
16.5 Using the function developed for Exercise 16.4, modify the Insertion Sort program so that it operates upon strings instead of integers.
16.6 Translate the following C function into LC-3 assembly language.

    int main()
    {
      int a[5], i;

      i = 4;
      while (i >= 0) {
        a[i] = i;
        i--;
      }
    }
16.7 For this question, examine the following program. Notice that the variable ind is a pointer variable that points to another pointer variable. Such a construction is legal in C.

    #include <stdio.h>

    int main()
    {
      int apple;
      int *ptr;
      int **ind;

      ind = &ptr;
      *ind = &apple;
      **ind = 123;

      ind++;
      *ptr++;
      apple++;

      printf("%x %x %d\n", ind, ptr, apple);
    }

Analyze what this program performs by drawing out the run-time stack at the point just after the statement apple++; executes.
16.8 The following code contains a call to the function triple. What is the minimum size of the activation record of triple?

    int main()
    {
      int array[3];

      array[0] = 1;
      array[1] = 2;
      array[2] = 3;

      triple(array);
    }

16.9 Write a program to remove any duplicates from a sequence of numbers. For example, if the list consisted of the numbers 5, 4, 5, 5, and 3, the program would output 5, 4, 3.
16.10 Write a program to find the median of a set of numbers. Recall that the median is a number within the set in which half the numbers are larger and half are smaller. Hint: To perform this, you may need to sort the list first.
16.11 For this question, refer to the following C program:

    int FindLen(char *);

    int main()
    {
      char str[10];

      printf("Enter a string : ");
      scanf("%s", str);
      printf("%s has %d characters\n", str, FindLen(str));
    }

    int FindLen(char * s)
    {
      int len = 0;

      while (*s != '\0') {
        len++;
        s++;
      }

      return len;
    }

a. For the preceding C program, what is the size of the activation record for the functions main and FindLen?
b. Show the contents of the stack just before the function FindLen returns if the input string is apple.
c. What would the activation record look like if the program were run and the user typed a string of length greater than 10 characters? What would happen to the program?
16.12 The following code reads a string from the keyboard and prints out a version with any uppercase characters converted to lowercase. However, it has a flaw. Identify it.

    #include <stdio.h>

    char *LowerCase(char *s);

    int main()
    {
      char str[MAX_LEN];

      printf("Enter a string ");
      scanf("%s", str);

      printf("Lowercase: %s \n", LowerCase(str));
    }

    char *LowerCase(char *s)
    {
      char newStr[MAX_LEN];
      int index;

      for (index = 0; index < MAX_LEN; index++) {
        if ('A' <= s[index] && s[index] <= 'Z')
          newStr[index] = s[index] + ('a' - 'A');
        else
          newStr[index] = s[index];
      }

      return newStr;
    }

16.13 Consider the following declarations.

    #define STACK_SIZE 100

    int stack[STACK_SIZE];
    int topOfStack;

    int Push(int item);

a. Write a function Push (the declaration is provided) that will push the value of item onto the top of the stack. If the stack is full and the item cannot be added, the function should return a 1. If the item is successfully pushed, the function should return a 0.
b. Write a function Pop that will pop an item from the top of the stack. Like Push, this function will return a 1 if the operation is unsuccessful. That is, a Pop was attempted on an empty stack. It should return a 0 if successful. Consider carefully how the popped value can be returned to the caller.
chapter 17
Recursion

17.1 Introduction
We start this chapter by describing a recursive procedure that you might already be familiar with. Suppose we want to find a particular student's exam in a set of exams that are already in alphabetical order. We might randomly examine the name on an exam about halfway through the set. If that randomly chosen exam is not the one we are looking for, we search the appropriate half using the very same technique. That is, we repeat the search on the first half or the second half, depending on whether the name we are looking for is less than or greater than the name on the exam at the halfway point. For example, say we are looking for Babe Ruth's exam and, at the halfway point, we find Mickey Mantle's exam. We then repeat the search on the second half of the original stack. Fairly quickly, we will locate Babe Ruth's exam, if it exists in the set. This technique of searching through a set of elements already in sorted order is recursive. We are applying the same searching algorithm to continually smaller and smaller subsets of exams.
The idea behind recursion is simple: A recursive function solves a task by calling itself on a smaller subtask. As we shall see, recursion is another way of expressing iterative program constructs. The power of recursion lies in its ability to elegantly capture the flow of control for certain tasks. There are some programming problems for which the recursive solution is far simpler than the corresponding solution using conventional iteration. In this chapter, we introduce you to the concept of recursion via five different examples. We examine how recursive functions are implemented on the LC-3. The elegance of the run-time stack mechanism is that recursive functions require no special handling: they execute in the same manner as any other function. The main purpose of this chapter is to provide you with an initial but deep exposure to recursion so that you can analyze and reason about recursive programs. Being able to understand recursive code is a necessary ingredient for writing recursive code, and ultimately for recursion to become part of your problem-solving toolkit for attacking programming problems.
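The exam search described at the start of this section can be written down as a short recursive function. The sketch below is one possible rendering under assumptions: the name FindExam and its parameters are hypothetical, the exams are represented as an alphabetically sorted array of names, and the exam exactly halfway through the current subset is examined each time.

```c
#include <string.h>

/* Hypothetical sketch: searches names[low..high], sorted
   alphabetically, for target. Returns the index of the match,
   or -1 if the name is not present. */
int FindExam(const char *names[], int low, int high, const char *target)
{
  int middle;
  int order;

  if (low > high)
    return -1;                        /* empty subset: not found     */

  middle = low + (high - low) / 2;    /* exam about halfway through  */
  order = strcmp(target, names[middle]);

  if (order == 0)
    return middle;                    /* this is the exam we want    */
  else if (order < 0)
    return FindExam(names, low, middle - 1, target);   /* first half  */
  else
    return FindExam(names, middle + 1, high, target);  /* second half */
}
```

Each recursive call works on a subset roughly half the size of the previous one, which is why the exam is located fairly quickly.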
17.2 What Is Recursion?
A function that calls itself is a recursive function, as in the function RunningSum in Figure 17.1.
This function calculates the sum of all the integers between the input parameter n and 1. For example, RunningSum(4) calculates 4 + 3 + 2 + 1. However, it does the calculation recursively. Notice that the running sum of 4 is really 4 plus the running sum of 3. Likewise, the running sum of 3 is 3 plus the running sum of 2. This recursive definition is the basis for a recursive algorithm. In other words,
RunningSum(n) = n + RunningSum(n - 1)
In mathematics, we use recurrence equations to express such functions. The preceding equation is a recurrence equation for RunningSum. In order to complete the evaluation of this equation, we must also supply an initial case. So in addition to the preceding formula, we need to state
RunningSum(1) = 1
before we can completely evaluate the recurrence, which we do as follows:
RunningSum(4) = 4 + RunningSum(3)
              = 4 + 3 + RunningSum(2)
              = 4 + 3 + 2 + RunningSum(1)
              = 4 + 3 + 2 + 1
The C version of RunningSum works in the same manner as the recurrence equation. During execution of the function call RunningSum(4), RunningSum makes a function call to itself, with an argument of 3 (i.e., RunningSum(3)). However, before RunningSum(3) ends, it makes a call to RunningSum(2). And before RunningSum(2) ends, it makes a call to RunningSum(1). RunningSum(1), however, makes no additional recursive calls and returns the value 1 to RunningSum(2), which enables RunningSum(2) to end and return the value 2 + 1 back to RunningSum(3). This enables RunningSum(3) to end and pass a value of 3 + 2 + 1 to RunningSum(4). Figure 17.2 pictorially shows how the execution of RunningSum(4) proceeds.

int RunningSum(int n)
{
  if (n == 1)
    return 1;
  else
    return (n + RunningSum(n-1));
}
Figure 17.1 A recursive function

Figure 17.2 The flow of control when RunningSum(4) is called
  Step 1: RunningSum(4) calls RunningSum(3).
  Step 2: RunningSum(3) calls RunningSum(2).
  Step 3: RunningSum(2) calls RunningSum(1).
  Step 4: RunningSum(1) returns the value 1.
  Step 5: RunningSum(2) returns the value 3.
  Step 6: RunningSum(3) returns the value 6.
17.3 Recursion versus Iteration
Clearly, we could have written RunningSum using a for loop, and the code would have been more straightforward than its recursive counterpart. We provided a recursive version here in order to demonstrate a recursive call in the context of an easy-to-understand example.
There is a parallel between using recursion and using conventional iteration (such as for and while loops) in programming. All recursive functions can be written using iteration. For certain programming problems, however, the recursive version is simpler and more elegant than the iterative version. Solutions to certain problems are naturally expressed in a recursive manner, such as problems that are expressed with recurrence equations. It is because of such problems that recursion is an indispensable programming technique. Knowing which problems call for recursion and which are better solved with iteration is part of the art of computer programming; with experience, you will become better at judging when to use which.
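As a minimal sketch of the for-loop alternative mentioned above, the following function computes the same sum iteratively. The name RunningSumIterative is our own and does not appear in the figures; like Figure 17.1, the sketch assumes that n is at least 1.

```c
#include <assert.h>

/* Iterative counterpart to the recursive RunningSum of Figure 17.1.
   Computes n + (n-1) + ... + 1 with a for loop. Assumes n >= 1. */
int RunningSumIterative(int n)
{
  int sum = 0;
  int i;

  assert(n >= 1);            /* same domain as the recursive version */

  for (i = n; i >= 1; i--)
    sum = sum + i;

  return sum;
}
```

Calling RunningSumIterative(4) yields 10, the same value that the recursive RunningSum(4) produces.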
Recursion, as useful as it is, comes at a cost. As an experiment, write an iterative version of RunningSum and compare the running time for large n with the recursive version. To do this you can use library functions to get the time of day (for example, gettimeofday) before the function starts and when it ends. Plot the running time for a variety of values of n and you will notice that the recursive version is relatively slow (provided the compiler did not optimize away the recursion). As we shall see in Section 17.5, recursive functions incur function call overhead that iterative solutions do not.
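The experiment described above might be set up along the following lines. This is only a sketch under stated assumptions: gettimeofday is a POSIX call rather than part of standard C, and the names RunningSumLoop and TimeCalls are hypothetical helpers introduced here. The repetition count is included because a single call may finish too quickly to measure.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>   /* gettimeofday (POSIX, not standard C) */

/* Recursive version, as in Figure 17.1. */
int RunningSum(int n)
{
  if (n == 1)
    return 1;
  else
    return (n + RunningSum(n - 1));
}

/* Hypothetical iterative counterpart. */
int RunningSumLoop(int n)
{
  int sum = 0;

  while (n >= 1) {
    sum = sum + n;
    n = n - 1;
  }
  return sum;
}

/* Calls fn(n) 'reps' times and returns the elapsed wall-clock seconds. */
double TimeCalls(int (*fn)(int), int n, int reps)
{
  struct timeval before, after;
  volatile int sink = 0;   /* keeps the compiler from discarding the calls */
  int i;

  assert(reps > 0);

  gettimeofday(&before, NULL);
  for (i = 0; i < reps; i++)
    sink = fn(n);
  gettimeofday(&after, NULL);
  (void)sink;

  return (double)(after.tv_sec - before.tv_sec)
       + (double)(after.tv_usec - before.tv_usec) / 1000000.0;
}
```

Comparing TimeCalls(RunningSum, n, reps) with TimeCalls(RunningSumLoop, n, reps) for several values of n gives the data points to plot. Keep n small enough that the recursion does not exhaust the run-time stack and the sum still fits in an int.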
17.4 Towers of Hanoi
One problem for which the recursive solution is the simpler solution is the classic puzzle Towers of Hanoi. The puzzle involves a platform with three posts. On one of the posts sit a number of wooden disks, each smaller than the one below it. The objective is to move all the disks from their current post to one of the other posts. However, there are two rules for moving disks: only one disk can be moved at a time, and a larger disk can never be placed upon a smaller disk. For example, Figure 17.3 shows a puzzle where five disks are on post 1. To solve this puzzle, these five disks must be moved to one of the other posts obeying the two rules.
As the legend associated with the puzzle goes, when the world was created, the priests at the Temple of Brahma were given the task of moving 64 disks from one post to another. When they completed their task, the world would end.
Now how would we go about writing a computer program to solve this puzzle? If we view the problem from the end first, we can make the following observation: the final sequence of moves must involve moving the largest disk from post 1 to the target post, say post 3, and then moving the other disks back on top of it. Conceptually, we need to move all n - 1 disks off the largest disk and onto the intermediate post, then move the largest disk from its post onto the target post. Finally, we move all n - 1 disks from the intermediate post onto the target post. And we are done! Actually, we are not quite done because moving n - 1 disks in one move is not legal. However, we have stated the problem in such a manner that we can solve it if we can solve the two smaller subproblems of it. Once the largest disk is on the target post, we do not need to deal with it any further. Now the (n - 1)th disk becomes the largest disk, and the subobjective becomes to move it to the target post. We can therefore apply the same technique but on a smaller subproblem.

Figure 17.3 The Towers of Hanoi puzzle (five disks stacked on post 1; posts 2 and 3 empty)

We now have a recursive definition of the problem: In order to move n disks to the target post, which we symbolically represent as Move(n, target), we first move n - 1 disks to the intermediate post, or Move(n-1, intermediate), then move the nth disk to the target, and finally move n - 1 disks from the intermediate to the target, or Move(n-1, target). So in order to Move(n, target), two recursive calls are made to solve two smaller subproblems involving n - 1 disks.
As with recurrence equations in mathematics, all recursive definitions require a base case, which ends the recursion. In the way we have formulated the problem, the base case involves moving the smallest disk (disk 1). Disk 1 is always on top, so it can be moved directly from one post to any other without moving any other disks. Without a base case, a recursive function would have an infinite recursion, similar to an infinite loop in conventional iteration.
Taking our recursive definition to C code is fairly straightforward. Figure 17.4 is a recursive C function of this algorithm.

/*
** Inputs
**   diskNumber is the disk to be moved (disk 1 is smallest)
**   startPost is the post the disk is currently on
**   endPost is the post we want the disk to end on
**   midPost is the intermediate post
*/
void MoveDisk(int diskNumber, int startPost, int endPost, int midPost)
{
  if (diskNumber > 1) {
    /* Move n-1 disks off the current disk on  */
    /* startPost and put them on the midPost   */
    MoveDisk(diskNumber-1, startPost, midPost, endPost);

    /* Move the largest disk. */
    printf("Move disk %d from post %d to post %d.\n",
           diskNumber, startPost, endPost);

    /* Move all n-1 disks from midPost onto endPost */
    MoveDisk(diskNumber-1, midPost, endPost, startPost);
  }
  else
    printf("Move disk 1 from post %d to post %d.\n",
           startPost, endPost);
}
Figure 17.4 A recursive function to solve the Towers of Hanoi puzzle

Figure 17.5 The Towers of Hanoi puzzle, initial configuration
Figure 17.6 The Towers of Hanoi puzzle, after first move

Let's see what happens when we play a game with three disks. Following is an initial function call to MoveDisk. We start off by saying that we want to move disk 3 (the largest disk) from post 1 to post 3, using post 2 as the intermediate storage post. That is, we want to solve a three-disk Towers of Hanoi puzzle. See Figure 17.5.
/* diskNumber 3; startPost 1; endPost 3; midPost 2 */
MoveDisk(3, 1, 3, 2)
This call invokes another call to MoveDisk to move disks 1 and 2 off disk 3 and onto post 2 using post 3 as intermediate storage. This is the first recursive call in MoveDisk (line 15 in the source code).
/* diskNumber 2; startPost 1; endPost 2; midPost 3 */
MoveDisk(2, 1, 2, 3)
To move disk 2 from post 1 to post 2, we must first move disk 1 off disk 2 and onto post 3 (the intermediate post). So this triggers another call to MoveDisk, again from the call on line 15.
/* diskNumber 1; startPost 1; endPost 3; midPost 2 */
MoveDisk(1, 1, 3, 2)
Since disk 1 can be moved directly, the second printf statement is executed. See Figure 17.6.
Move disk 1 from post 1 to post 3.
Now, this invocation of MoveDisk returns to its caller, which was the call MoveDisk(2, 1, 2, 3). Recall that we were waiting for all disks on top of disk 2 to be moved to post 3. Since that is now complete, we can now move disk 2 from post 1 to post 2. The printf is the next statement to execute, signaling another disk to be moved. See Figure 17.7.
Move disk 2 from post 1 to post 2.
Next, a call is made to move all disks that were on disk 2 back onto disk 2. This happens at the second recursive call in MoveDisk (line 22 of the source code).
/* diskNumber 1; startPost 3; endPost 2; midPost 1 */
MoveDisk(1, 3, 2, 1)
Figure 17.7 The Towers of Hanoi puzzle, after second move
Figure 17.8 The Towers of Hanoi puzzle, after third move
Figure 17.9 The Towers of Hanoi puzzle, after fourth move
Figure 17.10 The Towers of Hanoi puzzle, after fifth move
Again, since disk 1 has no disks on top of it, we see the move printed. See Figure 17.8.
Move disk 1 from post 3 to post 2.
Now control passes back to the call MoveDisk(2, 1, 2, 3) which, having completed its task of moving disk 2 (and all disks on top of it) from post 1 to post 2, returns to its caller. Its caller is MoveDisk(3, 1, 3, 2). Now, all disks have been moved off disk 3 and onto post 2. Disk 3 can be moved from post 1 onto post 3. The printf is the next statement executed. See Figure 17.9.
Move disk 3 from post 1 to post 3.
The next subtask remaining is to move disk 2 (and all disks on top of it) from post 2 onto post 3. We can use post 1 for intermediate storage. The following call occurs at the second recursive call in MoveDisk (line 22 of the source code).
/* diskNumber 2; startPost 2; endPost 3; midPost 1 */
MoveDisk(2, 2, 3, 1)
In order to do so, we must first move disk 1 from post 2 onto post 1. This call is made from line 15 in the source code.
/* diskNumber 1; startPost 2; endPost 1; midPost 3 */
MoveDisk(1, 2, 1, 3)
The move requires no submoves. See Figure 17.10.
Move disk 1 from post 2 to post 1.

Figure 17.11 The Towers of Hanoi puzzle, after sixth move
Figure 17.12 The Towers of Hanoi puzzle, completed
Control passes back to the caller MoveDisk(2, 2, 3, 1), and disk 2 is moved onto post 3. See Figure 17.11.
Move disk 2 from post 2 to post 3.
The only thing remaining is to move all disks that were on disk 2 back on top.
/* diskNumber 1; startPost 1; endPost 3; midPost 2 */
MoveDisk(1, 1, 3, 2)
The move is done immediately. See Figure 17.12.
Move disk 1 from post 1 to post 3.
and the puzzle is completed!
Let's summarize the action of the recursion by examining the sequence of function calls that were made in solving the three-disk puzzle:
MoveDisk(3, 1, 3, 2)        /* Initial call */
  MoveDisk(2, 1, 2, 3)
    MoveDisk(1, 1, 3, 2)
    MoveDisk(1, 3, 2, 1)
  MoveDisk(2, 2, 3, 1)
    MoveDisk(1, 2, 1, 3)
    MoveDisk(1, 1, 3, 2)
Consider how you would write an iterative version of a program to solve this puzzle and you will appreciate the simplicity of the recursive version.
Returning to the legend of the Towers of Hanoi: the world will end when the priests finish solving a 64-disk version of the puzzle. If each move takes one second, how long will it take the priests to solve the puzzle?
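One way to check the three-disk walkthrough above is to record the moves rather than print them. The following is a hedged sketch, not the code of Figure 17.4: the struct Move type, the function name RecordMoves, and the extra moves and count parameters are all our own additions. The recursion itself follows the same pattern of two recursive calls around a single move.

```c
#include <assert.h>

/* One recorded move: which disk went from which post to which post.
   (Hypothetical type, introduced only for this sketch.) */
struct Move {
  int disk;
  int from;
  int to;
};

/* Same recursion as MoveDisk, but each move is stored in moves[]
   instead of being printed. 'count' is the number of moves recorded so
   far; the updated count is returned. The caller must supply an array
   large enough to hold every move. */
int RecordMoves(int diskNumber, int startPost, int endPost, int midPost,
                struct Move moves[], int count)
{
  assert(diskNumber >= 1);

  /* Move the n-1 smaller disks out of the way, onto midPost. */
  if (diskNumber > 1)
    count = RecordMoves(diskNumber - 1, startPost, midPost, endPost,
                        moves, count);

  /* Move this disk. For disk 1 this is the base case. */
  moves[count].disk = diskNumber;
  moves[count].from = startPost;
  moves[count].to = endPost;
  count = count + 1;

  /* Move the n-1 smaller disks from midPost onto endPost. */
  if (diskNumber > 1)
    count = RecordMoves(diskNumber - 1, midPost, endPost, startPost,
                        moves, count);

  return count;
}
```

For the three-disk game, a call such as RecordMoves(3, 1, 3, 2, moves, 0) with a seven-entry array returns 7, and the recorded entries match the seven moves printed in the walkthrough, in the same order.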
17.5 Fibonacci Numbers
The following recurrence equations generate a well-known sequence of numbers called the Fibonacci numbers, which has some interesting mathematical, geometrical, and natural properties.
f(n) = f(n - 1) + f(n - 2)
f(1) = 1
f(0) = 1
In other words, the nth Fibonacci number is the sum of the previous two. The series is 1, 1, 2, 3, 5, 8, 13, ... This series was first formulated by the Italian mathematician Leonardo of Pisa around the year 1200. His father's name was Bonacci, thus he often called himself Fibonacci as a shortening of filius Bonacci, or son of Bonacci. Fibonacci formulated this series as a way of estimating breeding rabbit populations, and we have since discovered some fascinating ways in which the series models some other natural phenomena such as the structure of a spiral shell or the pattern of petals on a flower.
We can formulate a recursive function to calculate the nth Fibonacci number directly from the recurrence equations. Fibonacci(n) is recursively calculated by Fibonacci(n-1) + Fibonacci(n-2). The base case of the recursion is simply the fact that Fibonacci(1) and Fibonacci(0) both equal 1. Figure 17.13 lists the recursive code to calculate the nth Fibonacci number.
 1  #include <stdio.h>
 2
 3  int Fibonacci(int n);
 4
 5  int main()
 6  {
 7    int in;
 8    int number;
 9
10    printf("Which Fibonacci number? ");
11    scanf("%d", &in);
12
13    number = Fibonacci(in);
14    printf("That Fibonacci number is %d\n", number);
15  }
16
17  int Fibonacci(int n)
18  {
19    int sum;
20
21    if (n == 0 || n == 1)
22      return 1;
23    else {
24      sum = (Fibonacci(n-1) + Fibonacci(n-2));
25      return sum;
26    }
27  }
Figure 17.13 Fibonacci is a recursive C function to calculate the nth Fibonacci number
We will use this example to examine how recursion works from the perspective of the lower levels of the computing system. In particular, we will examine the run-time stack mechanism and how it deals with recursive calls. Whenever the function is called, whether from itself or another function, a new copy of its activation record is pushed onto the run-time stack. That is, each invocation of the function gets a new, private copy of parameters and local variables, where each copy is distinct from every other copy. This must be the case in order for recursion to work, and the run-time stack enables this. If the variables of this function were statically allocated in memory, each recursive call to Fibonacci would overwrite the values of the previous call.
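To make the point about static allocation concrete, consider the following small experiment. It is an illustration of our own, using a running sum rather than Fibonacci, and the names SumWithLocal and SumWithStatic are hypothetical. The two functions differ only in where the saved copy of n is stored.

```c
#include <assert.h>

/* Correct version: 'mine' is an ordinary local variable, so every
   invocation has its own copy in its own activation record. */
int SumWithLocal(int n)
{
  int mine = n;
  int rest;

  assert(n >= 1);
  if (n == 1)
    return 1;

  rest = SumWithLocal(n - 1);
  return mine + rest;
}

/* Broken on purpose: 'shared' is statically allocated, so there is only
   one copy, and each deeper call overwrites it. */
int SumWithStatic(int n)
{
  static int shared;
  int rest;

  assert(n >= 1);
  shared = n;
  if (n == 1)
    return 1;

  rest = SumWithStatic(n - 1);   /* on return, shared holds 1, not n */
  return shared + rest;
}
```

SumWithLocal(4) returns 10, as expected. SumWithStatic(4) returns 4, because by the time each invocation resumes, the innermost call has already overwritten shared with 1.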
Let's see what happens when we call the function Fibonacci with the parameter 3, Fibonacci(3). We start off with the activation record for Fibonacci(3) on top of the run-time stack. Figure 17.14 shows the progression of the stack as the original function call is evaluated.
The function call Fibonacci(3) will calculate first Fibonacci(3-1), as the LC-3 C compiler evaluates the expression Fibonacci(n-1) + Fibonacci(n-2) left to right (the C language itself does not specify the order in which the two operands are evaluated). Therefore, a call is first made to Fibonacci(2), and an activation record for Fibonacci(2) is pushed onto the run-time stack (see Figure 17.14, step 2).
For Fibonacci(2), the parameter n equals 2 and does not meet the terminal condition, therefore a call is made to Fibonacci(1) (see Figure 17.14, step 3). This call is made in the course of evaluating Fibonacci(2-1) + Fibonacci(2-2).
The call Fibonacci(1) results in no more recursive calls because the parameter n meets the terminal condition. The value 1 is returned to Fibonacci(2), which now can complete the evaluation of Fibonacci(1) + Fibonacci(0) by calling Fibonacci(0) (see Figure 17.14, step 4). The call Fibonacci(0) immediately returns a 1.
Now, the call Fibonacci(2) can complete and return its subcalculation (its result is 2) to its caller, Fibonacci(3). Having completed the left-hand component of the expression Fibonacci(2) + Fibonacci(1), Fibonacci(3) calls Fibonacci(1) (see Figure 17.14, step 5), which immediately returns the value 1. Now Fibonacci(3) is done; its result is 3 (Figure 17.14, step 6).
We could state the recursion of Fibonacci(3) algebraically, as follows:
Fibonacci(3) = Fibonacci(2) + Fibonacci(1)
             = (Fibonacci(1) + Fibonacci(0)) + Fibonacci(1)
             = 1 + 1 + 1 = 3
The sequence of function calls made during the evaluation of Fibonacci(3) is as follows:
Fibonacci(3)
  Fibonacci(2)
    Fibonacci(1)
    Fibonacci(0)
  Fibonacci(1)

Figure 17.14 Snapshots of the run-time stack for the function call Fibonacci(3). Each snapshot lists the activation records from the top of the stack (where R6 points) down to main.
  Step 1: Initial call. Stack: Fibonacci(3), main.
  Step 2: Fibonacci(3) calls Fibonacci(2). Stack: Fibonacci(2), Fibonacci(3), main.
  Step 3: Fibonacci(2) calls Fibonacci(1). Stack: Fibonacci(1), Fibonacci(2), Fibonacci(3), main.
  Step 4: Fibonacci(2) calls Fibonacci(0). Stack: Fibonacci(0), Fibonacci(2), Fibonacci(3), main.
  Step 5: Fibonacci(3) calls Fibonacci(1). Stack: Fibonacci(1), Fibonacci(3), main.
  Step 6: Back to the starting point. Stack: Fibonacci(3), main.
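The order of calls traced above can be observed directly by logging the argument of every call. The sketch below is our own instrumented variant, not the code of Figure 17.13: TracedFibonacci, callLog, callCount, and MAX_CALLS are hypothetical names, and the log capacity is chosen with only small values of n in mind. The two recursive calls are written as separate statements so that Fibonacci(n-1) is certain to run before Fibonacci(n-2), whatever compiler is used.

```c
#include <assert.h>

#define MAX_CALLS 64         /* hypothetical capacity, enough for small n */

int callLog[MAX_CALLS];      /* argument of each call, in call order */
int callCount = 0;

int TracedFibonacci(int n)
{
  int first;                 /* holds Fibonacci(n-1), like a compiler temporary */
  int second;

  assert(n >= 0);
  assert(callCount < MAX_CALLS);
  callLog[callCount] = n;    /* record this call before doing any work */
  callCount = callCount + 1;

  if (n == 0 || n == 1)
    return 1;

  /* Two separate statements force the left-hand call to happen first. */
  first = TracedFibonacci(n - 1);
  second = TracedFibonacci(n - 2);
  return first + second;
}
```

After a call to TracedFibonacci(3), callCount is 5 and callLog holds 3, 2, 1, 0, 1, which is the call sequence shown above.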
Walk through the execution of Fibonacci(4) and you will notice that the sequence of calls made by Fibonacci(3) is a subset of the calls made by Fibonacci(4). No surprise, since Fibonacci(4) = Fibonacci(3) + Fibonacci(2). Likewise, the sequence of calls made by Fibonacci(4) is a subset of the calls made by Fibonacci(5). There is an exercise at the end of this chapter involving calculating the number of function calls made during the evaluation of Fibonacci(n).
The LC-3 C compiler generates the code listed in Figure 17.15 for this program. Notice that no special treatment was required because this function is recursive. Because of the run-time stack mechanism for activating functions, a recursive function gets treated like every other function. If you examine this code closely, you will notice that the compiler generated a temporary variable in order to translate line 24 of Fibonacci properly. Most compilers will generate such temporaries when compiling complex expressions. Such temporary values are allocated storage in the activation record on top of the space for the programmer-declared local variables.
17.6 Binary Search
In the introduction to this chapter, we described a recursive technique for finding a particular exam in a set of exams that are in alphabetical order. The technique is called binary search, and it is a very rapid way of finding a particular element within a list of elements in sorted order. At this point, given our understanding of recursion and of arrays, we can specify a recursive function in C to perform binary search.
Say we want to find a particular integer value in an array of integers that is in ascending order. The function should return the index of the integer, or a -1 if the integer does not exist. To accomplish this, we will use the binary search technique as such: given an array and an integer to search for, we will examine the midpoint of the array and determine if the integer is (1) equal to the value at the midpoint, (2) less than the value at the midpoint, or (3) greater than the value at the midpoint. If it is equal, we are done. If it is less than, we perform the search again, but this time only on the first half of the array. If it is greater than, we perform the search only on the second half of the array. Notice that we can express cases (2) and (3) using recursive calls. But what happens if the value we are searching for does not exist within the array? Given this recursive technique of performing searches on smaller and smaller subarrays of the original array, we eventually perform a search on an array that has no elements (i.e., of size 0) if the item we are searching for does not exist. If we encounter this situation, we will return a -1. This will be a base case in the recursion.
Figure 17.16 contains the recursive implementation of the binary search algorithm in C. Notice that in order to determine the size of the array at each step, we pass the starting point and ending point of the subarray along with each call to BinarySearch. Each call refines the variables start and end to search smaller and smaller subarrays of the original array list.
Fibonacci:
        ADD  R6, R6, #-2    ; push return value/address
        STR  R7, R6, #0     ; store return address
        ADD  R6, R6, #-1    ; push caller's frame pointer
        STR  R5, R6, #0
        ADD  R5, R6, #-1    ; set new frame pointer
        ADD  R6, R6, #-2    ; allocate space for locals and temps

        LDR  R0, R5, #4     ; load the parameter n
        BRz  FIB_BASE       ; n==0
        ADD  R0, R0, #-1
        BRz  FIB_BASE       ; n==1

        LDR  R0, R5, #4     ; load the parameter n
        ADD  R0, R0, #-1    ; calculate n-1
        ADD  R6, R6, #-1    ; push n-1
        STR  R0, R6, #0
        JSR  Fibonacci      ; call to Fibonacci(n-1)

        LDR  R0, R6, #0     ; read the return value at top of stack
        ADD  R6, R6, #1     ; pop return value
        STR  R0, R5, #-1    ; store it into temporary value
        LDR  R0, R5, #4     ; load the parameter n
        ADD  R0, R0, #-2    ; calculate n-2
        ADD  R6, R6, #-1    ; push n-2
        STR  R0, R6, #0
        JSR  Fibonacci      ; call to Fibonacci(n-2)

        LDR  R0, R6, #0     ; read the return value at top of stack
        ADD  R6, R6, #1     ; pop return value
        LDR  R1, R5, #-1    ; read temporary value: Fibonacci(n-1)
        ADD  R0, R0, R1     ; Fibonacci(n-1) + Fibonacci(n-2)
        BR   FIB_END        ; branch to end of code

FIB_BASE:
        AND  R0, R0, #0     ; clear R0
        ADD  R0, R0, #1     ; R0 = 1

FIB_END:
        STR  R0, R5, #3     ; write the return value
        ADD  R6, R5, #1     ; pop local variables
        LDR  R5, R6, #0     ; restore caller's frame pointer
        ADD  R6, R6, #1
        LDR  R7, R6, #0     ; pop return address
        ADD  R6, R6, #1
        RET
Figure 17.15 Fibonacci in LC-3 assembly code
/*
** This function returns the position of 'item' if it exists
** between list[start] and list[end], or -1 if it does not.
*/
int BinarySearch(int item, int list[], int start, int end)
{
  int middle = (end + start) / 2;

  /* Did we not find what we are looking for? */
  if (end < start)
    return -1;

  /* Did we find the item? */
  else if (list[middle] == item)
    return middle;

  /* Should we search the first half of the array? */
  else if (item < list[middle])
    return BinarySearch(item, list, start, middle - 1);

  /* Or should we search the second half of the array? */
  else
    return BinarySearch(item, list, middle + 1, end);
}
Figure 17.16 A recursive C function to perform binary search
Figure 17.17 provides a pictorial representation of this code during execution. The array list contains 11 elements as shown. The initial call to BinarySearch passes the value we are looking for (item) and the array to be searched (recall from Chapter 16 that this is the address of the very first element, or base address, of the array). Along with the array, we provide the extent of the array. That is, we provide the starting point and ending point of the portion of the array to be searched. In every subsequent recursive call to BinarySearch, this extent is made smaller, eventually reaching a point where the subset of the array we are searching has either only one element or no elements at all. These two situations are the base cases of the recursion.
Instead of resorting to a technique like binary search, we could have attempted a more straightforward sequential search through the array. That is, we could examine list[0], then list[1], then list[2], etc., and eventually either find the item or determine that it does not exist. Binary search, however, will require fewer comparisons and can potentially execute faster if the array is large enough. In subsequent computing courses you will analyze binary search and derive that its running time is proportional to log2 n, where n is the size of the array. The running time of sequential search, on the other hand, is proportional to n.
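The difference in the number of comparisons can be made visible by counting how many array elements each technique examines. The following is a sketch under our own assumptions: SequentialSearch, CountingBinarySearch, and the probes parameter are hypothetical additions, and the binary search simply follows the structure of Figure 17.16 with a counter added.

```c
#include <assert.h>

/* Sequential search: returns the index of item or -1, and stores the
   number of elements examined in *probes. */
int SequentialSearch(int item, const int list[], int size, int *probes)
{
  int i;

  assert(size >= 0);
  *probes = 0;
  for (i = 0; i < size; i++) {
    *probes = *probes + 1;
    if (list[i] == item)
      return i;
  }
  return -1;
}

/* Recursive binary search in the style of Figure 17.16, with a probe
   counter added. The caller sets *probes to 0 before the first call. */
int CountingBinarySearch(int item, const int list[], int start, int end,
                         int *probes)
{
  int middle = (end + start) / 2;

  if (end < start)                 /* empty subarray: item is absent */
    return -1;

  *probes = *probes + 1;           /* we are about to examine list[middle] */
  if (list[middle] == item)
    return middle;
  else if (item < list[middle])
    return CountingBinarySearch(item, list, start, middle - 1, probes);
  else
    return CountingBinarySearch(item, list, middle + 1, end, probes);
}
```

On a sorted array of 11 distinct elements, finding the last element takes 11 probes with SequentialSearch but only 4 with CountingBinarySearch, which examines indices 5, 8, 9, and 10 in turn.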
Figure 17.17 BinarySearch performed on an array of 11 elements. We are searching for the element 109. The figure marks the positions of start, middle, and end within list for each successive call:
  BinarySearch(109, array, 0, 10)
  BinarySearch(109, array, 0, 4)
  BinarySearch(109, array, 3, 4)
  BinarySearch(109, array, 4, 4)
17.7 Integer to ASCII
Our final example of a recursive function is a function that converts an arbitrary integer value into a string of ASCII characters. Recall from Chapter 10 that in order to display an integer value on the screen, each digit of the value must be individually extracted, converted into ASCII, and then displayed on the output device. In Chapter 10, we wrote an LC-3 routine to do this using a straightforward iterative technique.
We can do this recursively with the following recursive formulation: if the number to be displayed is a single digit, we convert it to ASCII and display it and we are done (base case). If the number is multiple digits, we make a recursive call on the number without the least significant (rightmost) digit, and when the recursive call returns we display the rightmost digit.

#include <stdio.h>

void IntToAscii(int i);

int main()
{
  int in;

  printf("Input number: ");
  scanf("%d", &in);

  IntToAscii(in);
  printf("\n");
}

void IntToAscii(int num)
{
  int prefix;
  int currDigit;

  if (num < 10)                      /* The terminal case */
    printf("%c", num + '0');
  else {
    prefix = num / 10;               /* Convert the number */
    IntToAscii(prefix);              /* without last digit */

    currDigit = num % 10;            /* Then print last digit */
    printf("%c", currDigit + '0');
  }
}
Figure 17.18 IntToAscii is a recursive function that converts a positive integer to ASCII

Figure 17.18 lists the recursive C function. It takes a positive integer value and converts each digit of the value into ASCII and displays the resulting characters. The recursive function IntToAscii works as follows: to print out a number, say 21,669, for example (i.e., we are making the call IntToAscii(21669)), the function will subdivide the problem into two parts. First 2166 must be printed out via a recursive call to IntToAscii, and once the call is done, the 9 will be printed.
The function removes the least significant digit of the parameter num by shifting it to the right one digit, that is, by dividing by 10. With this new (and smaller) value, we make a recursive call. If the input value num is only a single digit, it is converted to ASCII and displayed on the screen; no recursive calls are necessary for this case.
Once control returns to each call, the digit that was removed is converted to ASCII and displayed. To clarify, we present the series of calls for the original call of IntToAscii(12345):
IntToAscii(12345)
  IntToAscii(1234)
    IntToAscii(123)
      IntToAscii(12)
        IntToAscii(1)
          printf('1')
        printf('2')
      printf('3')
    printf('4')
  printf('5')
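The same recursive formulation can also build the digits in a character array instead of printing them, which makes the result easy to inspect. This is a sketch of our own rather than the code of Figure 17.18: the name IntToBuffer and the buf and pos parameters are hypothetical, and the caller is assumed to provide an array with room for every digit plus the terminating null character.

```c
#include <assert.h>

/* Recursive digit conversion in the style of IntToAscii, writing into
   buf[] instead of calling printf. 'pos' is the index of the next free
   slot; the function returns the index just past the last digit written.
   Assumes num >= 0 and that buf is large enough. */
int IntToBuffer(int num, char buf[], int pos)
{
  assert(num >= 0);

  if (num >= 10)
    pos = IntToBuffer(num / 10, buf, pos);   /* all digits but the last */

  buf[pos] = (char)(num % 10 + '0');         /* then the last digit */
  buf[pos + 1] = '\0';                       /* keep buf a valid C string */
  return pos + 1;
}
```

With a sufficiently large array buf, the call IntToBuffer(21669, buf, 0) returns 5 and leaves the string "21669" in buf. As in IntToAscii, each digit is stored only after the recursive call for the digits to its left has returned.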
17.8 Summary
In this chapter, we introduced the concept of recursion. We can solve a problem recursively by using a function that calls itself on smaller subproblems. With recursion, we state the function, say f(n), in terms of the same function on smaller values of n, say for example, f(n - 1). The Fibonacci series, for example, is recursively stated as
Fibonacci(n) = Fibonacci(n-1) + Fibonacci(n-2);
For the recursion to eventually terminate, recursive calls require a base case. Recursion is a powerful programming tool that, when applied to the right problem, can make the task of programming considerably easier. For example, the Towers of Hanoi puzzle can be solved in a simple manner with recursion. It is much harder to formulate using iteration. In future courses, you will examine ways of organizing data involving pointers (e.g., trees and graphs) where the simplest techniques to manipulate the data structure involve recursive functions.
At the lower levels, recursive functions are handled in exactly the same manner as any other function call. The run-time stack mechanism enables this by allowing us to allocate in memory an activation record for each function invocation so that it does not conflict with any other invocation's activation record.
Exercises
17.1 For these questions, refer to the examples that appear in the chapter.
  a. How many calls to RunningSum (see Section 17.2) are made for the call RunningSum(10)?
  b. How about for the call RunningSum(n)? Give your answer in terms of n.
  c. How many calls to MoveDisk are made in the Towers of Hanoi problem if the initial call is MoveDisk(4, 1, 3, 2)? This call plays out a four-disk game.
  d. How many calls are made for an n-disk game?
  e. How many calls to Fibonacci (see Figure 17.13) are made for the initial call Fibonacci(10)?
  f. How many calls are required for the nth Fibonacci number?
17.2 Is the return address for a recursive function always the same at each function call? Why or why not?
17.3 What would happen if we swapped the printf call with the recursive call in the code for IntToAscii in Figure 17.18?
17.4 What does the following function produce for count(20)?

int count(int arg)
{
  if (arg < 1)
    return 0;
  else if (arg % 2)
    return (1 + count(arg - 2));
  else
    return (1 + count(arg - 1));
}

17.5 Consider the following C program:

#include <stdio.h>

int Power(int a, int b);

int main(void)
{
  int x, y, z;

  printf("Input two numbers: ");
  scanf("%d %d", &x, &y);

  if (x > 0 && y > 0)
    z = Power(x, y);
  else
    z = 0;

  printf("The result is %d.\n", z);
}

int Power(int a, int b)
{
  if (a < b)
    return 0;
  else
    return 1 + Power(a/b, b);
}

Figure 17.19 Run-time stack after function Power is called (two activation records for Power are shown, with R6 pointing to the top of the stack and several entries marked with a question mark)

  a. State the complete output if the input is
     (1) 4 9
     (2) 27 5
     (3) -1 3
  b. What does the function Power compute?
  c. Figure 17.19 is a snapshot of the stack after a call to the function Power. Two activation records are shown, with some of the entries filled in. Assume the snapshot was taken just before execution of one of the return statements in Power. What are the values in the entries marked with a question mark? If an entry contains an address, use an arrow to indicate the location the address refers to.
17.6 Consider the following C function:

int Sigma(int k)
{
  int l;

  l = k - 1;
  if (k == 0)
    return 0;
  else
    return (k + Sigma(l));
}

  a. Convert the recursive function into a nonrecursive function. Assume Sigma() will always be called with a nonnegative argument.
  b. Exactly 1 KB of contiguous memory is available for the run-time stack, and addresses and integers are 16 bits wide. How many recursive function calls can be made before the program runs out of memory? Assume no storage is needed for temporary values.

17.7 The following C program is compiled and executed on the LC-3. When the program is executed, the run-time stack starts at memory location xFEFF and grows toward xC000 (the stack can occupy up to 16 KBytes of memory).

int SevenUp(int x)
{
  if (x == 1)
    return 7;
  else
    return (7 + SevenUp(x - 1));
}

int main()
{
  int a;

  printf("Input a number \n");
  scanf("%d", &a);
  a = SevenUp(a);
  printf("%d is 7 times the number\n", a);
}

  a. What is the largest input value for which this program will run correctly? Explain your answer.
  b. If the run-time stack can occupy only 4 KBytes of memory, what is the largest input value for which this program will run correctly? Explain your answer.
17.8 Write an iterative version of a function to find the nth Fibonacci number. Plot the running time of this iterative version against the running time of the recursive version for a variety of values of n. Why is the recursive version significantly slower when n is sufficiently large?
17.9 The binary search routine shown in Figure 17.16 searches through an array that is in ascending order. Rewrite the code so that it works for arrays in descending order.
17.10 Following is a very famous algorithm whose recursive version is significantly easier to express than the iterative one. For the following subproblems, provide the final value returned by the function.
int ea(int x, int y)
{
  int a;

  if (y == 0)
    return x;
  else {
    a = x % y;
    return (ea(y, a));
  }
}

  a. ea(12, 15)
  b. ea(6, 10)
  c. ea(110, 24)
  d. What does this function calculate? Consider how you might construct an iterative version to calculate the same thing.
17.11 Write a program without recursive functions equivalent to the following C program.
int M();

int main()
{
    printf("%d", M());
}

int M()
{
    int num, x;
    printf("Type a number: ");
    scanf("%d", &num);
    if (num <= 0)
        return 0;
    else {
        x = M();
        if (num > x)
            return num;
        else
            return x;
    }
}
17.12 Consider the following recursive function:
int func(int arg)
{
    if (arg % 2 != 0)
        return func(arg - 1);
    if (arg <= 0)
        return 1;
    return func(arg/2) + 1;
}
a. Is there a value of arg that causes an infinite recursion? If so, what is it?
b. Suppose that the function func is part of a program whose main function follows. How many function calls are made to func when the program is executed?
int main()
{
    printf("The value is %d\n", func(10));
}
c. What value is output by the program?
17.13 The following function is a recursive function that takes a string of characters of unknown length and determines if it contains balanced parentheses. The function Balanced is designed to match parentheses. It returns a 0 if the parentheses in the character array string are balanced and a nonzero value if the parentheses are not balanced. The initial call to Balanced would be: Balanced(string, 0, 0);
The function Balanced that follows, however, is missing a few key pieces of code. Fill in the three underlined missing portions in the code.
int Balanced(char string[], int position, int count)
{
    if (________)
        return count;
    else if (string[position] == ________)
        return Balanced(string, ++position, ++count);
    else if (string[position] == ________)
        return Balanced(string, ++position, --count);
    else
        return Balanced(string, ++position, count);
}
17.14 What is the output of the following C program?
#include <stdio.h>

void Magic(int in);
int Even(int n);

int main()
{
    Magic(10);
}

void Magic(int in)
{
    if (in == 0)
        return;
    if (Even(in))
        printf("%i\n", in);
    Magic(in - 1);
    if (!Even(in))
        printf("%i\n", in);
    return;
}

int Even(int n)
{
    /* even, return 1; odd, return 0 */
    return (n % 2) == 0 ? 1 : 0;
}
chapter 18
I/O in C
18.1 Introduction
Whether it be to the screen, to a file, or to another computer across a network, all useful programs perform output of some sort or another. Most programs also require some form of input. As is the case with many other modern programming languages, input and output are not directly supported by C. Instead, input/output (I/O) is handled by a set of standard library functions that extend the base language. The behavior of these standard library functions is precisely defined by the ANSI C standard.
In this chapter, we will discuss several functions in the C standard library that support simple I/O. The functions putchar and printf write to the output device and getchar and scanf read from the input device. The more general functions fprintf and fscanf perform file I/O, such as to a file on disk. We have used printf and scanf extensively throughout the second half of this book. In this chapter, we examine the details of how these functions work. Along the way, we will introduce the notion of variable argument lists and demonstrate how parameter-passing on the LC-3 run-time stack handles function calls with a variable number of arguments.
18.2 The C Standard Library
The C standard library is a major extension of the C programming language. It provides support for input/output, character string manipulations, mathematical functions, file access functions, and various system utilities that are not specifically required for a single program but are generally useful in many programs. The standard library is intended to be a repository of useful, primitive functions that serve as components for building complex software. This component-based library approach is a characteristic of many programming languages: C++ and Java also have similar standard libraries of primitive functions. We provide a short description of some useful C library functions in Appendix D.9. The library's functions are typically written by designers of the compiler and operating system, and on many occasions they are optimized for the system on which they are installed.
To use a function defined within the C standard library, we must include the appropriate header file (.h file). The functions within the standard library are grouped according to their functionality. Each of these groups has a header file associated with it. For example, mathematical functions such as sin and tan use the common header file math.h. The standard I/O functions use the header file stdio.h. These header files contain, among other things, function declarations for the I/O functions and preprocessor macros relating to I/O. A library header file does not contain the source code for library functions.
If the header files do not contain source code, how does the machine code for, say, printf get added to our programs? Each library function called within a program is linked in when the executable image is formed. The object files containing the library functions are stored somewhere on the system and are accessed by the linker, which links together the various function binaries into a single executable program.
As an aside, programs can be linked dynamically. With certain types of libraries (dynamically linked libraries [DLLs] or shared libraries), the machine code for a library routine does not appear within the executable image but is "linked" on demand, while the program executes.
18.3 I/O, One Character at a Time
We'll start by examining two of the simplest I/O functions provided by the C library. The functions getchar and putchar perform input and output on a single character at a time. Input is read in as ASCII and output is written out as ASCII, in a manner similar to the IN and OUT TRAP routines of the LC-3.
18.3.1 I/O Streams
Conceptually, all character-based input and output is performed on streams. The sequence of ASCII characters typed by the user at the keyboard is an example of an input stream. As each character is typed, it is added to the end of the stream. Whenever a program reads keyboard input, it reads from the beginning of the stream. The sequence of ASCII characters printed by a program, similarly, is added to the end of the output stream. In other words, this stream abstraction allows us to further decouple the producer from the consumer, which is helpful because the two are usually operating at different rates (see Chapter 8). For example, if a program wants to perform some output, it adds characters to the end of the output stream without being required to wait for the output device to finish displaying the previous character. Many other popular languages such as C++ provide a similar stream-based abstraction for I/O.
In C the standard input stream is referred to as stdin and is mapped to the keyboard by default. The standard output stream is referred to as stdout and is mapped by default to the display. The functions getchar and putchar operate on these two streams.
18.3.2 putchar
The function putchar is the high-level language equivalent of the LC-3 OUT TRAP routine. The function putchar displays on the stdout output stream the ASCII value of the parameter passed to it. It performs no type conversions: the value passed to it is assumed to be ASCII and is added directly to the output stream. All the calls to putchar in the following code segment cause the same character (lowercase h) to be displayed. A putchar function call is treated like any other function call, except here the function resides within the standard library. The function declaration for putchar appears in the stdio.h header file. Its code will be linked into the executable during the compiler's link phase.
    char c = 'h';

    putchar(c);
    putchar('h');
    putchar(104);
18.3.3 getchar
The function getchar is the high-level language equivalent of the LC-3 IN TRAP function. It returns the ASCII value of the next input character appearing in the stdin input stream. By default, the stdin input stream is simply the stream of characters typed at the keyboard. In the following code segment, getchar returns the ASCII value of the next character typed at the keyboard. This return value is assigned to the variable c.
    char c;

    c = getchar();
18.3.4 Buffered 1/0
Run the C code in Figure 18.1 and you will notice something peculiar. The program prompts the user for the first input character and waits for that input to be typed in. Type in a single character (say z, for example) and nothing happens. The second prompt does not appear, as if the call to getchar has missed the keystroke. In fact, the program seems to make no progress at all until the Enter key is pressed. Such behavior seems unexpected considering that getchar is specified to read only a single character from the keyboard input stream.
#include <stdio.h>

int main()
{
    char inChar1;
    char inChar2;

    printf("Input character 1:\n");
    inChar1 = getchar();

    printf("Input character 2:\n");
    inChar2 = getchar();

    printf("Character 1 is %c\n", inChar1);
    printf("Character 2 is %c\n", inChar2);
}
Figure 18.1 An example of buffered input
This unexpected behavior is due to buffering of the keyboard input stream. On most computer systems, I/O streams are buffered. Every key typed on the keyboard is captured by the low-level operating system software and kept in a buffer, which is a small array, until it is released into the input stream. In the case of the input stream, the buffer is released when the user presses Enter. The Enter key itself appears as a newline character in the input stream. So in the example in Figure 18.1, if the user types the character A and presses Enter, the variable inChar1 will equal the ASCII value of A (which is 65) and the variable inChar2 will equal the ASCII value of newline (which is 10).
There is a good reason for buffering, particularly for keyboard input: Pressing the Enter key allows the user to confirm the input. Say you mistyped some input and wanted to correct it before the program detects it. You can edit what you type using the backspace and delete keys, and then confirm your input by pressing Enter.
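Because the newline produced by Enter stays in the input stream, a second read picks it up instead of the next character the user intended. One possible way around this, sketched below, is to discard the rest of the line after reading a character. The helper name read_char_skip_line is hypothetical, and the sketch uses fgetc, the library routine that reads one character from a stream passed as an argument; calling it with stdin reads from the same stream as getchar. The result is kept in an int so that the end-of-file value EOF can be told apart from ordinary characters.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the first character of the next input line
   and throws away the rest of that line, including the newline. */
int read_char_skip_line(FILE *in)
{
    int first;
    int c;

    assert(in != NULL);
    first = fgetc(in);
    c = first;
    /* Consume characters up to and including the newline left by Enter. */
    while (c != '\n' && c != EOF)
        c = fgetc(in);
    return first;
}
```

In a program like the one in Figure 18.1, each call to getchar() could be replaced by read_char_skip_line(stdin), so that the second read returns the second character typed rather than the newline.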
The output stream is similarly buffered. Observe by running the program in Figure 18.2.
1  #include <stdio.h>
2  #include <unistd.h>
3
4  int main()
5  {
6      putchar('a');
7
8      sleep(5);
9
10     putchar('b');
11     putchar('\n');
12 }
Figure 18.2 An example of buffered output
This program uses a new library function called sleep that suspends the execution of the program for approximately the number of seconds provided as the integer argument, which in this case is 5. This library function requires that we include the unistd.h header file. Run this code and you will notice that the output of the character a does not happen quite as you might expect. Instead of appearing prior to the five-second delay, the character a appears afterwards, only after the newline character releases the output buffer to the output stream. We say that the putchar('\n') causes output to be flushed. Add a putchar('\n') statement immediately after line 6 and the program will behave differently.
Despite the slightly complex behavior of buffered I/O streams, the underlying mechanisms used to make this happen are the IN and OUT TRAP routines described in Chapter 8. The buffering of streams is accomplished by extra layers of software surrounding the IN and OUT service routines.
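Writing a newline is not the only way to release the output buffer. The standard library also provides fflush, which asks that any buffered output on a stream be written out immediately. The following is a minimal sketch under that assumption; the helper name put_now is hypothetical, and it uses fputc, which writes one character to the stream given as its second argument (putchar(c) writes to stdout in the same way).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes one character to the stream out and flushes
   the stream's buffer right away. Returns 0 on success, -1 on error. */
int put_now(int c, FILE *out)
{
    assert(out != NULL);
    if (fputc(c, out) == EOF)
        return -1;
    /* fflush returns 0 when the buffered output was written successfully. */
    if (fflush(out) != 0)
        return -1;
    return 0;
}
```

If the putchar on line 6 of Figure 18.2 were replaced by put_now('a', stdout), we would expect the character a to appear before the delay on systems that buffer stdout by line.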
18.4 Formatted I/O
The functions putchar and getchar suffice for simple I/O tasks but are cumbersome for performing non-ASCII I/O. The functions printf and scanf perform more sophisticated formatted I/O, and they are designed to more conveniently handle I/O of integer and floating point values.
18.4.1 printf
The function printf writes formatted text to the output stream. Using printf, we can print out ASCII text embedded with values generated by the running program. The printf function takes care of all the type conversions necessary for this to occur. For example, the following code prints out the value of integer variable x. In doing so, printf must convert the integer value of x into a sequence of ASCII characters that can be embedded in the output stream.
    int x;

    printf("The value is %d\n", x);
Generally speaking, printf writes its first parameter to the output stream. The first parameter is the format string. It is a character string (i.e., of type char*) containing text to be displayed on the output device. Embedded within the format string are zero or more conversion specifications.
The conversion specifications indicate how to print out any of the parameters that follow the format string in the function call. Conversion specifications all begin with a % character. As their name implies, they indicate how the values of the parameters that follow the format string should be treated when converted to ASCII. In many of the examples we have encountered so far, integers have been printed out as decimal numbers using the %d specification. We could also use the %x specification to print integers as hexadecimal numbers, or %b to print them as binary numbers (represented as ASCII text, of course), although %b is not part of the ANSI C standard and is not supported by every C library. Other conversions include: %c causes a value to be interpreted as straight ASCII, and the %s specification is used for strings and causes characters stored consecutively in memory to be output (for this the corresponding parameter is expected to be of type char*). The specification %f interprets the corresponding parameter as a floating point number and displays it in a floating point format. What if we wanted to print out the % character itself? We use the sequence %%. See Appendix D for a full listing of conversion specifiers.
As mentioned in Chapter 11, special characters such as newline can also be embedded in the format string. The \n prints a new line and a \t character prints a tab; both are examples of these special characters. All special characters begin with a \ and they can appear anywhere within a format string. In order to print out a backslash character, we use a \\. See Table D.1 in the appendix for a list of special characters.
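Of the conversions described above, %b is the least portable, since a C library written to the ANSI C standard need not recognize it. Where it is unavailable, the conversion to binary text can be done by hand. The following is a minimal sketch; the helper name to_binary is hypothetical, and the choice to treat the value as a 16-bit quantity (the LC-3 word size) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes the binary digits of value into buf as ASCII
   text, without leading zeros, and returns the number of digits written.
   buf must have room for 17 characters (16 digits plus the '\0'). */
int to_binary(unsigned int value, char *buf)
{
    int bit;
    size_t n = 0;

    assert(buf != NULL);
    value &= 0xFFFFu;   /* keep only the low 16 bits */
    for (bit = 15; bit >= 0; bit--) {
        int digit = (value >> bit) & 1u;
        /* Skip leading zeros, but always emit the final digit. */
        if (digit || n > 0 || bit == 0)
            buf[n++] = (char)('0' + digit);
    }
    buf[n] = '\0';
    return (int)n;
}
```

The resulting string can then be printed with the %s specification, for example printf("%s\n", buf).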
Here are some examples of various format specifications:
    int a = 102;
    int b = 65;
    char c = 'z';
    char banner[10] = "Hola!";
    double pi = 3.14159;

    printf("The variable 'a' decimal : %d\n", a);
    printf("The variable 'a' hex : %x\n", a);
    printf("The variable 'a' binary : %b\n", a);
    printf("'a' plus 'b' as character : %c\n", a + b);
    printf("Char %c.\t String %s\n Float %f\n", c, banner, pi);
The function printf begins by examining the format string a single character at a time. If the current character is not a % or \, then the character is directly written to the output stream. (Recall that the output stream is buffered so the output might not appear on the display until a new line is written.) If the character is a \, then the next character indicates the particular special character to print out. For instance, the escape sequence \n indicates a newline character. (Strictly speaking, a C compiler translates the escape sequences in a string literal when the program is compiled, so the format string that printf receives already contains the newline character itself.) If the current character is a %, indicating a conversion specification, then the next character indicates how the next pending parameter should be interpreted. For instance, if the conversion specification is a %d and the next pending parameter is the bit pattern 0000000001101000, then the number 104 is written to the output stream. If the conversion character is a %c, then the character h is written. A different value is printed if %f is the conversion specification. The conversion specifier indicates to printf how the next parameter should be interpreted. It is important to realize that, within the printf routine, there is no relationship between a conversion specification and the type of a parameter. The programmer is free to choose how things are to be interpreted as they are displayed to the screen.
Question: What happens with the following function call?
    printf("The value of nothing is %d\n");
There is no argument corresponding to the %d specification. When the printf routine is called, it assumes the correct number of values were written onto the stack, so it blindly reads a value off the stack for the %d spec, assuming it was intentionally placed there by the caller. Here, a garbage value is displayed to the screen. However, it is displayed in decimal.
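The earlier point that the conversion specification, and not the parameter, decides how a bit pattern is displayed can be sketched with a single value formatted three ways. The helper name show_three_ways is hypothetical, and the sketch uses snprintf, a relative of printf that writes the formatted text into a character array instead of the output stream, so that the result can be inspected by the program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats the same value as decimal, hexadecimal, and
   an ASCII character into buf. Returns the number of characters produced
   (not counting the terminating '\0'), as reported by snprintf. */
int show_three_ways(int value, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    /* %d: decimal digits, %x: hexadecimal digits, %c: the ASCII character */
    return snprintf(buf, size, "%d %x %c", value, (unsigned int)value, value);
}
```

For the value 104 (the bit pattern 0000000001101000 used in the text), the array receives the text 104 68 h.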
18.4.2 scanf
The function scanf is used to read formatted ASCII data from the input stream. A call to scanf is similar to a call to printf. Both calls require a format string as the first argument followed by a variable number of other arguments. Both functions are controlled by characters within the format string. The function scanf differs in that all arguments following the format string must be pointers. As we discussed in Chapter 16, scanf must be able to access the original locations of the objects in memory in order to assign new values to them.
The format string for scanf contains ASCII text and conversion specifications, just like the format string for printf. The conversion characters are similar to those used for printf. A table of these specifications can be found in Appendix D. Essentially, the format string represents the format of the input stream. For example, the format string "%d" indicates to scanf that the next sequence of non-white space characters (white space is defined as spaces, tabs, new lines, carriage returns, vertical tabs, and form feeds) is a sequence of digits in ASCII representing an integer in decimal notation. After this decimal number is read from the input stream, it is converted into an integer and stored in the corresponding argument. Since scanf modifies the values of the variables passed to it, arguments are passed by reference using the & operator. In addition to conversion specifications, the format string also can contain plain text, which scanf tries to match with the input stream. We use the following code to demonstrate.
    char name[100];
    int month, day, year;
    double gpa;

    printf("Enter : lastname birthdate grade_point_average\n");
    scanf("%s %d/%d/%d %lf", name, &month, &day, &year, &gpa);

    printf("\n");
    printf("Name : %s\n", name);
    printf("Birthday : %d/%d/%d\n", month, day, year);
    printf("GPA : %f\n", gpa);
In this scanf statement, the first specification is a %s that scans a string of characters from the input stream. In this context, all characters starting from the first non-white space character and ending with the next white space character (conceptually, the next word in the input stream) are stored in memory starting at the address of name. A \0 character is automatically added to signify the end of the string. Since the argument name is an array, it is automatically passed by reference, that is, the address of the first element of the array is passed to scanf.
The next specification is for a decimal number, %d. Now, scanf expects to find a sequence of digits (at least one digit) as the next set of non-white space characters in the standard input stream. Characters from standard input are analyzed, white space characters are discarded, and the decimal number (i.e., a sequence of digits terminated by a nondigit) is read in. The number is converted from a sequence of ASCII characters into a binary integer and stored in the memory location indicated by the argument &month.
The next input field is the ASCII character /. Now, scanf expects to find this character, possibly surrounded by white space, in the input stream. Since this input field is not a conversion specification, it is not assigned to any variable. Once it is read in from the input stream, it is discarded, and scanf moves on to the next field of the format string. Similarly, the next three input fields %d/%d read in two decimal numbers separated by a /. These values are converted into integers and are assigned to the locations indicated by the pointers appearing as the next two arguments (which correspond to the addresses of the variables day and year).
The last field in the format string specifies that the input stream contains a long floating point number, which is the specification used to read in a value of type double. For this specifier, scanf expects to see a sequence of decimal numbers, and possibly a decimal point, possibly an E or e signifying exponential notation, in the input stream (see Appendix D.2.4). This field is terminated once a nondigit (excluding the first E, or the decimal point or a plus or minus sign for the fraction or exponent) or white space is detected. The scanf routine takes this sequence of ASCII characters and converts them into a properly expressed, double-precision floating point number and stores it into gpa.
Once it is done processing the format string, scanf returns to the caller. It also returns an integer value. The number of format specifications that were successfully scanned in the input stream is passed back to the caller. In this case, if everything went correctly, scanf would return the value 5. In the preceding code example, we chose to ignore the return value.
So, for example, the following line of input yields the following output:
    Enter : lastname birthdate grade_point_average
    Mudd 02/16/69 3.02

    Name : Mudd
    Birthday : 2/16/69
    GPA : 3.02
Since scanf ignores white space for this format string, the following input stream yields the same results. Remember, newline characters are considered white space.
    Enter : lastname birthdate grade_point_average
    Mudd 02
    /
    16 / 69
    3.02

    Name : Mudd
    Birthday : 2/16/69
    GPA : 3.02
What if the format of the input stream does not match the format string? For instance, what happens with the following stream?
    Enter : lastname birthdate grade_point_average
    Mudd 02 16 69 3.02
Here, the input stream does not contain the / characters encoded in the format string. In this case, scanf returns the value 2, since the variables name and month are correctly assigned before the mismatch between the format string and the input stream is detected. The remaining variables go unmodified. Since the input stream is buffered, unused input is not discarded, and subsequent reads of the input stream begin where the last call left off.
If the next two reads of the input stream are
    a = getchar();
    b = getchar();
what do a and b contain? The answer, ' ' (the space character) and 1, should be no surprise.
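Since the value returned by scanf tells us how many fields were assigned, a program can use it to detect malformed input rather than ignore it. The following is a minimal sketch of that check. It uses sscanf, a relative of scanf that reads from a character string instead of the input stream, so that the input can be fixed within the program; the structure type student and the helper name parse_student are hypothetical, and the width in %99s is an added safeguard that keeps a long name from overrunning the array.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type for the fields read in the example above. */
struct student {
    char name[100];
    int month, day, year;
    double gpa;
};

/* Parses one line of text using the same layout as the scanf call above.
   Returns the number of fields successfully assigned (0 to 5). */
int parse_student(const char *line, struct student *s)
{
    assert(line != NULL && s != NULL);
    /* %99s limits the name to the space available in s->name. */
    return sscanf(line, "%99s %d/%d/%d %lf",
                  s->name, &s->month, &s->day, &s->year, &s->gpa);
}
```

A caller would compare the result against 5 and treat any smaller value as an input error. With the line Mudd 02/16/69 3.02 the function returns 5; with Mudd 02 16 69 3.02 it returns 2, matching the discussion above.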
18.4.3 Variable Argument Lists
By now, you might have noticed something different about the functions printf and scanf from all other functions we have described thus far. The two functions have a variable number of arguments passed to them. The number of arguments passed to printf and scanf depends on the number of items being printed or scanned. We say such functions have variable argument lists.
There is a one-to-one correspondence between each conversion specification in the format string and each argument that appears after the format string in such function calls. The following printf statement is from a previous example:
    printf("Char %c.\t String %s\n Float %f\n", c, banner, pi);
The format string contains three format specifications; therefore, three arguments follow it in the function call. The %c spec in the string is associated with the first argument that follows (the variable c). The %s is associated with banner, and %f with pi. There are three values to be printed; therefore, this call contains four arguments altogether. If we want to print five values, the function call contains six arguments.
Recall from Chapter 14 that our LC-3 calling convention pushed items onto the run-time stack from right to left of the order in which they appear on the function call. This places the pointer to the format string immediately at the top of the stack when printf or scanf takes over. Since it is the leftmost argument, it will always be the last item pushed onto the stack before the function call occurs. Once printf or scanf takes over, they can access the first parameter directly off the top of the stack. Once this parameter (which is the format string) is analyzed, the functions can determine the other parameters on the stack. If the arguments on a function call were pushed from left to right, it would be much more difficult for printf and scanf to discern the location of the format string parameter. Figure 18.3 shows two diagrams of the run-time stack. In diagram (a), the arguments to the call for printf are passed from right to left and in (b) from left to right. Consider for which case the resulting LC-3 code for printf will be simpler. In version (a), the offset of the format string from the stack pointer will always be zero, regardless of the number of other parameters on the stack. In version (b), the offset of the format string from the stack pointer depends on the number of parameters on the stack.
[Figure 18.3: run-time stack diagrams for the call printf("%d %d %d\n", x, y, z); with memory drawn from x0000 at the top to xFFFF at the bottom. In (a), R6 points to the pointer to the format string, followed by x, y, and z and then the activation record for the previous function. In (b), R6 points to z, followed by y, x, and the pointer to the format string.]
Figure 18.3 Subfigure (a) shows the stack if the arguments to the printf call are pushed from right to left. Subfigure (b) shows the stack if the arguments are pushed left to right.
The format string, like all other strings embedded within a program's source code, is stored in a special region of memory reserved for constants, or literal values.
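In C source code, a function with a variable argument list is written using the macros in the standard header stdarg.h. The following is a minimal sketch of a function that, like printf, lets its first argument describe the arguments that follow. The function name sum_by_format is hypothetical; it recognizes only %d and simply adds up the corresponding int arguments.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: adds up one int argument for every "%d" found in
   format. Like printf, it trusts the format string to describe the
   arguments that follow it. */
int sum_by_format(const char *format, ...)
{
    va_list args;
    int total = 0;
    size_t i;

    assert(format != NULL);
    va_start(args, format);        /* begin just after the last named parameter */
    for (i = 0; format[i] != '\0'; i++) {
        if (format[i] == '%' && format[i + 1] == 'd') {
            total += va_arg(args, int);   /* fetch the next pending argument */
            i++;                          /* skip the 'd' */
        }
    }
    va_end(args);
    return total;
}
```

As with printf, nothing checks that the caller supplied as many arguments as the format string claims. A call such as sum_by_format("%d %d", 1) would read a value that was never passed.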
18.5 I/O from Files
Say we wanted to process a large set of data, such as the daily closing price of IBM stock for the last 20 years. To ask the user to type this via keyboard would render it very "user-unfriendly." Instead, we would want the program to read the data off a file on disk, and possibly write its output to disk. I/O in C is based on streams, as we described earlier, and these streams are conceptually all bound to files.
That is, the functions printf and scanf are in actuality special cases of more general-purpose C I/O functions. These two functions operate specifically on two special files called stdin and stdout. In C, stdin and stdout are mapped by default to the keyboard and the display.
The general-purpose version of printf is called fprintf, and the general-purpose version of scanf is called fscanf. The functions fprintf and fscanf work like their counterparts, with the main difference being that they allow us to specify the stream on which they act. For example, we can inform fprintf to write its output to a specific file on disk. Let's examine how this can be accomplished.
Before we can perform file I/O, we need to declare a file pointer for each file we want to manipulate. Typically, files are stored on the file system of the computer system. In C, we can declare a file pointer called infile as follows:
    FILE *infile;
Here we are declaring a pointer to something of type FILE. The type FILE is defined within the header file stdio.h. Its details are not important for our discussion.
Once the file pointer is declared, we need to map it to a file on the computer's file system. The C library call fopen performs this mapping. Each fopen call requires two arguments: the name of the file to open and the description of what type of operation we want to perform on the file. An example follows.
    FILE *infile;

    infile = fopen("ibm_stock_prices", "r");
The first argument to fopen is the string ibm_stock_prices, which is the name of the file to open. The second argument is the operation we want to perform on this file. Several useful modes are "r" for reading, "w" for writing (a file opened with this mode will lose its previous contents), "a" for appending (here, previous contents are not lost; new data is added to the end of the file), and "r+" for reading and writing. Note that both arguments must be character strings; therefore, they are surrounded by double quotes in this example. In this case, we are opening the file called "ibm_stock_prices" for reading.
If the fopen call is successful, the function returns a file pointer to the physical file. If the open for some reason fails (such as the file could not be found), then the function returns a null pointer. Recall that a null pointer is an invalid pointer that has the value NULL. It is always good practice to check if the fopen call was successful.
    FILE *infile;

    infile = fopen("ibm_stock_prices", "r");

    if (infile == NULL)
        printf("fopen unsuccessful!\n");
Now with the file pointer properly mapped to a physical file, we can use fscanf and fprintf to read and write it just as we used printf and scanf to read and write the standard devices. The functions fscanf and fprintf both require a file pointer as their first argument to indicate on which stream the operations are to be performed. The example in Figure 18.4 demonstrates.
#include <stdio.h>
#define LIMIT 10000

int main()
{
    FILE *infile;
    FILE *outfile;
    double prices[LIMIT];
    char answer[10];
    int i = 0;

    infile = fopen("ibm_stock_prices", "r");
    outfile = fopen("buy_hold_or_sell", "w");

    if (infile != NULL && outfile != NULL) {
        /* Read the input data; the limit is tested first so that */
        /* prices[i] is never written beyond the end of the array */
        while (i < LIMIT && (fscanf(infile, "%lf", &prices[i]) != EOF))
            i++;

        printf("%d prices read from the data file", i);

        /* Process the data ... */

        /* Write the output */
        fprintf(outfile, "%s", answer);
    }
    else {
        printf("fopen unsuccessful!\n");
    }
}
Figure 18.4 An example of a program that performs file I/O
Here, we are reading from an ASCII text file called ibm_stock_prices and writing to a file called buy_hold_or_sell. The input file contains floating point data items separated by white space. Even though the file can contain more, at most 10,000 items are read by this program using fscanf. The fscanf function returns a special value when no more data can be read from the input file, indicating the end of file has been reached. We can check the return value of fscanf against this special value, which is defined as the preprocessor macro EOF. The condition on the while loop causes it to terminate if EOF is encountered or if the limit of input values is reached. After reading the input file, the program processes the input data, and the output file is written with the value of the string answer.
The function printf is equivalent to calling fprintf using stdout as the file pointer. Likewise, scanf is equivalent to calling fscanf using stdin.
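The reading loop of a program such as the one in Figure 18.4 can be written as a small function that works on any stream. The following is a minimal sketch; the function name read_prices is hypothetical. Two choices here are assumptions made for robustness: the array bound is tested before fscanf is called, so no value is stored past the end of the array, and the return value is compared with 1 (one value converted) rather than with EOF, so the loop also stops if the file contains text that is not a number.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to limit floating point values from the
   stream in into prices. Returns the number of values stored. */
int read_prices(FILE *in, double prices[], int limit)
{
    int i = 0;

    assert(in != NULL && prices != NULL);
    /* fscanf returns 1 when one value was converted; it returns EOF at end
       of file and 0 when the next characters do not form a number. */
    while (i < limit && fscanf(in, "%lf", &prices[i]) == 1)
        i++;
    return i;
}
```

Because scanf behaves like fscanf on stdin, calling read_prices(stdin, prices, LIMIT) would read the same kind of data from the keyboard.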
18.6 Summary
In this chapter, we examined the C facilities for performing input and output. Like many other current programming languages, C provides no direct support for input and output. Rather, standard library functions are provided for I/O. At their core, these functions perform I/O one character at a time using the IN and OUT routines supported by the underlying machine.
The key concepts that you should take away from this chapter are:
• Input and output on streams. Modern programming languages create a useful abstraction for thinking about I/O. Input and output occur on streams. The producer adds data to the stream, and the consumer reads data from the stream. With this relationship, both can operate at their own rate without waiting for the other to be ready to conduct the I/O. For example, a program generating output for the display writes data into the output stream without necessarily waiting for the display to keep pace.
• The four basic I/O functions. We discuss the operation, at a fairly detailed level, of four basic I/O functions: putchar, getchar, printf, and scanf. The latter two functions require the use of variable argument lists, which our LC-3 calling convention can easily handle because of the order in which we push arguments onto the run-time stack.
• File I/O. C treats all I/O streams as file I/O. Functions like printf and scanf are special cases where the I/O files are the standard output and input devices. The more general functions fprintf and fscanf enable us to specify a file pointer to which the corresponding operations are to be performed. We can bind a file pointer to a physical file on the file system using fopen.
18.1 Write an I/O function call to handle the following tasks. All can be handled by a single call.
a. Print out an integer followed by a string followed by a floating point number.
b. Print out a phone number in (XXX)-XXX-XXXX format. Internally, the phone number is stored as three integers.
c. Print out a student ID number in XXX-XX-XXXX format. Internally, the ID number is stored as three character strings.
d. Read a student ID number in XXX-XX-XXXX format. The number is to be stored internally as three integers.
e. Read in a line of input containing Last name, First name, Middle initial age sex. The name fields are separated by commas. The middle initial and sex should be stored as characters. Age is an integer.
18.2 What does the value returned by scanf represent?
18.3 Why is buffering of the keyboard input stream useful?
18.4 What must happen when a program tries to read from the input stream but the stream is empty?
18.5 Why does the following code print out a strange value (such as 1073741824)?
float x = 192.27163;
printf("The value of x is %d\n", x);
18.6 What is the value of input for the following function call:
scanf("%d", &input);
if the input stream contains
This is not the input you are looking for.
18.7 Consider the following program:
#include <stdio.h>
int main()
{
  int x = 0;
  int y = 0;
  char label[10];

  scanf("%d %d", &x, &y);
  scanf("%s", label);
  printf("%d %d %s\n", x, y, label);
}
a. What gets printed out if the input stream is 46 29 BlueMoon?
b. What gets printed out if the input stream is 46 BlueMoon?
c. What gets printed out if the input stream is 111 999 888?
18.8 Write a program to read in a C source file and write it back to a file called "condensed_program" with all white space removed.
18.9 Write a program to read in a text file and provide a count of
a. The number of strings in the file, where a string begins with a non-white space character and ends with a white space character.
b. The number of words in the file, where a word begins with an alphabetic character (e.g., a-z or A-Z) and ends with a nonalphabetic character.
c. The number of unique words in the file. Words are as defined in Part b. The set of unique words has no duplicates.
d. The frequency of words in order of most frequent to least frequent. In other words, analyze the text file, count the number of times each word occurs, and display these counts from most frequent word to least frequent.
Chapter 19 Data Structures
19.1 Introduction
C, at its core, provides support for three fundamental types of data: integers, characters, and floating point values.[1] That is, C natively supports the allocation of variables of these types and natively supports operators that manipulate these types, such as + for addition and * for multiplication. As we traversed the topics in the second half of this textbook, we saw the need for extending these basic types to include pointers and arrays. Both pointers and arrays are derived from the three fundamental types. Pointers point to one of the three types; we can declare arrays of int, char, or double.
Ultimately, though, the job of the programmer is to write programs that deal with real-world objects, such as an aircraft wing or a group of people or a pod of migrating whales. The problem lies in the reality that integers, characters, and floating point values are the only things that the underlying computing system can deal with. The programmer must map these real-world objects onto these primitive types, which can be burdensome. But the programming language can assist in making that bridge. Providing support for describing real-world objects and specifying operations upon them is the basis for object orientation.
Orienting a program around the objects that it manipulates rather than the primitive types that the hardware supports is the basic precept of object-oriented programming. We take a small step toward object orientation in this chapter by examining how a C programmer can build a type that is a combination of the more basic types. This aggregation is called a structure in C. Structures provide the programmer with a convenient way of representing objects that are best represented by multiple values. For example, an employee might be represented as a structure containing a name (character string), job title (character string), department (perhaps integer), and employee ID (integer) within a corporate database program. In devising such a database program we might use a C structure.
[1] Enumerations are another fundamental type that is closely tied to integer types.
The main theme of this chapter is C's support for advanced data structures. First, we examine how to create structures in C and examine a simple program that manipulates an array of structures. Second, we examine dynamic memory allocation in C. Dynamic allocation is not directly related to the concept of structures, but it is a component we use for the third item of this chapter, linked lists. A linked list is a fundamental (and common) data organization that is similar to an array (both store collections of data items) but has a different organization for its data items. We will look at functions for adding, deleting, and searching for data items within linked lists.
19.2 Structures
Some things are best described by an aggregation of fundamental types. For such objects, C provides the concept of structures. Structures allow the programmer to define a new type that consists of a combination of fundamental data items such as int, char, and double, as well as pointers to them and arrays of them. Structure variables are declared in the same way variables of fundamental data types are declared. Before any structure variables are declared, however, the organization and naming of the data items within the structure must be defined.
For example, in representing an airborne aircraft, say for a flight simulator or for a program that manages air traffic over Chicago, we would want to describe several flight characteristics that are relevant for the application at hand. The aircraft's flight number is useful for identification, and since this would typically be a sequence of digits and characters, we could use a character string for representing it. The altitude, longitude, latitude, and heading of the flight are also useful, all of which we might store as integers. Airspeed is another characteristic that would be important, and it is best represented as a double-precision floating point number. Following are the variable declarations for describing a single aircraft in flight.
char flightNum[7];   /* Max 6 characters     */
int altitude;        /* in meters            */
int longitude;       /* in tenths of degrees */
int latitude;        /* in tenths of degrees */
int heading;         /* in tenths of degrees */
double airspeed;     /* in kilometers/hour   */
If the program modeled multiple flights, we would need to declare a copy of these variables for each one, which is tedious and could result in excessively long code. C provides a convenient way to aggregate these characteristics into a single type via the struct construct, as follows:
struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
};
In the preceding declaration, we have created a new type containing six member elements. We have not yet declared any storage; rather we have indicated to the compiler the composition of this new type. We have given the structure the tag flightType, which is necessary for referring to the structure in other parts of the code.
To declare a variable of this new type, we do the following:
struct flightType plane;
This declares a variable called plane that consists of the six fields defined in the structure declaration but otherwise gets treated like any other variable.
We can access the individual members of this structure variable using the following syntax:
struct flightType plane;

plane.airspeed = 800.00;
plane.altitude = 10000;
Each member can be accessed using the variable’s name as the base name followed by a dot . followed by the member name.
The variable declaration plane gets allocated onto the stack if it is a local variable and occupies a contiguous region of memory large enough to hold all member elements. In this case, if each of the fundamental types occupies one LC-3 memory location, the variable plane would occupy 12 locations.
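On machines other than the LC-3, the fundamental types usually differ in size and the compiler may insert padding between members, so the structure need not occupy 12 units. The following sketch, which assumes a hosted C compiler with the standard headers available, asks the compiler for the actual size and member offsets. The function name PrintFlightLayout is an assumption made for this example, and the values printed depend on the system.

```c
#include <stddef.h>   /* offsetof */
#include <stdio.h>

struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
};

/* Hypothetical helper: prints the size of the structure and the
   offset of two members, as laid out by the compiler in use. */
void PrintFlightLayout(void)
{
  printf("size of struct flightType: %zu bytes\n",
         sizeof(struct flightType));
  printf("offset of altitude:        %zu\n",
         offsetof(struct flightType, altitude));
  printf("offset of airspeed:        %zu\n",
         offsetof(struct flightType, airspeed));
}
```

Whatever the sizes turn out to be, the members appear in memory in the order they are declared, with flightNum at offset 0.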
The allocation of the structure is straightforward. A structure is allocated the same way a variable of a basic data type is allocated: locals (by default) are allocated on the run-time stack, and globals are allocated in the global data section. Figure 19.1 shows a portion of the run-time stack when a function that contains the following declarations is invoked.
int x;
struct flightType plane;
int y;
Figure 19.1 The run-time stack showing an allocation of a variable of structure type. [Stack contents, from the top of the stack downward: y, plane.flightNum[0] through plane.flightNum[6], plane.altitude, plane.longitude, plane.latitude, plane.heading, plane.airspeed, x.]
Generically, the syntax for a structure declaration is as follows:
struct tag {
  type1 member1;
  type2 member2;
  ...
  typeN memberN;
} identifiers;
The tag provides a handle for referring to the structure later in the code, as in the case of later declaring variables of the structure's format. The list of members defines the organization of a structure and is syntactically a list of declarations. A member can be of any type, including another structure type. Finally, we can optionally include identifiers in a structure's declaration to actually declare variables of that structure's type. These appear after the closing brace of the structure declaration, prior to the semicolon.
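As an illustration of the last two points, the following sketch declares a structure that has another structure as a member and declares two variables as part of the structure declaration itself. The type names positionType and trackedFlightType, the variables leadPlane and wingPlane, and the function SetLeadPosition are all hypothetical names chosen for this example.

```c
/* Hypothetical types for illustration only. */
struct positionType {
  int longitude;                  /* in tenths of degrees */
  int latitude;                   /* in tenths of degrees */
};

struct trackedFlightType {
  char flightNum[7];              /* Max 6 characters */
  struct positionType position;   /* a member of another structure type */
  int altitude;                   /* in meters */
} leadPlane, wingPlane;           /* identifiers: two variables declared here */

/* Stores a position in leadPlane and returns the stored latitude.
   Nested members are reached by applying the dot operator twice. */
int SetLeadPosition(int longitude, int latitude)
{
  leadPlane.position.longitude = longitude;
  leadPlane.position.latitude = latitude;
  return leadPlane.position.latitude;
}
```

Because leadPlane and wingPlane are declared outside any function, they are allocated in the global data section rather than on the run-time stack.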
19.2.1 typedef
C structures enable programmers to define their own types. C typedef allows programmers to name their own types. It has the general form
typedef type name;
This statement causes the identifier name to be synonymous with the type type, which can be any basic type or aggregate type (e.g., a structure). So for instance,
typedef int Color;
allows us to define variables of type Color, which will now be synonymous with integer. Using this definition, we can declare (for a bitmapped image, for example):
Color pixels[500];
The typedef declaration is particularly useful when dealing with structures. For example, we can create a name for the structure we defined earlier:
struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
};

typedef struct flightType Flight;
Now we can declare variables of this type by using the type name Flight. For example,
Flight plane;
is now equivalent to the declaration struct flightType plane; that we used previously.
The typedef declaration provides no additional functionality. However, it gives clarity to code, particularly code heavy with programmer-defined types. Well-chosen type names connote properties of the variables they declare even beyond what can be expressed by the names of the variables themselves.
19.2.2 Implementing Structures in C
Now that we have seen the technique for declaring and allocating variables of structure type (and have given them new type names), we focus on accessing the member fields and performing operations on them. For example, in the following code, the member altitude of the structure variable of type Flight is accessed.
int x;
Flight plane;
int y;

plane.altitude = 0;
Here, the variable plane is of type Flight, meaning it contains the six member fields we defined previously. The member field labeled altitude is accessed using the variable's name followed by a period, followed by the member field label. The compiler, knowing the layout of the structure, generates code that accesses the structure's member field using the appropriate offset. Figure 19.1 shows the layout of the portion of the activation record for this function. The compiler keeps track, in its symbol table, of the position of each variable in relation to the base pointer R5, and if the variable is an aggregate data type, it also tracks the position of each field within the variable. Notice that for the particular reference plane.altitude = 0;, the compiler must generate code to access the second variable on the stack and the second member element of that variable.
Following is the code generated by the LC-3 C compiler for the assignment statement plane.altitude = 0;.
AND R1, R1, #0     ; zero out R1
ADD R0, R5, #-12   ; R0 contains base address of plane
STR R1, R0, #7     ; plane.altitude = 0;
19.3 Arrays of Structures
Let’s say we are writing a piece of software to determine if any flights over the skies of Chicago are in danger of colliding. For this program, we will use the Flight type that we previously defined. If the maximum number of flights that will ever simultaneously exist in this airspace is 100 planes, then the following declaration is appropriate:
Flight planes[100];
This declaration is similar to the simple declaration int d[100], except instead of declaring 100 integer values, we have declared a contiguous region of memory containing 100 structures, each of which is composed of the six members indicated in the declaration struct flightType. The reference planes[12], for example, would refer to the thirteenth object in the region of 100 such objects in memory. Each object contains enough storage for its six constituent member elements.
Each element of this array is of type Flight and can be accessed using standard array notation. For example, accessing the flight characteristics of the first flight can be done using the identifier planes[0]. Accessing a member field is done by accessing an element of the array and then specifying a field: planes[0].heading. The following code segment provides an example. It finds the average airspeed of all flights in the airspace monitored by the program.
int i;
double sum = 0;
double averageAirSpeed;

for (i = 0; i < 100; i++)
  sum = sum + planes[i].airspeed;
averageAirSpeed = sum / 100;
We can also create pointers to structures. The following declaration creates a pointer variable that contains the address of a variable of type Flight.
Flight *planePtr;
We can assign this variable as we would any pointer variable.
planePtr = &planes[34];
If we want to access any of the member fields pointed to by this pointer variable, we could use an expression such as the following:
(*planePtr).longitude
With this cumbersome expression, we are dereferencing the variable planePtr. It points to something of type Flight. Therefore when planePtr is dereferenced, we are accessing an object of type Flight. We can access one of its member fields by using the dot operator (.). As we shall see, referring to a structure with a pointer is a common operation, and since this expression is not very straightforward to grasp, a special operator has been defined for it. The previous expression is equivalent to
planePtr->longitude
That is, the operator -> is like the dereference operator *, except it is used for dereferencing member elements of a structure type.
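The equivalence of the two notations can be checked with a short sketch. The two function names below are assumptions made for this example; each returns the longitude of the flight that its argument points to, one using the dereference and dot operators and the other using the arrow operator.

```c
typedef struct flightType Flight;

struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
};

/* Dereference the pointer, then select the member with the dot. */
int LongitudeViaDot(Flight *planePtr)
{
  return (*planePtr).longitude;
}

/* The same access written with the arrow operator. */
int LongitudeViaArrow(Flight *planePtr)
{
  return planePtr->longitude;
}
```

The parentheses in (*planePtr).longitude are required because the dot operator has higher precedence than the dereference operator.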
Now we are ready to put our discussion of structures to use by presenting an example of a function that manipulates an array of structures. This example examines the 100 flights that are airborne to determine if any pair of them are potentially in danger of colliding. To do this, we need to examine the position, altitude, and heading of each flight to determine if there exists the potential of collision. In Figure 19.2, the function PotentialCollisions calls the function Collide on each pair of flights to determine if their flight paths dangerously intersect. (This function is only partially complete; it is left as an exercise for you to write the code to more precisely determine if two flight paths intersect.)
Notice that PotentialCollisions passes Collide two pointers rather than the structures themselves. While it is possible to pass structures, passing pointers is likely to be more efficient because it involves less pushing of data onto the run-time stack; that is, in this case two pointers are pushed rather than 24 locations' worth of data for two objects of type Flight.
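Besides efficiency, the two ways of passing a structure differ in what the called function is able to change. The following sketch contrasts them; the function names ClimbByValue and ClimbByPointer are hypothetical and do not appear in the program of Figure 19.2.

```c
typedef struct flightType Flight;

struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
};

/* Receives a copy of the whole structure on the run-time stack.
   The change is made to the copy; the caller's variable is unaffected. */
void ClimbByValue(Flight plane, int meters)
{
  plane.altitude = plane.altitude + meters;
}

/* Receives only the address of the structure.
   The change is made to the caller's variable. */
void ClimbByPointer(Flight *plane, int meters)
{
  plane->altitude = plane->altitude + meters;
}
```

Calling ClimbByValue(plane, 500) leaves plane.altitude as it was, whereas ClimbByPointer(&plane, 500) raises it by 500.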
#include <stdio.h>
#define TOTAL_FLIGHTS 100

/* Structure definition */
struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
};

typedef struct flightType Flight;

int Collide(Flight *planeA, Flight *planeB);
void PotentialCollisions(Flight planes[]);

int Collide(Flight *planeA, Flight *planeB)
{
  if (planeA->altitude == planeB->altitude) {
    /** More logic to detect collision goes here **/
  }
  else
    return 0;
}

void PotentialCollisions(Flight planes[])
{
  int i;
  int j;

  for (i = 0; i < TOTAL_FLIGHTS; i++) {
    for (j = 0; j < TOTAL_FLIGHTS; j++) {
      if (Collide(&planes[i], &planes[j]))
        printf("Flights %s and %s are on collision course!\n",
               planes[i].flightNum, planes[j].flightNum);
    }
  }
}

Figure 19.2 An example function based on the structure Flight
19.4 Dynamic Memory Allocation
Memory objects (e.g., variables) in C programs are allocated to one of three spots in memory: the run-time stack, the global data section, or the heap. Variables declared local to functions are allocated during execution onto the run-time stack by default. Global variables are allocated to the global data section and are accessible by all parts of a program. Dynamically allocated data objects (objects that are created during run-time) are allocated onto the heap.
In the previous example, we declared an array that contained 100 objects, where each object was an aircraft in flight. But what if we wanted to create a flexible program that could handle as many flights as were airborne at any given moment, whether it be 2 or 20,000? One possible solution would be to declare the array assuming a large upper limit to the number of flights the program might encounter. This could result in a lot of potentially wasted memory space, or worse, we might underestimate the number of flights, which could have potentially devastating repercussions. A better solution is to dynamically adapt the size of the array based on the number of planes in the air. To accomplish this, we rely on the concept of dynamic memory allocation.
In a nutshell, dynamic memory allocation works as follows: A piece of code called the memory allocator manages an area of memory called the heap. Figure 19.3 is a copy of Figure 12.7; it shows the relationship of the various regions of memory, including the heap. During execution, a program can make requests to the memory allocator for contiguous pieces of memory of a particular size. The memory allocator then reserves this memory and returns a pointer to the newly reserved memory to the program. For example, if we wanted to store 1,000 flights' worth of data in our air traffic control program, we could ask the allocator for this space. If enough space exists in the heap, the allocator will return a pointer to it. Notice that the heap and the stack both grow toward each other. The size of the stack is based on the depth of the current function call, whereas the size of the heap is based on how much memory the memory allocator has reserved for the requests it has received.
Figure 19.3 The LC-3 memory map showing the heap region of memory. [Regions labeled between x0000 and xFFFF: program text (PC), global data section (R4), heap (for dynamically allocated memory), and run-time stack (R6, stack pointer; R5, frame pointer).]
A block of memory that is allocated onto the heap stays allocated until the programmer explicitly deallocates it by calling the memory deallocator. The deallocator adds the block back onto the heap for subsequent reallocation.
19.4.1 Dynamically Sized Arrays
Dynamic allocation in C is handled by the C standard library functions. In particular, the memory allocator is invoked by the function malloc. Let's take a look at an example that uses the function malloc:
int airbornePlanes;
Flight *planes;

printf("How many planes are in the air?");
scanf("%d", &airbornePlanes);

planes = malloc(24 * airbornePlanes);
The function malloc allocates a contiguous region of memory on the heap of the size in bytes indicated by the single parameter. If the heap has enough unclaimed memory and the call is successful, malloc returns a pointer to the allocated region.
Here we allocate a chunk of memory consisting of 24 * airbornePlanes bytes, where airbornePlanes is the number of planes in the air as indicated by the user. What about the 24? Recall that the type Flight is composed of six members: an array of 7 characters, 4 integers, and a double. Each character, integer, and double occupies a single two-byte location on the LC-3, so the 12 locations of each structure require 24 bytes of memory. As a necessary convenience for programmers, the C language supports a compile-time operator called sizeof. This operator returns the size, in bytes, of the memory object or type passed to it as an argument. For example, sizeof(Flight) will return the number of bytes occupied by a variable of type Flight, or 24. The programmer does not need to calculate the sizes of various data objects; the compiler can be instructed to perform the calculation.
If all the memory on the heap has been allocated and the current allocation cannot be accomplished, malloc returns the value NULL. Recall that the symbol NULL is a preprocessor macro symbol, defined to a particular value depending on the computer system, that represents a null pointer. It is good programming practice to check that the return value from malloc indicates the memory allocation was successful.
The function malloc returns a pointer. But what is the type of the pointer? In the preceding example, we are treating the pointer that is returned by malloc as a pointer to some variable of type Flight. Later we might use malloc to allocate an array of integers, meaning the return value will be treated as an int *. To enable this, malloc returns a generic data pointer, or void *, that needs to be type cast to the appropriate form upon return. That is, whenever we call the memory allocator, we need to instruct the compiler to treat the return value as being of a different type than was declared.
In the preceding example, we need to type cast the pointer returned by malloc to the type of the variable to which we are assigning it. Since we assigned the pointer to planes, which is of type Flight *, we therefore cast the pointer to type Flight *. To do otherwise makes the code less portable across different computer systems; most compilers generate a warning message because we are assigning a pointer value of one type to a pointer variable of another. Type casting causes the compiler to treat a value of one type as if it were of another type. To type cast a value from one type to a newType, we use the following syntax. The variable var should be of newType. For more information on type casting, refer to section D.5.11.
var = (newType) expression;
Given type casting and the sizeof operation and the error checking of the return value from malloc, the correct way to write the code from the previous example is:
int airbornePlanes;
Flight *planes;

printf("How many planes are in the air?");
scanf("%d", &airbornePlanes);

/* A more correctly written call to malloc */
planes = (Flight *) malloc(sizeof(Flight) * airbornePlanes);
if (planes == NULL) {
  printf("Error in allocating the planes array\n");
  ...
}
planes[0].altitude = ...
Since the region that is allocated by malloc is contiguous in memory, we can switch between pointer notation and array notation. Now we can use the expression planes[29] to access the characteristics of the 30th aircraft (provided that airbornePlanes was at least 30, of course). Notice that we smoothly switched from pointer notation to array notation; this flexibility has helped make C a very popular programming language. Other derivative languages, C++ in particular, keep this duality between pointers to contiguous memory and arrays.
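The pieces discussed so far (sizeof, the type cast, the check against NULL, and array notation on the returned pointer) can be combined in one self-contained sketch. The function name CreatePlanes, and the decision to give every member an initial value of zero, are assumptions made for this example rather than part of the air traffic control program.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct flightType Flight;

struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
};

/* Hypothetical helper: allocates an array of count flights on the heap.
   Returns NULL if count is not positive or the allocation fails. */
Flight *CreatePlanes(int count)
{
  Flight *planes;
  int i;

  if (count <= 0)
    return NULL;

  planes = (Flight *) malloc(sizeof(Flight) * count);
  if (planes == NULL) {
    printf("Error in allocating the planes array\n");
    return NULL;
  }

  /* Array notation works on the pointer returned by malloc. */
  for (i = 0; i < count; i++) {
    planes[i].flightNum[0] = '\0';
    planes[i].altitude = 0;
    planes[i].longitude = 0;
    planes[i].latitude = 0;
    planes[i].heading = 0;
    planes[i].airspeed = 0.0;
  }
  return planes;
}
```

After a successful call such as planes = CreatePlanes(30);, the expressions planes[29].altitude and (planes + 29)->altitude refer to the same member of the 30th aircraft.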
The function malloc is only one of several memory allocation functions in the standard library. The function calloc allocates memory and initializes it to the value 0. The function realloc attempts to grow or shrink previously allocated regions of memory. To use the memory allocation functions of the C standard library, we need to include the stdlib.h header file. Can you use realloc to create an array that adapts to the size of the data? For example, write a function AddPlane() that adds a plane if the current size of the planes array is too small. Likewise, write the function DeletePlane() for when the size of the array is larger than what is required.
A very important counterpart to the memory allocation functions is a function to deallocate memory and return it to the heap. This function is called free. It takes as its parameter a pointer to a region that was previously allocated by malloc (or calloc or realloc) and deallocates it. After a region has been freed, it is once again eligible for allocation. Why is deallocation necessary? As we shall see, there is a class of data structures that dynamically grow and shrink as the program executes. For the shrinking operation, we put allocated memory back on the heap so that we can use it again in subsequent allocations.
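The following sketch shows how calloc, realloc, and free fit together, using a plain array of integers rather than the planes array. The function name GrowArray is an assumption made for this example, and the sketch assumes newCount is positive and at least oldCount.

```c
#include <stdlib.h>

/* Hypothetical helper: grows a heap-allocated array of ints from
   oldCount to newCount elements and sets the added elements to 0.
   Returns the (possibly moved) array, or NULL if realloc fails,
   in which case the original array is untouched and still allocated. */
int *GrowArray(int *array, int oldCount, int newCount)
{
  int *bigger;
  int i;

  bigger = (int *) realloc(array, sizeof(int) * newCount);
  if (bigger == NULL)
    return NULL;

  /* realloc preserves the old contents but does not initialize
     the newly added region, so we clear it ourselves. */
  for (i = oldCount; i < newCount; i++)
    bigger[i] = 0;

  return bigger;
}
```

A typical use allocates the initial array with calloc(4, sizeof(int)), grows it with GrowArray when more room is needed, and finally passes the most recent pointer to free. Because realloc may move the region, the old pointer should not be used after a successful call.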
19.5 Linked Lists
Having discussed the notion of structures and the concept of dynamic memory allocation, we are now ready to introduce a fundamental data structure that is pervasive in computing. A linked list is similar to an array in that both can be used to store data that is best represented as a list of elements. In an array, each element (except the last) has a next element that follows it sequentially in memory. Likewise in a linked list, each element has a next element, but the next element need not be sequentially adjacent in memory. Rather, each element contains a pointer to the next element.
A linked list is a collection of nodes, where each node is one “unit” of data, such as the characteristics of an airborne aircraft from the previous section. In a linked list we connect these nodes together using pointers. Each node contains a pointer element that points to the next node in the list. Given a starting node, we can go from one node to another by following the pointer in each node. To create these nodes, we rely on C structures. A critical element for the structure that defines the nodes of a linked list is that it contains a member element that points to nodes like itself. The following code demonstrates how this is accomplished. We use the Flight type we defined in the previous sections. Notice that we have added a new member element to the structure definition. It is a pointer to a node of the same type.
typedef struct flightType Flight;

struct flightType {
  char flightNum[7];   /* Max 6 characters     */
  int altitude;        /* in meters            */
  int longitude;       /* in tenths of degrees */
  int latitude;        /* in tenths of degrees */
  int heading;         /* in tenths of degrees */
  double airspeed;     /* in kilometers/hour   */
  Flight *next;
};
Figure 19.4 Two representations for a linked list. [Panels: a linked list in abstract form, with a head and a tail; a linked list in memory, reached through the head pointer and ending in NULL.]
Like an array, a linked list has a beginning and an end. Its beginning, or head, is accessed using a pointer called the head pointer. The final node in the list, or tail, points to the NULL value. Figure 19.4 shows two representations of a linked list data structure: an abstract depiction where nodes are represented as blocks and pointers are represented by arrows, and a more physical representation that shows what the data structure might look like in memory.
Despite their similarities, arrays and linked lists have fundamental differences. An array can be accessed in random order. We can access element number 4, followed by element 911, followed by 45, for example. A simple linked list must be traversed sequentially starting at its head. If we wanted to access node 29, then we must start at node 0 (the head node) and then go to node 1, then to node 2, and so forth. But linked lists are dynamic in nature; additional nodes can be added or deleted without movement of the other nodes. While it is straightforward to dynamically size an array (see Section 19.4.1 on using malloc), it is much more costly to remove a single element in an array, particularly if it lies in the middle. Consider, for example, how you would remove the information for a plane that has just landed from the air traffic control program from Section 19.3. With a linked list we can dynamically add nodes to make room for more data, and we can delete nodes that are no longer required.
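Both properties can be seen in a small sketch. It uses a deliberately minimal node type, and the names Node, NodeAt, and RemoveAfter are hypothetical. Reaching node number index requires following that many pointers, but removing a node changes only a single pointer and moves no other node.

```c
#include <stddef.h>

typedef struct nodeType Node;   /* hypothetical minimal node */

struct nodeType {
  int value;
  Node *next;                   /* NULL in the last node */
};

/* Returns a pointer to node number index (the head is node 0),
   or NULL if the list has fewer nodes. Assumes index >= 0. */
Node *NodeAt(Node *head, int index)
{
  Node *current = head;

  while (current != NULL && index > 0) {
    current = current->next;
    index--;
  }
  return current;
}

/* Unlinks the node that follows previous. No other node moves.
   The removed node is not deallocated here; if it came from malloc,
   the caller is responsible for calling free on it. */
void RemoveAfter(Node *previous)
{
  if (previous != NULL && previous->next != NULL)
    previous->next = previous->next->next;
}
```

Removing an element from the middle of an array, by contrast, requires copying every later element one position toward the front.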
19.5.1 An Example
Say we want to write a program to manage the inventory at a used car lot. At the lot, cars keep coming and going, and the database needs to be updated continually: a new entry is created whenever a car is added to the lot and an entry deleted whenever a car is sold. Furthermore, the entries are stored in order by vehicle identification number so that queries from the used car salespeople can be handled quickly. The information we need to keep per car is as follows:
int vehicleID;     /* Unique identifier for a car */
char make[20];     /* Manufacturer                */
char model[20];    /* Model name                  */
int year;          /* Year of manufacture         */
int mileage;       /* in miles                    */
double cost;       /* in dollars                  */
Car *next;         /* Points to a car node        */
In reality, a vehicle ID is a sequence of characters and numbers and cannot be stored as a single int, but we store it as an integer to make the example simpler.
The frequent operations we want to perform (adding, deleting, and searching for entries) can be performed simply and quickly using a linked list data structure. Each node in the linked list contains all the information associated with a car in the lot, as shown. We can now define the node structure, which is then given the name Car using typedef:
typedef struct carType Car;

struct carType {
  int vehicleID;     /* Unique identifier for a car */
  char make[20];     /* Manufacturer                */
  char model[20];    /* Model name                  */
  int year;          /* Year of manufacture         */
  int mileage;       /* in miles                    */
  double cost;       /* in dollars                  */
  Car *next;         /* Points to a car node        */
};
Notice that this structure contains a pointer element that points to something of the same type as itself, or type Car. We will use this member element to point to the next node in the linked list. If the next field is equal to NULL, then the node is the last in the list.
int main()
{
  int op = 0;      /* Current operation to be performed. */
  Car carBase;     /* carBase an empty head node         */

  carBase.next = NULL;  /* Initialize the list to empty */

  printf("-------------------------\n");
  printf("=== Used car database ===\n");
  printf("-------------------------\n\n");

  while (op != 4) {
    printf("Enter an operation:\n");
    printf("1 - Car acquired. Add a new entry for it.\n");
    printf("2 - Car sold. Remove its entry.\n");
    printf("3 - Query. Look up a car's information.\n");
    printf("4 - Quit.\n");
    scanf("%d", &op);

    if (op == 1)
      AddEntry(&carBase);
    else if (op == 2)
      DeleteEntry(&carBase);
    else if (op == 3)
      Search(&carBase);
    else if (op == 4)
      printf("Goodbye.\n\n");
    else
      printf("Invalid option. Try again.\n\n");
  }
}

Figure 19.5 The function main for our used car database program
Now that we have defined the elementary data type and the organization of data in memory, we want to focus on the flow of the program, which we can do by writing the function main. The code is listed in Figure 19.5.
With this code, we create a menu-driven interface for the used car database. The main data structure is accessed using the variable carBase, which is of type Car. We will use it as a dummy head node, meaning that we will not be storing any information about any particular car within the fields of carBase; instead, we will use carBase simply as a placeholder for the rest of the linked list. Using this dummy head node makes the algorithms for inserting and deleting slightly simpler because we do not have to deal with the special case of an empty list. Initially, carBase.next is set equal to NULL, indicating that no data items are stored in the database. Notice that we pass the address of carBase whenever we call the functions to insert a new car in the list (AddEntry), to delete a car (DeleteEntry), and to search the list for a particular car (Search).
19.5 Linked Lists
Car *ScanList(Car *headPointer, int searchID)
{
  Car *previous;
  Car *current;

  /* Point to start of list */
  previous = headPointer;
  current = headPointer->next;

  /* Traverse list -- scan until we find a node with a  */
  /* vehicleID greater than or equal to searchID        */
  while ((current != NULL) && (current->vehicleID < searchID)) {
    previous = current;
    current = current->next;
  }

  /* The variable previous points to node prior to the  */
  /* node being searched for. Either current->vehicleID */
  /* equals searchID or the node does not exist.        */
  return previous;
}
Figure 19.6 A function to scan through the linked list for a particular vehicle ID
As we shall see, the functions AddEntry, DeleteEntry, and Search all rely upon a basic operation to be performed on the linked list: scanning the list to find a particular node. For example, when adding the entry for a new car, we need to know where in the list the entry should be added. Since the list is kept in sorted order of increasing vehicle ID numbers, any new car node added to the list must be placed prior to the first existing node with a larger vehicle ID. To accomplish this, we have created a support function called ScanList that traverses the list (which is passed as the first argument) searching for a particular vehicle ID (passed as the second argument). ScanList always returns a pointer to the node just before the node for which we are scanning. If the node we are scanning for is not in the list, then ScanList returns a pointer to the node just prior to the place in the list where the node would have resided. Why does ScanList return a pointer to the previous node? As we shall see, passing back the previous node makes inserting new nodes easier. The code for ScanList is listed in Figure 19.6.
Next we will examine the function to add a newly acquired car to the database. The function AddEntry gets information from the user about the newly acquired car and inserts a node containing this information into the proper spot in the linked list. The code is listed in Figure 19.7. The first part of the function allocates a Car-sized chunk of memory on the heap using malloc. If the allocation fails, an error message is displayed and the program exits using the exit library call, which terminates the program. The second part of the function reads input from the keyboard and assigns it to the proper fields within the new node. The third part performs the insertion by calling ScanList to find the place in the list to insert the new node. If the node already exists in the list, then an error message is displayed and the new node is deallocated by a call to the free library function.
void AddEntry(Car *headPointer)
{
  Car *newNode;       /* Points to the new car info      */
  Car *nextNode;      /* Points to car to follow new one */
  Car *prevNode;      /* Points to car before this one   */

  /* Dynamically allocate memory for this new entry. */
  newNode = (Car *) malloc(sizeof(Car));

  if (newNode == NULL) {
    printf("Error: could not allocate a new node\n");
    exit(1);
  }

  printf("Enter the following info about the car.\n");
  printf("Separate each field by white space:\n");
  printf("vehicle_id make model year mileage cost\n");

  scanf("%d %s %s %d %d %lf",
        &newNode->vehicleID, newNode->make, newNode->model,
        &newNode->year, &newNode->mileage, &newNode->cost);

  prevNode = ScanList(headPointer, newNode->vehicleID);
  nextNode = prevNode->next;

  if ((nextNode == NULL) ||
      (nextNode->vehicleID != newNode->vehicleID)) {
    prevNode->next = newNode;
    newNode->next = nextNode;
    printf("Entry added.\n\n");
  }
  else {
    printf("That car already exists in the database!\n");
    printf("Entry not added.\n\n");
    free(newNode);
  }
}
Figure 19.7 A function to add an entry to the database
Let's take a closer look at how a node is inserted into the linked list. Figure 19.8 shows a pictorial representation of this process. Once the proper spot to insert is found using ScanList, first, the prevNode's next pointer is updated to point to the new node and, second, the new node's next pointer is updated to point to nextNode. Also shown in the figure is the degenerate case of adding a node to an empty list. Here, prevNode points to the empty head node. The head node's next pointer is updated to point to the new node.
The routine to delete a node from the linked list is very similar to AddEntry. Functionally, we want to first query the user about which vehicle ID to delete and then use ScanList to locate a node with that ID. Once the node is found, the list is manipulated to remove the node. The code is listed in Figure 19.9. Notice that once
Figure 19.8 Inserting a node into a linked list. The dashed lines indicate newly formed links
void DeleteEntry(Car *headPointer)
{
  int vehicleID;
  Car *delNode;       /* Points to node to delete        */
  Car *prevNode;      /* Points to node prior to delNode */

  printf("Enter the vehicle ID of the car to delete:\n");
  scanf("%d", &vehicleID);

  prevNode = ScanList(headPointer, vehicleID);
  delNode = prevNode->next;

  /* Either the car does not exist or         */
  /* delNode points to the car to be deleted. */
  if (delNode != NULL && delNode->vehicleID == vehicleID) {
    prevNode->next = delNode->next;
    printf("Vehicle with ID %d deleted.\n\n", vehicleID);
    free(delNode);
  }
  else
    printf("The vehicle was not found in the database\n");
}
Figure 19.9 A function to delete an entry from the database
Figure 19.10 Deleting a node from a linked list. The dashed line indicates a newly formed link

void Search(Car *headPointer)
{
  int vehicleID;
  Car *searchNode;    /* Points to the car being searched for */
  Car *prevNode;      /* Points to car before searchNode      */

  printf("Enter the vehicle ID number of the car to search for:\n");
  scanf("%d", &vehicleID);

  prevNode = ScanList(headPointer, vehicleID);
  searchNode = prevNode->next;

  /* Either the car does not exist in the list or     */
  /* searchNode points to the car we are looking for. */
  if (searchNode != NULL && searchNode->vehicleID == vehicleID) {
    printf("vehicle ID : %d\n", searchNode->vehicleID);
    printf("make       : %s\n", searchNode->make);
    printf("model      : %s\n", searchNode->model);
    printf("year       : %d\n", searchNode->year);
    printf("mileage    : %d\n", searchNode->mileage);

    /* The following printf has a field width specification on */
    /* %f specification. The 10.2 indicates that the floating  */
    /* point number should be printed in a 10 character field  */
    /* with two units after the decimal displayed.             */
    printf("cost       : $%10.2f\n\n", searchNode->cost);
  }
  else {
    printf("The vehicle ID %d was not found in the database.\n\n",
           vehicleID);
  }
}
Figure 19.11 A function to query the database
a node is deleted, its memory is added back to the heap using the free function call. Figure 19.10 shows a pictorial representation of the deletion of a node.
At this point, we can draw an interesting parallel between the way elements are inserted and deleted from linked lists versus arrays. In a linked list, once we have identified the item to delete, the deletion is accomplished by manipulating a few pointers. If we wanted to delete an element in an array, we would need to move all elements that follow it in the array upwards. If the array is large, this can result in a significant amount of data movement. The bottom line is that the operations of insertion and deletion can be cheaper to perform on a linked list than on an array.
Finally, we write the code for performing a search. The Search operation is very similar to the AddEntry and DeleteEntry functions, except that the list is not modified. The code is listed in Figure 19.11. The support function ScanList is used to locate the requested node.
19.6 Summary
We conclude this chapter by summarizing the three key concepts we covered.
• Structures in C. The primary objective of this chapter was to introduce the concept of user-defined aggregate types in C, or structures. C structures allow us to create new data types by grouping together data of more primitive types. C structures provide a small step toward object orientation, that is, of structuring a program around the real-world objects that it manipulates rather than the primitive types supported by the underlying computing system.
• Dynamic memory allocation. The concept of dynamic memory allocation is an important prerequisite for advanced programming concepts. In particular, dynamic data structures that grow and shrink during program execution require some form of memory allocation. C provides some standard memory allocation functions such as malloc, calloc, realloc, and free.
• Linked lists. We combine the concepts of structures and dynamic memory allocation to introduce a fundamental new data structure called a linked list. It is similar to an array in that it contains data that is best organized in a list fashion. Why is the linked list such an important data structure? For one thing, it is a dynamic structure that can be expanded or shrunk during execution. This dynamic quality makes it appealing to use in certain situations where the static nature of arrays would be wasteful. The concept of connecting data elements together using pointers is fundamental, and you will encounter it often when dealing with advanced structures such as hash tables, trees, and graphs.
19.1 Is there a bug in the following program? Explain.

struct node {
  int count;
  struct node *next;
};

int main()
{
  int data = 0;
  struct node *getdata;

  getdata->count = data + 1;
  printf("%d", getdata->count);
}
19.2 The following are a few lines of a C program:

struct node {
  int count;
  struct node *next;
};

main()
{
  int data = 0;
  struct node *getdata;

  getdata = getdata->next;

Write, in LC-3 assembly language, the instructions that are generated by the compiler for the line getdata = getdata->next;.
19.3 The code for PotentialCollisions in Figure 19.2 performs a pairwise check of all aircraft currently in the airspace. It checks each plane with every other plane for a potential collision scenario. This code, however, can be made more efficient with a very simple change. What is the change?
19.4 The following program is compiled on a machine in which each basic data type (pointer, character, integer, floating point) occupies one location of memory.
struct element {
  char name[25];
  int atomic_number;
  float atomic_mass;
};

is_it_noble(struct element t[], int i)
{
  if ((t[i].atomic_number == 2) ||
      (t[i].atomic_number == 10) ||
      (t[i].atomic_number == 18) ||
      (t[i].atomic_number == 36) ||
      (t[i].atomic_number == 54) ||
      (t[i].atomic_number == 86))
    return 1;
  else
    return 0;
}

int main()
{
  int x, y;
  struct element periodic_table[110];

  x = is_it_noble(periodic_table, y);
}

a. How many locations will the activation record of the function is_it_noble contain?
b. Assuming that periodic_table, x, and y are the only local variables, how many locations in the activation record for main will be devoted to local variables?
19.5 The following C program is compiled into the LC-3 machine language and executed. The run-time stack begins at xEFFF. The user types the input abac followed by a return.
#include <stdio.h>

struct char_rec {
  char ch;
  struct char_rec *back;
};

int main()
{
  struct char_rec *ptr, pat[MAX+2];
  int i = 1, j = 1;

  printf("Pattern: ");
  pat[1].back = pat;
  ptr = pat;

  while ((pat[i].ch = getchar()) != '\n') {
    ptr[++i].back = ++ptr;
    if (i > MAX) break;
  }

  while (j <= i)
    printf("%d ", pat[j++].back - pat);
  /* Note the pointer arithmetic here: subtraction of pointers to
     structures gives the number of structures between addresses,
     not the number of memory locations */
}

a. Show the contents of the activation record for main when the program terminates.
b. What is the output of this program for the input abac?
Appendix A The LC-3 ISA

A.1 Overview
The Instruction Set Architecture (ISA) of the LC-3 is defined as follows:
Memory address space 16 bits, corresponding to 2^16 locations, each containing one word (16 bits). Addresses are numbered from 0 (i.e., x0000) to 65,535 (i.e., xFFFF). Addresses are used to identify memory locations and memory-mapped I/O device registers. Certain regions of memory are reserved for special uses, as described in Figure A.1.
Bit numbering Bits of all quantities are numbered, from right to left, starting with bit 0. The leftmost bit of the contents of a memory location is bit 15.
Instructions Instructions are 16 bits wide. Bits [15:12] specify the opcode (operation to be performed); bits [11:0] provide further information that is needed to execute the instruction. The specific operation of each LC-3 instruction is described in Section A.3.

Figure A.1 Memory map of the LC-3
x0000 to x00FF    Trap Vector Table
x0100 to x01FF    Interrupt Vector Table
x0200 to x2FFF    Operating system and Supervisor Stack
x3000 to xFDFF    Available for user programs
xFE00 to xFFFF    Device register addresses
Illegal opcode exception Bits [15:12] = 1101 has not been specified. If an instruction contains 1101 in bits [15:12], an illegal opcode exception occurs. Section A.4 explains what happens.
Program counter A 16-bit register containing the address of the next instruction to be processed.
General purpose registers Eight 16-bit registers, numbered from 000 to 111.
Condition codes Three 1-bit registers: N (negative), Z (zero), and P (positive). Load instructions (LD, LDI, LDR, and LEA) and operate instructions (ADD, AND, and NOT) each load a result into one of the eight general purpose registers. The condition codes are set, based on whether that result, taken as a 16-bit 2's complement integer, is negative (N = 1; Z, P = 0), zero (Z = 1; N, P = 0), or positive (P = 1; N, Z = 0). All other LC-3 instructions leave the condition codes unchanged.
Memory-mapped I/O Input and output are handled by load/store (LDI/STI, LDR/STR) instructions using memory addresses to designate each I/O device register. Addresses xFE00 through xFFFF have been allocated to represent the addresses of I/O devices. See Figure A.1. Also, Table A.3 lists each of the relevant device registers that have been identified for the LC-3 thus far, along with their corresponding assigned addresses from the memory address space.
Interrupt processing I/O devices have the capability of interrupting the processor. Section A.4 describes the mechanism.
Priority level The LC-3 supports eight levels of priority. Priority level 7 (PL7) is the highest; PL0 is the lowest. The priority level of the currently executing process is specified in bits PSR[10:8].
Processor status register (PSR) A 16-bit register, containing status information about the currently executing process. Seven bits of the PSR have been defined thus far. PSR[15] specifies the privilege mode of the executing process. PSR[10:8] specifies the priority level of the currently executing process. PSR[2:0] contains the condition codes. PSR[2] is N, PSR[1] is Z, and PSR[0] is P.
Privilege mode The LC-3 specifies two levels of privilege, Supervisor mode (privileged) and User mode (unprivileged). Interrupt service routines execute in Supervisor mode. The privilege mode is specified by PSR[15]. PSR[15] = 0 indicates Supervisor mode; PSR[15] = 1 indicates User mode.
Privilege mode exception The RTI instruction executes in Supervisor mode. If the processor attempts to execute an RTI instruction while in User mode, a privilege mode exception occurs. Section A.4 explains what happens.
Supervisor Stack A region of memory in supervisor space accessible via the Supervisor Stack Pointer (SSP). When PSR[15] = 0, the stack pointer (R6) is SSP.
User Stack A region of memory in user space accessible via the User Stack Pointer (USP). When PSR[15] = 1, the stack pointer (R6) is USP.
A.2 Notation
The notation in Table A.1 will be helpful in understanding the descriptions of the LC-3 instructions (Section A.3).
A.3 The Instruction Set
The LC-3 supports a rich, but lean, instruction set. Each 16-bit instruction consists of an opcode (bits [15:12]) plus 12 additional bits to specify the other information that is needed to carry out the work of that instruction. Figure A.2 summarizes the 15 different opcodes in the LC-3 and the specification of the remaining bits of each instruction. The 16th 4-bit opcode is not specified, but is reserved for future use. In the following pages, the instructions will be described in greater detail. For each instruction, we show the assembly language representation, the format of the 16-bit instruction, the operation of the instruction, an English-language description of its operation, and one or more examples of the instruction. Where relevant, additional notes about the instruction are also provided.
Table A.1 Notational Conventions

Notation        Meaning
xNumber         The number in hexadecimal notation.
#Number         The number in decimal notation.
A[l:r]          The field delimited by bit [l] on the left and bit [r] on the right, of the datum A. For example, if PC contains 0011001100111111, then PC[15:9] is 0011001. PC[2:2] is 1. If l and r are the same bit number, the notation is usually abbreviated PC[2].
BaseR           Base Register; one of R0..R7, used in conjunction with a six-bit offset to compute Base+offset addresses.
DR              Destination Register; one of R0..R7, which specifies which register the result of an instruction should be written to.
imm5            A 5-bit immediate value; bits [4:0] of an instruction when used as a literal (immediate) value. Taken as a 5-bit, 2's complement integer, it is sign-extended to 16 bits before it is used. Range: -16..15.
LABEL           An assembly language construct that identifies a location symbolically (i.e., by means of a name, rather than its 16-bit address).
mem[address]    Denotes the contents of memory at the given address.
offset6         A 6-bit value; bits [5:0] of an instruction; used with the Base+offset addressing mode. Bits [5:0] are taken as a 6-bit signed 2's complement integer, sign-extended to 16 bits and then added to the Base Register to form an address. Range: -32..31.
PC              Program Counter; 16-bit register that contains the memory address of the next instruction to be fetched. For example, during execution of the instruction at address A, the PC contains address A + 1, indicating the next instruction is contained in A + 1.
PCoffset9       A 9-bit value; bits [8:0] of an instruction; used with the PC+offset addressing mode. Bits [8:0] are taken as a 9-bit signed 2's complement integer, sign-extended to 16 bits and then added to the incremented PC to form an address. Range: -256..255.
PCoffset11      An 11-bit value; bits [10:0] of an instruction; used with the JSR opcode to compute the target address of a subroutine call. Bits [10:0] are taken as an 11-bit 2's complement integer, sign-extended to 16 bits and then added to the incremented PC to form the target address. Range: -1024..1023.
PSR             Processor Status Register; 16-bit register that contains status information of the process that is running. PSR[15] = privilege mode. PSR[2:0] contains the condition codes. PSR[2] = N, PSR[1] = Z, PSR[0] = P.
setcc()         Indicates that condition codes N, Z, and P are set based on the value of the result written to DR. If the value is negative, N = 1, Z = 0, P = 0. If the value is zero, N = 0, Z = 1, P = 0. If the value is positive, N = 0, Z = 0, P = 1.
SEXT(A)         Sign-extend A. The most significant bit of A is replicated as many times as necessary to extend A to 16 bits. For example, if A = 110000, then SEXT(A) = 1111 1111 1111 0000.
SP              The current stack pointer. R6 is the current stack pointer. There are two stacks, one for each privilege mode. SP is SSP if PSR[15] = 0; SP is USP if PSR[15] = 1.
SR, SR1, SR2    Source Register; one of R0..R7 which specifies the register from which a source operand is obtained.
SSP             The Supervisor Stack Pointer.
trapvect8       An 8-bit value; bits [7:0] of an instruction; used with the TRAP opcode to determine the starting address of a trap service routine. Bits [7:0] are taken as an unsigned integer and zero-extended to 16 bits. This is the address of the memory location containing the starting address of the corresponding service routine. Range: 0..255.
USP             The User Stack Pointer.
ZEXT(A)         Zero-extend A. Zeros are appended to the leftmost bit of A to extend it to 16 bits. For example, if A = 110000, then ZEXT(A) = 0000 0000 0011 0000.
Instruction   [15:12]   [11:0]
ADD+          0001      DR | SR1 | 0 | 00 | SR2
ADD+          0001      DR | SR1 | 1 | imm5
AND+          0101      DR | SR1 | 0 | 00 | SR2
AND+          0101      DR | SR1 | 1 | imm5
BR            0000      n | z | p | PCoffset9
JMP           1100      000 | BaseR | 000000
JSR           0100      1 | PCoffset11
JSRR          0100      0 | 00 | BaseR | 000000
LD+           0010      DR | PCoffset9
LDI+          1010      DR | PCoffset9
LDR+          0110      DR | BaseR | offset6
LEA+          1110      DR | PCoffset9
NOT+          1001      DR | SR | 111111
RET           1100      000 | 111 | 000000
RTI           1000      000000000000
ST            0011      SR | PCoffset9
STI           1011      SR | PCoffset9
STR           0111      SR | BaseR | offset6
TRAP          1111      0000 | trapvect8
reserved      1101

Figure A.2 Format of the entire LC-3 instruction set. Note: + indicates instructions that modify condition codes
ADD    Addition

Assembler Formats
ADD DR, SR1, SR2
ADD DR, SR1, imm5

Encodings
[15:12] 0001 | [11:9] DR | [8:6] SR1 | [5] 0 | [4:3] 00 | [2:0] SR2
[15:12] 0001 | [11:9] DR | [8:6] SR1 | [5] 1 | [4:0] imm5

Operation
if (bit[5] == 0)
  DR = SR1 + SR2;
else
  DR = SR1 + SEXT(imm5);
setcc();

Description
If bit [5] is 0, the second source operand is obtained from SR2. If bit [5] is 1, the second source operand is obtained by sign-extending the imm5 field to 16 bits. In both cases, the second source operand is added to the contents of SR1 and the result stored in DR. The condition codes are set, based on whether the result is negative, zero, or positive.

Examples
ADD R2, R3, R4 ; R2 <- R3 + R4
ADD R2, R3, #7 ; R2 <- R3 + 7
AND    Bit-wise Logical AND

Assembler Formats
AND DR, SR1, SR2
AND DR, SR1, imm5

Encodings
[15:12] 0101 | [11:9] DR | [8:6] SR1 | [5] 0 | [4:3] 00 | [2:0] SR2
[15:12] 0101 | [11:9] DR | [8:6] SR1 | [5] 1 | [4:0] imm5

Operation
if (bit[5] == 0)
  DR = SR1 AND SR2;
else
  DR = SR1 AND SEXT(imm5);
setcc();

BR    Conditional Branch

Description
The condition codes specified by the state of bits [11:9] are tested. If bit [11] is set, N is tested; if bit [11] is clear, N is not tested. If bit [10] is set, Z is tested, etc. If any of the condition codes tested is set, the program branches to the location specified by adding the sign-extended PCoffset9 field to the incremented PC.

Examples
BRzp LOOP ; Branch to LOOP if the last result was zero or positive.
BR NEXT   ; Unconditionally branch to NEXT.

The assembly language opcode BR is interpreted the same as BRnzp; that is, always branch to the target address.
†This is the incremented PC.
JMP    Jump
RET    Return from Subroutine

Assembler Formats
JMP BaseR
RET

Encoding
JMP    [15:12] 1100 | [11:9] 000 | [8:6] BaseR | [5:0] 000000
RET    [15:12] 1100 | [11:9] 000 | [8:6] 111 | [5:0] 000000

Operation
PC = BaseR;

Description
The program unconditionally jumps to the location specified by the contents of the base register. Bits [8:6] identify the base register.

Examples
JMP R2 ; PC <- R2
RET    ; PC <- R7
Note
The RET instruction is a special case of the JMP instruction. The PC is loaded with the contents of R7, which contains the linkage back to the instruction following the subroutine call instruction.
JSR, JSRR    Jump to Subroutine

Assembler Formats
JSR LABEL
JSRR BaseR

Encoding
JSR     [15:12] 0100 | [11] 1 | [10:0] PCoffset11
JSRR    [15:12] 0100 | [11] 0 | [10:9] 00 | [8:6] BaseR | [5:0] 000000

Operation
TEMP = PC†;
if (bit[11] == 0)
  PC = BaseR;
else
  PC = PC† + SEXT(PCoffset11);
R7 = TEMP;

Description
First, the incremented PC is saved in a temporary location. Then the PC is loaded with the address of the first instruction of the subroutine, causing an unconditional jump to that address. The address of the subroutine is obtained from the base register (if bit [11] is 0), or the address is computed by sign-extending bits [10:0] and adding this value to the incremented PC (if bit [11] is 1). Finally, R7 is loaded with the value stored in the temporary location. This is the linkage back to the calling routine.

Examples
JSR QUEUE ; Put the address of the instruction following JSR into R7;
          ; Jump to QUEUE.
JSRR R3   ; Put the address following JSRR into R7; Jump to the
          ; address contained in R3.

†This is the incremented PC.
LD    Load

Assembler Format
LD DR, LABEL

Encoding
[15:12] 0010 | [11:9] DR | [8:0] PCoffset9

Operation
DR = mem[PC† + SEXT(PCoffset9)];
setcc();

Description
An address is computed by sign-extending bits [8:0] to 16 bits and adding this value to the incremented PC. The contents of memory at this address are loaded into DR. The condition codes are set, based on whether the value loaded is negative, zero, or positive.

Example
LD R4, VALUE ; R4 <- mem[VALUE]

†This is the incremented PC.
LDI    Load Indirect

Assembler Format
LDI DR, LABEL

Encoding
[15:12] 1010 | [11:9] DR | [8:0] PCoffset9

Operation
DR = mem[mem[PC† + SEXT(PCoffset9)]];
setcc();

Description
An address is computed by sign-extending bits [8:0] to 16 bits and adding this value to the incremented PC. What is stored in memory at this address is the address of the data to be loaded into DR. The condition codes are set, based on whether the value loaded is negative, zero, or positive.

Example
LDI R4, ONEMORE ; R4 <- mem[mem[ONEMORE]]

†This is the incremented PC.
#include <stdio.h>

void UpcaseString(char inputString[]);

int main()
{
  char string[8];

  scanf("%s", string);
  UpcaseString(string);
}

void UpcaseString(char inputString[])
{
  int i = 0;

  while (inputString[i]) {
    if (('a' <= inputString[i]) && (inputString[i] <= 'z'))
      inputString[i] = inputString[i] - ('a' - 'A');
    i++;
  }
}

Figure B.8 C source code for the upper-/lowercase program
; uppercase: converts lower- to uppercase
          .ORIG x3000
          LEA   R6, STACK
MAIN      ADD   R1, R6, #3
READCHAR  IN                     ; read in input string: scanf
          OUT
          STR   R0, R1, #0
          ADD   R1, R1, #1
          ADD   R2, R0, x-A
          BRnp  READCHAR
          ADD   R1, R1, #-1
          STR   R2, R1, #0       ; put in NULL char to mark the "end"
          ADD   R1, R6, #3       ; get the starting address of the string
          STR   R1, R6, #14      ; pass the parameter
          STR   R6, R6, #13
          ADD   R6, R6, #11
          JSR   UPPERCASE
          HALT
UPPERCASE STR   R7, R6, #1
          AND   R1, R1, #0
          STR   R1, R6, #4
          LDR   R2, R6, #3
CONVERT   ADD   R3, R1, R2       ; add index to starting addr of string
          LDR   R4, R3, #0
          BRz   DONE             ; Done if NULL char reached
          LD    R5, a
          ADD   R5, R5, R4       ; 'a' <= input string
          BRn   NEXT
          LD    R5, z
          ADD   R5, R4, R5       ; input string <= 'z'
          BRp   NEXT
          LD    R5, asubA        ; convert to uppercase
          ADD   R4, R4, R5
          STR   R4, R3, #0
NEXT      ADD   R1, R1, #1       ; increment the array index, i
          STR   R1, R6, #4
          BRnzp CONVERT
DONE      LDR   R7, R6, #1
          LDR   R6, R6, #2
          RET
a         .FILL #-97
z         .FILL #-122
asubA     .FILL #-32
STACK     .BLKW 100
          .END

Figure B.9 LC-3 assembly language code for the upper-/lowercase program
        .386P
        .model FLAT
_DATA   SEGMENT                   ; The NULL-terminated scanf format
$SG397  DB      '%s', 00H         ; string is stored in global data space.
_DATA   ENDS
_TEXT   SEGMENT
_string$ = -8                     ; Location of "string" in local stack
_main   PROC NEAR
        sub     esp, 8            ; Allocate stack space to store "string"
        lea     eax, DWORD PTR _string$[esp+8]
        push    eax               ; Push arguments to scanf
        push    OFFSET FLAT:$SG397
        call    _scanf
        lea     ecx, DWORD PTR _string$[esp+16]
        push    ecx               ; Push argument to UpcaseString
        call    _UpcaseString
        add     esp, 20           ; Release local stack space
        ret     0
_main   ENDP
_inputString$ = 8                 ; "inputString" location in local stack
_UpcaseString PROC NEAR
        mov     ecx, DWORD PTR _inputString$[esp-4]
        cmp     BYTE PTR [ecx], 0
        je      SHORT $L404       ; If inputString[0] == 0, skip the loop
$L403:  mov     al, BYTE PTR [ecx] ; Load inputString[i] into AL
        cmp     al, 97            ; 97 == 'a'
        jl      SHORT $L405
        cmp     al, 122           ; 122 == 'z'
        jg      SHORT $L405
        sub     al, 32            ; 32 == 'a' - 'A'
        mov     BYTE PTR [ecx], al
$L405:  inc     ecx               ; i++
        mov     al, BYTE PTR [ecx]
        test    al, al
        jne     SHORT $L403       ; Loop if inputString[i] != 0
$L404:  ret     0
_UpcaseString ENDP
_TEXT   ENDS
        END

Figure B.10 x86 assembly language code for the upper-/lowercase program
Appendix C The Microarchitecture of the LC-3

We have seen in Chapters 4 and 5 the several stages of the instruction cycle that must occur in order for the computer to process each instruction. If a microarchitecture is to implement an ISA, it must be able to carry out this instruction cycle for every instruction in the ISA. This appendix illustrates one example of a microarchitecture that can do that for the LC-3 ISA. Many of the details of the microarchitecture and the reasons for each design decision are well beyond the scope of an introductory course. However, for those who want to understand how a microarchitecture can carry out the requirements of each instruction of the LC-3 ISA, this appendix is provided.
C.1 Overview
Figure C.1 shows the two main components of a microarchitecture: the data path, which contains all the components that actually process the instructions, and the control, which contains all the components that generate the set of control signals that are needed to control the processing at each instant of time.
We say, "at each instant of time," but we really mean during each clock cycle. That is, time is divided into clock cycles. The cycle time of a microprocessor is the duration of a clock cycle. A common cycle time for a microprocessor today is 0.5 nanoseconds, which corresponds to 2 billion clock cycles each second. We say that such a microprocessor is operating at a frequency of 2 gigahertz.
At each instant of time (or, rather, during each clock cycle), the 49 control signals (as shown in Figure C.1) control both the processing in the data path and the generation of the control signals for the next clock cycle. Processing in the data path is controlled by 39 bits, and the generation of the control signals for the next clock cycle is controlled by 10 bits.
Note that the hardware that determines which control signals are needed each clock cycle does not operate in a vacuum. On the contrary, the control signals needed in the "next" clock cycle depend on all of the following:
1. What is going on in the current clock cycle.
2. The LC-3 instruction that is being executed.
3. The privilege mode of the program that is executing.
4. If that LC-3 instruction is a BR, whether the conditions for the branch have been met (i.e., the state of the relevant condition codes).
Figure C.1 Microarchitecture of the LC-3, major components
5. Whether or not an external device is requesting that the processor be interrupted.
6. If a memory operation is in progress, whether it is completing during this cycle.
Figure C.1 identifies the specific information in our implementation of the LC-3 that corresponds to these six items. They are, respectively:
1. J[5:0], COND[2:0], and IRD: 10 bits of control signals provided by the current clock cycle.
2. inst[15:12], which identifies the opcode, and inst[11:11], which differentiates JSR from JSRR (i.e., the addressing mode for the target of the subroutine call).
3. PSR[15], bit [15] of the Processor Status Register, which indicates whether the current program is executing with supervisor or user privileges.
4. BEN to indicate whether or not a BR should be taken.
5. INT to indicate that some external device of higher priority than the executing process requests service.
6. R to indicate the end of a memory operation.
C.2 The State Machine
The behavior of the LC-3 microarchitecture during a given clock cycle is completely determined by the 49 control signals, combined with nine bits of additional information (inst[15:11], PSR[15], BEN, INT, and R), as shown in Figure C.1. We have said that during each clock cycle, 39 of these control signals determine the processing of information in the data path and the other 10 control signals combine with the nine bits of additional information to determine which set of control signals will be required in the next clock cycle.
We say that these 49 control signals specify the state of the control structure of the LC-3 microarchitecture. We can completely describe the behavior of the LC-3 microarchitecture by means of a directed graph that consists of nodes (one corresponding to each state) and arcs (showing the flow from each state to the one[s] it goes to next). We call such a graph a state machine.
Figure C.2 is the state machine for our implementation of the LC-3. The state machine describes what happens during each clock cycle in which the computer is running. Each state is active for exactly one clock cycle before control passes to the next state. The state machine shows the step-by-step (clock cycle-by-clock cycle) process that each instruction goes through from the start of its FETCH phase to the end of that instruction, as described in Section 4.2.2. Each node in the state machine corresponds to the activity that the processor carries out during a single clock cycle. The actual processing that is performed in the data path is contained inside the node. The step-by-step flow is conveyed by the arcs that take the processor from one state to the next.
For example, recall from Chapter 4 that the FETCH phase of every instruction cycle starts with a memory access to read the instruction at the address specified by the PC. Note that in the state numbered 18, the MAR is loaded with the address contained in PC, the PC is incremented in preparation for the FETCH of the next LC-3 instruction, and, if there is no interrupt request present (INT = 0), the flow passes to the state numbered 33. We will describe in Section C.6 the flow of control if INT = 1, that is, if an external device is requesting an interrupt.
Before we get into what happens during the clock cycle when the processor is in the state numbered 33, we should explain the numbering system, that is, why 18 and 33. Recall, from our discussion of finite state machines in Chapter 3, that each state must be uniquely specified and that this unique specification is accomplished by means of the state variables. Our state machine that implements the LC-3 ISA requires 52 distinct states to describe the entire behavior of the LC-3. Figure C.2 shows 31 of them plus pointers to three others (states 8, 13, and 49). Figure C.7 shows the other 18 states (plus 8, 13, and 49) that are pointed to in Figure C.2. We will come into contact with all of them as we go through this appendix. Since k logical variables can uniquely identify 2^k items, six state variables are needed to uniquely specify 52 states. The number next to each node in Figure C.2 is the decimal equivalent of the values (0 or 1) of the six state variables for the corresponding state. Thus, the state numbered 18 has state variable values 010010.
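The state-numbering argument can be checked with a few lines of code. The sketch below computes how many state variables are needed for a given number of states and shows the bit pattern that corresponds to a state number. The helper names `state_variable_count` and `encode_state` are ours, introduced only for this illustration.

```python
def state_variable_count(num_states: int) -> int:
    """Smallest k such that 2**k >= num_states (at least one variable)."""
    return max(1, (num_states - 1).bit_length())


def encode_state(state_number: int, num_variables: int) -> str:
    """Values of the state variables, most significant first, as a bit string."""
    return format(state_number, f"0{num_variables}b")


k = state_variable_count(52)   # the LC-3 state machine has 52 states
print(k)                       # 6
print(encode_state(18, k))     # 010010
print(encode_state(33, k))     # 100001
```

Five variables would cover only 32 states, which is why 52 states force a sixth variable even though 12 of the 64 possible encodings go unused.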
Now, then, back to what happens after the clock cycle in which the activity of state 18 has finished. Again, if no external device is requesting an interrupt, the flow passes to state 33. In state 33, since the MAR contains the address of the instruction to be processed, this instruction is read from memory and loaded into the MDR. Since this memory access can take multiple cycles, this state continues to execute until a ready signal from the memory (R) is asserted, indicating that the memory access has completed. Thus the MDR contains the valid contents of the memory location specified by MAR. The state machine then moves on to state 35, where the instruction is loaded into the instruction register (IR), completing the fetch phase of the instruction cycle.

Figure C.2 A state machine for the LC-3
NOTES: B+off6: Base + SEXT[offset6]; PC+off9: PC + SEXT[offset9]; PC+off11: PC + SEXT[offset11]; OP2 may be SR2 or SEXT[imm5]
Note that the arrow from the last state of each instruction cycle (i.e., the state that completes the processing of that LC-3 instruction) takes us to state 18 (to begin the instruction cycle of the next LC-3 instruction).
C.3 The Data Path
The data path consists of all components that actually process the information during a cycle: the functional units that operate on the information, the registers that store information at the end of one cycle so it will be available for further use in subsequent cycles, and the buses and wires that carry information from one point to another in the data path. Figure C.3, an expanded version of what you have already encountered in Figure 5.9, illustrates the data path of our microarchitecture of the LC-3.
Note the control signals that are associated with each component in the data path. For example, ALUK, consisting of two control signals, is associated with the ALU. These control signals determine how the component will be used each cycle. Table C.1 lists the set of control signals that control the elements of the data path and the set of values that each control signal can have. (Actually, for readability, we list a symbolic name for each value, rather than the binary value.) For example, since ALUK consists of two bits, it can have one of four values. Which value it has during any particular clock cycle depends on whether the ALU is required to ADD, AND, NOT, or simply pass one of its inputs to the output during that clock cycle. PCMUX also consists of two control signals and specifies which input to the MUX is required during a given clock cycle. LD.PC is a single-bit control signal, and is a 0 (NO) or a 1 (YES), depending on whether or not the PC is to be loaded during the given clock cycle.
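To make the role of a control signal concrete, the following sketch models the ALU as a function selected by ALUK, and the PC as a register that changes only when LD.PC is asserted. It uses the symbolic value names from Table C.1 rather than binary encodings, since the text does not give the binary values. The names `alu` and `clock_pc` are hypothetical and exist only in this illustration.

```python
from enum import Enum

WORD_MASK = 0xFFFF  # the LC-3 data path is 16 bits wide


class ALUK(Enum):
    """Symbolic values of the two-bit ALUK control signal (Table C.1)."""
    ADD = "ADD"
    AND = "AND"
    NOT = "NOT"
    PASSA = "PASSA"


def alu(aluk: ALUK, a: int, b: int = 0) -> int:
    """Compute the 16-bit ALU output selected by ALUK."""
    if aluk is ALUK.ADD:
        return (a + b) & WORD_MASK
    if aluk is ALUK.AND:
        return a & b & WORD_MASK
    if aluk is ALUK.NOT:
        return ~a & WORD_MASK
    if aluk is ALUK.PASSA:
        return a & WORD_MASK  # pass input A through unchanged
    raise ValueError(f"unknown ALUK value: {aluk}")


def clock_pc(pc: int, ld_pc: bool, pcmux_output: int) -> int:
    """Value of the PC after one clock cycle: loaded only if LD.PC is YES."""
    return pcmux_output & WORD_MASK if ld_pc else pc


print(hex(alu(ALUK.ADD, 0xFFFF, 1)))           # 0x0 (wraps around at 16 bits)
print(hex(alu(ALUK.NOT, 0x00FF)))              # 0xff00
print(hex(clock_pc(0x3000, False, 0x3001)))    # 0x3000 (LD.PC = NO)
```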
During each clock cycle, corresponding to the "current state" in the state machine, the 39 bits of control direct the processing of all components in the data path that are required during that clock cycle. The processing that takes place in the data path during that clock cycle, as we have said, is specified inside the node representing the state.
C.4 The Control Structure
The control structure of a microarchitecture is specified by its state machine. As described earlier, the state machine (Figure C.2) determines which control signals are needed each clock cycle to process information in the data path and which control signals are needed each clock cycle to direct the flow of control from the currently active state to its successor state.

Figure C.3 The LC-3 data path

Figure C.4 The control structure of a microprogrammed implementation, overall block diagram
Figure C.4 shows a block diagram of the control structure of our implementation of the LC-3. Many implementations are possible, and the design considerations that must be studied to determine which of many possible implementations should be used is the subject of a full course in computer architecture.
We have chosen here a straightforward microprogrammed implementation. Each state of the control structure requires 39 bits to control the processing in the data path and 10 bits to help determine which state comes next. These 49 bits are collectively known as a microinstruction. Each microinstruction (i.e., each state of the state machine) is stored in one 49-bit location of a special memory called the control store. There are 52 distinct states. Since each state corresponds to one microinstruction in the control store, the control store for our microprogrammed implementation requires six bits to specify the address of each microinstruction.
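One way to picture a microinstruction is as a 49-bit word split into a 10-bit sequencing field (IRD, COND, J) and a 39-bit data path field. The sketch below packs and unpacks such a word. The bit ordering (IRD in the most significant position, then COND, then J, then the data path bits) is an assumption made for this illustration, as are the function names; the text fixes only the field widths.

```python
DATA_PATH_BITS = 39       # bits that control the data path
MICROINSTRUCTION_BITS = 49


def pack_microinstruction(ird: int, cond: int, j: int, data_path: int) -> int:
    """Pack IRD (1 bit), COND (3 bits), J (6 bits), and 39 data path bits.

    The field order is an assumed layout, not taken from the text.
    """
    sequencer = ((ird & 0x1) << 9) | ((cond & 0x7) << 6) | (j & 0x3F)
    return (sequencer << DATA_PATH_BITS) | (data_path & ((1 << DATA_PATH_BITS) - 1))


def unpack_microinstruction(word: int) -> dict:
    """Split a 49-bit microinstruction back into its fields."""
    sequencer = word >> DATA_PATH_BITS
    return {
        "IRD": (sequencer >> 9) & 0x1,
        "COND": (sequencer >> 6) & 0x7,
        "J": sequencer & 0x3F,
        "data_path": word & ((1 << DATA_PATH_BITS) - 1),
    }


# Sequencing bits used by state 33 (COND = 001, J = 100001), with the
# data path bits left at zero because their values are not given here.
word = pack_microinstruction(ird=0, cond=0b001, j=0b100001, data_path=0)
print(unpack_microinstruction(word))
```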
Table C.1 Data Path Control Signals

Signal Name        Signal Values
LD.MAR/1:          NO, LOAD
LD.MDR/1:          NO, LOAD
LD.IR/1:           NO, LOAD
LD.BEN/1:          NO, LOAD
LD.REG/1:          NO, LOAD
LD.CC/1:           NO, LOAD
LD.PC/1:           NO, LOAD
LD.Priv/1:         NO, LOAD
LD.SavedSSP/1:     NO, LOAD
LD.SavedUSP/1:     NO, LOAD
LD.Vector/1:       NO, LOAD

GatePC/1:          NO, YES
GateMDR/1:         NO, YES
GateALU/1:         NO, YES
GateMARMUX/1:      NO, YES
GateVector/1:      NO, YES
GatePC-1/1:        NO, YES
GatePSR/1:         NO, YES
GateSP/1:          NO, YES

PCMUX/2:           PC+1            ;select pc+1
                   BUS             ;select value from bus
                   ADDER           ;select output of address adder
DRMUX/2:           11.9            ;destination IR[11:9]
                   R7              ;destination R7
                   SP              ;destination R6
SR1MUX/2:          11.9            ;source IR[11:9]
                   8.6             ;source IR[8:6]
                   SP              ;source R6
ADDR1MUX/1:        PC, BaseR
ADDR2MUX/2:        ZERO            ;select the value zero
                   offset6         ;select SEXT[IR[5:0]]
                   PCoffset9       ;select SEXT[IR[8:0]]
                   PCoffset11      ;select SEXT[IR[10:0]]
SPMUX/2:           SP+1            ;select stack pointer+1
                   SP-1            ;select stack pointer-1
                   Saved SSP       ;select saved Supervisor Stack Pointer
                   Saved USP       ;select saved User Stack Pointer
MARMUX/1:          7.0             ;select ZEXT[IR[7:0]]
                   ADDER           ;select output of address adder
VectorMUX/2:       INTV
                   Priv.exception
                   Opc.exception
PSRMUX/1:          individual settings, BUS
ALUK/2:            ADD, AND, NOT, PASSA
MIO.EN/1:          NO, YES
R.W/1:             RD, WR
Set.Priv/1:        0               ;Supervisor mode
                   1               ;User mode
Table C.2 Microsequencer Control Signals

Signal Name        Signal Values
J/6:
COND/3:            COND0           ;Unconditional
                   COND1           ;Memory Ready
                   COND2           ;Branch
                   COND3           ;Addressing Mode
                   COND4           ;Privilege Mode
                   COND5           ;Interrupt test
IRD/1:             NO, YES
Table C.2 lists the function of the 10 bits of control information that help determine which state comes next. Figure C.5 shows the logic of the microsequencer. The purpose of the microsequencer is to determine the address in the control store that corresponds to the next state, that is, the location where the 49 bits of control information for the next state are stored.
Figure C.5 The microsequencer of the LC-3

Figure C.6 Additional logic required to provide control signals
Note that state 32 of the state machine (Figure C.2) has 16 "next" states, depending on the LC-3 instruction being executed during the current instruction cycle. This state carries out the DECODE phase of the instruction cycle described in Chapter 4. If the IRD control signal in the microinstruction corresponding to state 32 is 1, the output MUX of the microsequencer (Figure C.5) will take its source from the six bits formed by 00 concatenated with the four opcode bits IR[15:12]. Since IR[15:12] specifies the opcode of the current LC-3 instruction being processed, the next address of the control store will be one of 16 addresses, corresponding to the 15 opcodes plus the one unused opcode, IR[15:12] = 1101. That is, each of the 16 next states is the first state to be carried out after the instruction has been decoded in state 32. For example, if the instruction being processed is ADD, the address of the next state is state 1, whose microinstruction is stored at location 000001. Recall that IR[15:12] for ADD is 0001.
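The 16-way branch out of state 32 amounts to zero-extending the opcode to six bits. A minimal sketch follows; the function name `decode_next_state` and the sample instruction words are our own choices for illustration.

```python
def decode_next_state(ir: int) -> int:
    """Next control store address when IRD = 1: 00 concatenated with IR[15:12]."""
    opcode = (ir >> 12) & 0xF
    return opcode  # the two high-order address bits are 0, so the value is 0..15


add_instruction = 0x1042      # an ADD: IR[15:12] = 0001
unused_opcode = 0xD000        # IR[15:12] = 1101, the unused opcode

print(format(decode_next_state(add_instruction), "06b"))   # 000001 (state 1)
print(format(decode_next_state(unused_opcode), "06b"))     # 001101 (state 13)
```

Because the opcode itself is the address, the first state of each instruction's processing must be stored at the control store location equal to its opcode.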
If, somehow, the instruction inadvertently contained IR[15:12] = 1101, the unused opcode, the microarchitecture would execute a sequence of microinstructions, starting at state 13. These microinstructions would respond to the fact that an instruction with an illegal opcode had been fetched. Section C.6.3 describes what happens.
Several signals necessary to control the data path and the microsequencer are not among those listed in Tables C.1 and C.2. They are DR, SR1, BEN, INT, and R. Figure C.6 shows the additional logic needed to generate DR, SR1, and BEN.
The INT signal is supplied by some event external to the normal instruction processing, indicating that normal instruction processing should be interrupted and this external event dealt with. The interrupt mechanism was described in Chapter 8. The corresponding flow of control within the microarchitecture is described in Section C.6.
The remaining signal, R, is a signal generated by the memory in order to allow the LC-3 to operate correctly with a memory that takes multiple clock cycles to read or store a value.
Suppose it takes memory five cycles to read a value. That is, once MAR contains the address to be read and the microinstruction asserts READ, it will take five cycles before the contents of the specified location in memory are available to be loaded into MDR. (Note that the microinstruction asserts READ by means of two control signals: MIO.EN/YES and R.W/RD; see Figure C.3.)
Recall our discussion in Section C.2 of the function of state 33, which accesses an instruction from memory during the FETCH phase of each instruction cycle. For the LC-3 to operate correctly, state 33 must execute five times before moving on to state 35. That is, until MDR contains valid data from the memory location specified by the contents of MAR, we want state 33 to continue to re-execute. After five clock cycles, the memory has completed the "read," resulting in valid data in MDR, so the processor can move on to state 35. What if the microarchitecture did not wait for the memory to complete the read operation before moving on to state 35? Since the contents of MDR would still be garbage, the microarchitecture would put garbage into IR in state 35.
The ready signal (R) enables the memory read to execute correctly. Since the memory knows it needs five clock cycles to complete the read, it asserts a ready signal (R) throughout the fifth clock cycle. Figure C.2 shows that the next state is 33 (i.e., 100001) if the memory read will not complete in the current clock cycle and state 35 (i.e., 100011) if it will. As we have seen, it is the job of the microsequencer (Figure C.5) to produce the next state address.
The 10 microsequencer control bits for state 33 are as follows:

IRD/0        ; NO
COND/001     ; Memory Ready
J/100001
With these control signals, what next state address is generated by the microsequencer? For each of the first four executions of state 33, since R = 0, the next state address is 100001. This causes state 33 to be executed again in the next clock cycle. In the fifth clock cycle, since R = 1, the next state address is 100011, and the LC-3 moves on to state 35. Note that in order for the ready signal (R) from memory to be part of the next state address, COND had to be set to 001, which allowed R to pass through its four-input AND gate.
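The behavior just described can be sketched as a small function that forms the next state address from J, COND, IRD, and the external signals. In this sketch each COND value lets one signal be ORed into one bit of J. The bit positions used for R, PSR[15], and INT follow from the state numbers quoted in this appendix (33 and 35, 37 and 45, 33 and 49). The positions used for BEN and IR[11] are assumptions read off the labels of Figure C.5. The function names are ours.

```python
def next_state_address(j, cond, ird, ir=0, psr15=0, ben=0, intr=0, r=0):
    """Sketch of the microsequencer: compute the 6-bit next state address."""
    if ird:
        return (ir >> 12) & 0xF          # 00 concatenated with IR[15:12]
    address = j
    if cond == 0b001:                    # Memory Ready
        address |= r << 1
    elif cond == 0b010:                  # Branch (assumed bit position)
        address |= ben << 2
    elif cond == 0b011:                  # Addressing Mode (assumed bit position)
        address |= (ir >> 11) & 1
    elif cond == 0b100:                  # Privilege Mode
        address |= psr15 << 3
    elif cond == 0b101:                  # Interrupt test
        address |= intr << 4
    return address                       # COND = 000: unconditional, J unchanged


def fetch_wait(memory_latency=5):
    """Re-execute state 33 until memory asserts R; return the trace and exit state."""
    trace = []
    state, cycle = 33, 0
    while state == 33:
        cycle += 1
        r = 1 if cycle == memory_latency else 0   # R asserted in the final cycle
        trace.append((cycle, state, r))
        state = next_state_address(j=0b100001, cond=0b001, ird=0, r=r)
    return trace, state


trace, final_state = fetch_wait()
print(len(trace), final_state)   # 5 35
```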
C.5 Memory-Mapped I/O
As you know from Chapter 8, the LC-3 ISA performs input and output via memory-mapped I/O, that is, with the same data movement instructions that it uses to read from and write to memory. The LC-3 does this by assigning an address to each device register. Input is accomplished by a load instruction whose effective address is the address of an input device register. Output is accomplished by a store instruction whose effective address is the address of an output device register. For example, in state 25 of Figure C.2, if the address in MAR is xFE02,
Table C.3 Truth Table for Address Control Logic
MAR    MIO.EN    R.W    MEM.EN    IN.MUX    LD. ...
Figure C.7 The remaining states of the LC-3 state machine (interrupt and exception control)
Figure C.8 LC-3 data path, including additional structures for interrupt control
C.6.1 Initiating an Interrupt
While a program is executing, an interrupt can be requested by some external event so that the normal processing of instructions can be preempted and the control can turn its attention to processing the interrupt. The external event requests an interrupt by asserting its interrupt request signal. Recall from Chapter 8 that if the priority level of the device asserting its interrupt request signal is higher than the priority level of the currently executing program, INT is asserted and INTV is loaded with the appropriate interrupt vector. The microprocessor responds to INT by initiating the interrupt. That is, the processor puts itself into supervisor mode, pushes the PSR and PC of the interrupted process onto the supervisor stack, and loads the PC with the starting address of the interrupt service routine. The PSR contains the privilege mode PSR[15], priority level PSR[10:8], and condition codes PSR[2:0] of a program. It is important that when the processor resumes execution of the interrupted program, the privilege mode, priority level, and condition codes are restored to what they were when the interrupt occurred.
The microarchitecture of the LC-3 initiates an interrupt as follows: Recall from Figure C.2 that in state 18, while MAR is loaded with the contents of PC and PC is incremented, INT is tested.
State 18 is the only state in which the processor checks for interrupts. The reason for only testing in state 18 is straightforward: Once an LC-3 instruction starts processing, it is easier to let it finish its complete instruction cycle (FETCH, DECODE, etc.) than to interrupt it in the middle and have to keep track of how far along it was when the external device requested an interrupt (i.e., asserted INT).
If INT = 1, a 1 is produced at the output of the AND gate that INT feeds in the microsequencer (Figure C.5), which in turn makes the next state address not 100001, corresponding to state 33, but rather 110001, corresponding to state 49. This starts the initiation of the interrupt (see Figure C.7).
Several functions are performed in state 49. The PSR, which contains the privilege mode, priority level, and condition codes of the interrupted program, is loaded into MDR, in preparation for pushing it onto the Supervisor Stack. PSR[15] is cleared, reflecting the change to Supervisor mode, since all interrupt service routines execute in Supervisor mode. The 3-bit priority level and 8-bit interrupt vector (INTV) provided by the interrupting device are recorded. PSR[10:8] is loaded with the priority level. The internal register Vector is loaded with INTV. Finally, the processor must test the old PSR[15] to see which stack R6 points to before pushing PSR and PC.
If the old PSR[15] = 0, the processor is already operating in Supervisor mode. R6 is the Supervisor Stack Pointer (SSP), so the processor proceeds immediately to states 37 and 41 to push the PSR of the interrupted program onto the Supervisor Stack. If PSR[15] = 1, the interrupted process was in User mode. In that case, the USP (the current contents of R6) must be saved in Saved_USP and R6 must be loaded with the contents of Saved_SSP before moving to state 37. This is done in state 45.
The control flow from state 49 to either 37 or 45 is enabled by the 10 microsequencer control bits, as follows:

IRD/0        ; NO
COND/100     ; Test PSR[15]
J/100101
If PSR[15] = 0, control goes to state 37 (100101); if PSR[15] = 1, control goes to state 45 (101101).
In state 37, R6 (SSP) is decremented (preparing for the push), and MAR is loaded with the address of the new top of the stack.
In state 41, the memory is enabled to WRITE (MIO.EN/YES, R.W/WR). When the write completes, signaled by R = 1, PSR has been pushed onto the Supervisor Stack, and the flow moves on to state 43.
In state 43, the PC is loaded into MDR. Note that state 43 says MDR is loaded with PC-1. Recall that in state 18, at the beginning of the instruction cycle for the interrupted instruction, PC was incremented. Loading MDR with PC-1 adjusts PC to the correct address of the interrupted instruction.
In states 47 and 48, the same sequence as in states 37 and 41 occurs, only this time, the PC of the interrupted process is pushed onto the Supervisor Stack.
The final task to complete the initiation of the interrupt is to load the PC with the starting address of the interrupt service routine. This is carried out by states 50, 52, and 54. It is accomplished in a manner similar to the loading of the PC with the starting address of a TRAP service routine. The event causing the INT request supplies the 8-bit interrupt vector INTV associated with the interrupt, similar to the 8-bit trap vector contained in the TRAP instruction. This interrupt vector is stored in the 8-bit register INTV, shown on the data path in Figure C.8.
The interrupt vector table occupies memory locations x0100 to x01FF. In state 50, the interrupt vector that was loaded into Vector in state 49 is added to the base address of the interrupt vector table (x0100) and loaded into MAR. In state 52, memory is enabled to READ. When R = 1, the read has completed and MDR contains the starting address of the interrupt service routine. In state 54, the PC is loaded with that starting address, completing the initiation of the interrupt.
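The net effect of states 49 through 54 can be summarized in a behavioral sketch. It ignores clock cycles and the memory ready signal and simply applies the register transfers in order. The dictionary-based CPU model, the function name `initiate_interrupt`, and the sample values (stack addresses, vector x80, service routine address x1200) are assumptions made for this illustration only.

```python
WORD = 0xFFFF


def initiate_interrupt(cpu: dict, memory: dict, intv: int, priority: int) -> None:
    """Apply the register transfers of interrupt initiation (states 49 to 54)."""
    old_psr = cpu["PSR"]

    # State 49: MDR <- PSR, PSR[15] <- 0, PSR[10:8] <- priority, Vector <- INTV.
    mdr = old_psr
    cpu["PSR"] = (old_psr & 0x78FF) | ((priority & 0x7) << 8)
    vector = intv & 0xFF

    # State 45 (only if the interrupted program was in User mode): switch stacks.
    if old_psr & 0x8000:
        cpu["Saved_USP"] = cpu["R6"]
        cpu["R6"] = cpu["Saved_SSP"]

    # States 37 and 41: push the old PSR onto the Supervisor Stack.
    cpu["R6"] = (cpu["R6"] - 1) & WORD
    memory[cpu["R6"]] = mdr

    # States 43, 47, and 48: push PC-1, the address of the interrupted instruction.
    mdr = (cpu["PC"] - 1) & WORD
    cpu["R6"] = (cpu["R6"] - 1) & WORD
    memory[cpu["R6"]] = mdr

    # States 50, 52, and 54: fetch the service routine address from the vector table.
    cpu["PC"] = memory[0x0100 + vector]


# Hypothetical starting point: a User mode program, PC already incremented in state 18.
cpu = {"PSR": 0x8002, "PC": 0x3001, "R6": 0xFE00, "Saved_SSP": 0x3000, "Saved_USP": 0}
memory = {0x0180: 0x1200}   # vector table entry for the assumed vector x80
initiate_interrupt(cpu, memory, intv=0x80, priority=4)
print(hex(cpu["PC"]), hex(cpu["R6"]), hex(cpu["PSR"]))
```

After the call, the Supervisor Stack holds the old PSR at x2FFF and the address of the interrupted instruction (x3000) at x2FFE, with R6 pointing at the latter.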
It is important to emphasize that the LC-3 supports two stacks, one for each privilege mode, and two stack pointers (USP and SSP), one for each stack. R6 is the stack pointer and is loaded from the Saved_SSP when privilege changes from User mode to Supervisor mode, and from Saved_USP when privilege changes from Supervisor mode to User mode. Needless to say, when the Privilege mode changes, the current value in R6 must be stored in the appropriate "Saved" stack pointer in order to be available the next time the privilege mode changes back.
C.6.2 Returning from an Interrupt, RTI
The interrupt service routine ends with the execution of the RTI instruction. The job of the RTI instruction is to restore the computer to the state it was in when the interrupt was initiated. This means restoring the PSR (i.e., the privilege mode, priority level, and the values of the condition codes N, Z, P) and restoring the PC. Recall that these values were pushed onto the stack during the initiation of the interrupt. They must, therefore, be popped off the stack in the reverse order.
The first state after DECODE is state 8. Here we load the MAR with the address of the top of the Supervisor Stack, which contains the last thing pushed (that has not been subsequently popped): the state of the PC when the interrupt was initiated. At the same time, we test PSR[15] since RTI is a privileged instruction and can only execute in Supervisor mode. If PSR[15] = 0, we can continue to carry out the requirements of RTI.
PSR[15] = 0; RTI Completes Execution
States 36 and 38 complete the operation of restoring PC to the value it had when the interrupt was initiated. In state 36, the memory is read. When the read is completed, MDR contains the address of the instruction that was to be processed next when the interrupt occurred. State 38 loads that address into the PC.
States 39, 40, 42, and 34 restore the privilege mode, priority level, and condition codes (N, Z, P) to their original values. In state 39, the Supervisor Stack Pointer is incremented so that it points to the top of the stack after the PC was popped. The MAR is loaded with the address of the new top of the stack. State 40 initiates the memory READ; when the READ is completed, MDR contains the interrupted PSR. State 42 loads the PSR from MDR, and state 34 increments the stack pointer.
The only thing left is to check the privilege mode of the interrupted program to see whether the stack pointers have to be switched. In state 34, the microsequencer control bits are as follows:

IRD/0        ; NO
COND/100     ; Test PSR[15]
J/110011
If PSR[15] = 0, control flows to state 51 (110011) to do nothing for one cycle. If PSR[15] = 1, control flows to state 59 where R6 is saved in Saved_SSP and R6 is loaded from Saved_USP. In both cases control returns to state 18 to begin processing the next instruction.
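The successful RTI path can be sketched in the same behavioral style: pop the PC, pop the PSR, and switch stack pointers if the restored PSR indicates User mode. The dictionary-based CPU model, the function name `rti`, and the sample stack contents are assumptions for illustration. Raising a Python exception is only a stand-in for the case where PSR[15] = 1 when RTI is attempted; the microarchitecture's actual response to that case is described in the text.

```python
WORD = 0xFFFF


def rti(cpu: dict, memory: dict) -> None:
    """Apply the register transfers of RTI when executed in Supervisor mode."""
    # State 8: RTI is privileged, so PSR[15] must be 0 to continue.
    if cpu["PSR"] & 0x8000:
        raise PermissionError("privilege mode violation: RTI in User mode")

    # States 8, 36, and 38: pop the saved PC.
    cpu["PC"] = memory[cpu["R6"]]

    # States 39, 40, and 42: pop the saved PSR.
    cpu["R6"] = (cpu["R6"] + 1) & WORD
    cpu["PSR"] = memory[cpu["R6"]]

    # State 34: increment the stack pointer past the popped PSR.
    cpu["R6"] = (cpu["R6"] + 1) & WORD

    # State 59 (only if returning to User mode): switch stack pointers.
    if cpu["PSR"] & 0x8000:
        cpu["Saved_SSP"] = cpu["R6"]
        cpu["R6"] = cpu["Saved_USP"]


# Hypothetical Supervisor Stack left behind by an earlier interrupt initiation.
cpu = {"PSR": 0x0402, "PC": 0x1234, "R6": 0x2FFE, "Saved_SSP": 0, "Saved_USP": 0xFE00}
memory = {0x2FFE: 0x3000, 0x2FFF: 0x8002}
rti(cpu, memory)
print(hex(cpu["PC"]), hex(cpu["PSR"]), hex(cpu["R6"]))
```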
PSR[15] = 1; Privilege Mode Exception
If PSR[15] = 1, the processor has a privilege mode violation. It is attempting to execute RTI while the processor is in User mode, which is not allowed.
The processor responds to this situation by pushing the PSR and the address of the RTI instruction onto the Supervisor Stack and loading the PC with the starting address of the service routine that handles privilege mode violations. The processor does this in a way very similar to the mechanism for initiating interrupts.
First, in state 44, three functions are performed. The Vector register is loaded with the 8-bit vector that points to the entry in the interrupt vector table that contains the starting address of the Privilege mode violation exception service routine. This 8-bit vector is x00. The MDR is loaded with the PSR of the program that caused the violation. Third, PSR[15] is set to 0, since the service routine will execute with Supervisor privileges. Then the processor moves to state 45, where it follows the same flow as the initiation of interrupts.
The main difference between this flow and that for the initiation of interrupts comes in state 50, where MAR is loaded with x01'Vector (that is, x01 concatenated with the 8-bit contents of Vector). In the case of interrupts, Vector had previously been loaded in state 49 with INTV, which is supplied by the interrupting device. In the case of the privilege mode violation, Vector was loaded in state 44 with x00.
Two other minor differences reflect the additional functions performed in state 49 if an interrupt is initiated. First, the priority level is changed, based on the priority of the interrupting device. We do not change the priority in handling the privilege mode violation. The service routine executes at the same priority as the program that caused the violation. Second, a test to determine the privilege mode is performed for an interrupt. This is unnecessary for a privilege mode violation since the processor already knows it is executing in User mode.
C.6.3 The Illegal Opcode Exception
At the outset of Section C.6, we said the LC-3 ISA specifies two exceptions, a privilege mode violation and an illegal opcode. The privilege mode violation, as you have just seen, occurs when the processor tries to execute the RTI instruction while in User mode. The illegal opcode exception occurs if the instruction being processed specifies the undefined opcode (i.e., 1101) in bits [15:12] of the instruction. The action the processor takes is very similar to what happens when a privilege mode exception is detected. That is, the PSR and PC of the program are pushed onto the Supervisor Stack and the PC is loaded with the starting address of the Illegal Opcode Exception service routine. That initiates the service routine. From there, the service routine does whatever has been specified as the corrective action when an illegal opcode is detected.
The fact that the processor is in state 13 is enough to know that an illegal opcode is being processed. The reason: the only way it could get there is via the IR decode state 32. State 13 starts the initiation of the exception. State 13 is very similar to state 49, which starts the initiation of an interrupt, and state 44, which starts the initiation of a privilege mode violation. As with states 49 and 44, the Vector register is loaded in preparation for vectoring to the Interrupt Vector Table to find the starting address of the service routine. The exception vector in this case is x01. As with states 49 and 44, state 13 sets the Privilege mode to Supervisor (PSR[15] <- 0), since the service routine executes in Supervisor mode. Also like those states, it loads the PSR into the MDR to start the process of pushing the PSR onto the Supervisor Stack.
Like state 44, it does not change the priority of the running program, since the urgency of handling the exception is the same as the urgency of executing the program that contains it. Like state 49, it tests the Privilege mode of the program that contains the illegal opcode, since if the currently executing program is in User mode, the stack pointers need to be switched as was described in Section C.6.1. Like state 49, the processor then microbranches either to state 37 if the stack pointer is already pointing to the Supervisor Stack, or to state 45 if the stack pointers have to be switched. From there, the initiating sequence continues in states 37, 41, 43, etc., identical to what happens when an interrupt is initiated (Section C.6.1) or a privilege mode exception is initiated (Section C.6.2). The PSR and PC are pushed onto the Supervisor Stack and the starting address of the service routine is loaded into the PC, completing the initiation of the exception.
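Interrupts and both exceptions locate their service routines the same way: the 8-bit contents of Vector select an entry in the table that starts at x0100. The sketch below forms that table address for the two exception vectors named in this section and for one interrupt vector. The function name and the sample interrupt vector x80 are assumptions for illustration.

```python
VECTOR_TABLE_BASE = 0x0100   # the interrupt vector table occupies x0100 to x01FF

PRIVILEGE_VIOLATION_VECTOR = 0x00   # loaded into Vector in state 44
ILLEGAL_OPCODE_VECTOR = 0x01        # loaded into Vector in state 13


def vector_table_address(vector: int) -> int:
    """Address loaded into MAR in state 50: x01 concatenated with the 8-bit Vector."""
    return VECTOR_TABLE_BASE | (vector & 0xFF)


print(hex(vector_table_address(PRIVILEGE_VIOLATION_VECTOR)))   # 0x100
print(hex(vector_table_address(ILLEGAL_OPCODE_VECTOR)))        # 0x101
print(hex(vector_table_address(0x80)))                         # 0x180 (sample INTV)
```

Since the low eight bits of the base address are zero, concatenating x01 with Vector gives the same result as adding Vector to x0100.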
C.7 Control Store
Figure C.9 completes our microprogrammed implementation of the LC-3. It shows the contents of each location of the control store, corresponding to the 49 control signals required by each state of the state machine. We have left the exact entries blank to allow you, the reader, the joy of filling in the required signals yourself. The solution is available from your instructor.
Figure C.9 Specification of the control store. [Figure: a blank table with one row per control store location, labeled 000000 (State 0) through 111111 (State 63), and one column per control signal; the entries are left for the reader to fill in.]
D.1 Overview
This appendix is a C reference manual oriented toward the novice C programmer. It covers a significant portion of the language, including material not covered in the main text of this book. The intent of this appendix is to provide a quick reference to various features of the language for use during programming. Each item covered within the following sections contains a brief summary of a particular C feature and an illustrative example, when appropriate.
D.2 C Conventions
We start our coverage of the C programming language by describing the lexical elements of a C program and some of the conventions used by C programmers for writing C programs.
D.2.1 Source Files
The C programming convention is to separate programs into files of two types: source files (with the extension .c) and header files (with the extension .h). Source files, sometimes called .c or dot-c files, contain the C code for a group of related functions. For example, functions related to managing a stack data structure might be placed in a file named stack.c. Each .c file is compiled into an object file, and these objects are linked together into an executable image by the linker.
D.2.2 Header Files
Header files typically do not contain C statements but rather contain function, variable, structure, and type declarations, as well as preprocessor macros. The programming convention is to couple a header file with the source file in which the declared items are defined. For example, if the source file stdio.c contains the definitions for the functions printf, scanf, getchar, and putchar, then the header file stdio.h contains the declarations for these functions. If one of these functions is called from another .c file, then the stdio.h header file should be #included to get the proper function declarations.
The C Programming Language
D.2.3 Comments
In C, comments begin with the two-character delimiter /* and end with */. Comments can span multiple lines. Comments within comments are not legal and will generate a syntax error on most compilers. Comments within strings or character literals are not recognized as comments and will be treated as part of the character string. While some C compilers accept the C++ notation for comments (//), the ANSI C standard only allows for /* and */.
D.2.4 Literals
C programs can contain literal constant values that are integers, floating point values, characters, character strings, or enumeration constants. These literals can be used as initializers for variables, or within expressions. Some examples are provided in the following subsections.
Integer
Integer literals can be expressed in decimal, octal, or hexadecimal notation. If the literal is prefixed by a 0 (zero), it will be interpreted as an octal number. If the literal begins with 0x, it will be interpreted as hexadecimal (thus it can consist of the digits 0 through 9 and the characters a through f; uppercase A through F can be used as well). An unprefixed literal (i.e., one that doesn't begin with 0 or 0x) is in decimal notation and consists of a sequence of digits. Regardless of its base, an integer literal can be preceded by a minus sign, -, to indicate a negative value.
An integer literal can be suffixed with the letter l or L to indicate that it is of type long int. An integer literal suffixed with the letter u or U indicates an unsigned value. Refer to Section D.3.2 for a discussion of long and unsigned types.
The first three examples that follow express the same number, 87. The last two versions express it as an unsigned int value and as a long int value.
    87       /* 87 in decimal */
    0x57     /* 87 in hexadecimal */
    0127     /* 87 in octal */
    -24      /* -24 in decimal */
    -024     /* -20 (decimal), written in octal */
    -0x24    /* -36 (decimal), written in hexadecimal */
    87U
    87L
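As a brief illustration of the three notations, the following sketch compares the same value written in each base. The function names are hypothetical helpers chosen for this example; they are not part of any library.

```c
/* Hypothetical helper (name chosen for illustration): returns 1 if the
   decimal, hexadecimal, and octal spellings of 87 denote the same value. */
int same_value_in_all_bases(void)
{
    int dec = 87;     /* decimal */
    int hex = 0x57;   /* hexadecimal: 5*16 + 7 = 87 */
    int oct = 0127;   /* octal: 1*64 + 2*8 + 7 = 87 */
    return dec == hex && hex == oct;
}

/* Hypothetical helper: -024 is the negation of octal 24, which is
   20 in decimal. */
int negative_octal(void)
{
    return -024;
}
```

A leading zero is easy to add by accident when aligning columns of numbers, so it is worth remembering that 024 and 24 are different values.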
Floating Point
Floating point constants consist of three parts: an integer part, a decimal point, and a fractional part. The fractional part and integer part are optional, but one of the two must be present. A number preceded by a minus sign indicates a negative value. Several examples follow:

    1.    /* expresses the number 1.0 */
Floating point literals can also be expressed in exponential notation. With this form, a floating point constant (such as 1.613123) is followed by an e or E. The e or E signals the beginning of the integer exponent, which is the power of 10 by which the part preceding the exponent is multiplied. The exponent can be a negative value. The exponent is optional, and if it is used, then the decimal point is optional. Examples follow:

    6.023e23       /* 6.023 * 10^23 */
    454.323e-22    /* 454.323 * 10^(-22) */
    5E13           /* 5.0 * 10^13 */

By default, a floating point literal is a double, or double-precision floating point number. This can be modified with the optional suffix f or F, which indicates a float, or single-precision floating point number. The suffix l or L indicates a long double (see Section D.3.2).
Character
A character literal can be expressed by surrounding a particular character by single quotes, e.g., 'c'. This converts the character into the internal character code used by the computer, which for most computers today, including the LC-3, is ASCII.
Table D.1 lists some special characters that typically cannot be expressed with a single keystroke. The C programming language provides a means to state them via a special sequence of characters. The last two forms, octal and hexadecimal, specify ways of stating an arbitrary character by using its code value, stated as either octal or hex. For example, the character 'S', which has the ASCII value of 83 (decimal), can be stated as '\123' or '\x53'.
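The equivalence of the three spellings can be checked directly. This is a minimal sketch that assumes an ASCII character set, as the text does; the function name is hypothetical.

```c
/* Hypothetical helper: returns 1 if all three spellings denote the same
   character code. Assumes an ASCII execution character set. */
int s_spellings_agree(void)
{
    char plain = 'S';
    char octal = '\123';   /* octal 123 = 83 decimal */
    char hex   = '\x53';   /* hexadecimal 53 = 83 decimal */
    return plain == 83 && octal == plain && hex == plain;
}
```

An octal escape takes at most three octal digits, so any further digit that follows is treated as a separate character.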
Table D.1

    Character             Sequence
    newline               \n
    horizontal tab        \t
    vertical tab          \v
    backspace             \b
    carriage return       \r
    formfeed              \f
    audible alert         \a
    backslash             \\
    question mark         \?
    single quote          \'
    double quote          \"
    octal number          \nnn
    hexadecimal number    \xnnn

String Literals
A string literal within a C program must be enclosed within double quote characters, ". String literals have the type char *, and space for them is allocated in a special section of the memory address space reserved for literal constants. The termination character '\0' is automatically added to the character string. The following are two examples of string literals:
[ ! C 1 .. _c,:._,n , ‘.)’—.:.:
String literals can be used to initialize character strings, or they can be used wherever an object of type char * is expected, for example as an argument to a function expecting a parameter of type char *. String literals, however, cannot be used for the assignment of arrays. For example, the following code is not legal in C.
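The distinction between initializing an array and assigning to one can be sketched as follows. The function and variable names are hypothetical, and the illegal statement is left in a comment so that the code compiles.

```c
#include <string.h>

/* Hypothetical helper: copies a greeting into the caller's buffer. */
void fill_greeting(char *dest, size_t size)
{
    char word[10] = "Hello";      /* legal: a literal may initialize an array */
    const char *ptr = "Hello";    /* legal: a literal used where char * is expected */

    /* word = "Bye";    not legal: an array cannot be assigned to */

    if (size > strlen(ptr)) {     /* leave room for the terminating '\0' */
        strcpy(dest, word);       /* copying is the legal alternative */
    } else if (size > 0) {
        dest[0] = '\0';           /* buffer too small: store an empty string */
    }
}
```

Copying character by character, here through strcpy from the standard library, is the usual substitute for the assignment that the language does not allow.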
Enumeration Constants
Associated with an enumerated type (see Section D.3.1) are enumerators, or enumeration constants. These constants are of type int, and their precise value is defined by the enumerator list of an enumeration declaration. In essence, an enumeration constant is a symbolic, integral value.
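A short sketch may help show how enumerator values are assigned. The type and enumerator names below are hypothetical. In C, the first enumerator defaults to 0, and any enumerator without an explicit value is one more than its predecessor.

```c
/* Hypothetical enumerated type; the names are illustrative only. */
enum color { RED, GREEN = 5, BLUE };

/* Returns 1 if the enumerators have the values the rules predict. */
int enumerators_as_expected(void)
{
    int total = RED + GREEN + BLUE;   /* enumeration constants are of type int */
    return RED == 0 && GREEN == 5 && BLUE == 6 && total == 11;
}
```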
D.2.5 Formatting
C is a freely formatted language. The programmer is free to add spaces, tabs, carriage returns, and new lines between and within statements and declarations. C programmers often adopt a style helpful for making the code more readable, which includes adequate indenting of control constructs, consistent alignment of open and close braces, and adequate commenting that does not obstruct someone trying to read the code. See the numerous examples in the C programming chapters of the book for a typical style of formatting C code.
D.2.6 Keywords
The following list is a set of reserved words, or keywords, that have special meaning within the C language. They are the names of the primitive types, type modifiers, control constructs, and other features natively supported by the language. These names cannot be used by the programmer as names of variables, functions, or any other object that the programmer might provide a name for.

    auto       double     int        struct
    break      else       long       switch
    case       enum       register   typedef
    char       extern     return     union
    const      float      short      unsigned
    continue   for        signed     void
    default    goto       sizeof     volatile
    do         if         static     while
D.3 Types
In C, expressions, functions, and objects have types associated with them. The type of a variable, for example, indicates something about the actual value the variable represents. For instance, if a variable is of type int, then the value (which is essentially just a bit pattern) referred to by that variable will be interpreted as a signed integer. In C, there are the basic data types, which are types natively supported by the programming language, and derived types, which are types based on basic types and which include programmer-defined types.
D.3.1 Basic Data Types
There are several predefined basic types within the C language: int, float, double, and char. They exist automatically within all implementations of C, though their sizes and ranges of values depend upon the computer system being used.
int
The binary value of something of int type will be interpreted as a signed whole number. Typical computers use 32 bits to represent signed integers, expressed in 2's complement form. Such integers can take on values between (and including) -2,147,483,648 and +2,147,483,647.
float
Objects declared of type float represent single-precision floating point numbers. These numbers typically, but not always, follow the representations defined by the IEEE standard for single-precision floating point numbers, which means that the type is a 32-bit type, where 1 bit is used for sign, 8 bits for exponent (expressed in bias-127 code), and 23 bits for fraction. See Section 2.7.1.
double
Objects declared of type double deal with double-precision floating point numbers. Like objects of type float
/* Specifier*/
” ” ‘ ‘ ‘ . ; m ‘ / ~ _ ~ : c , ~ r – ~ __ ._-=:::–
causes Penguin to be 3, Riddler to be 4, and so forth.
D.3.2 Type Qualifiers
The basic types can be modified with the use of a type qualifier. These modifiers alter the basic type in some small fashion or change its default size.
signed, unsigned
The types int and char can be modified with the use of the signed and unsigned qualifiers. By default, integers are signed; the default on characters depends on the computer system.
For example, if a computer uses 32-bit 2's complement signed integers, then a signed int can have any value in the range -2,147,483,648 to +2,147,483,647. On the same machine, an unsigned int can have a value in the range 0 to +4,294,967,295.
- ,--"
    /* the signed modifier is redundant */
    /* forces the char to be interpreted as a signed value */
    /* the char will be interpreted as an unsigned value */
long, short
The qualifiers long and short allow the programmer to manipulate the physical size of a basic type. For example, the qualifiers can be used with an integer to create short int and long int.
It is important to note that there is no strict definition of how much larger one type of integer is than another. The C language states only that the size of a short int is less than or equal to the size of an int, which is less than or equal to the size of a long int. Stated more completely and precisely:

    sizeof(char) <= sizeof(short int) <= sizeof(int) <= sizeof(long int)
New computers that support 64-bit data types make a distinction on the long qualifier. On these machines, a long int might be a 64-bit integer, whereas an int might be a 32-bit integer. The range of values of types on a particular computer can be found in the standard header file <limits.h>. On most UNIX systems, it will be in the /usr/include directory.
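The size relationships and the limits header can be examined on any particular machine with a sketch along these lines. The function names are hypothetical; CHAR_BIT, INT_MAX, and UINT_MAX are standard macros from <limits.h>.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if the size ordering holds here. */
int integer_sizes_are_ordered(void)
{
    return sizeof(char) <= sizeof(short int)
        && sizeof(short int) <= sizeof(int)
        && sizeof(int) <= sizeof(long int);
}

/* Hypothetical helper: number of bits of storage occupied by an int
   on this machine, computed from CHAR_BIT in <limits.h>. */
int bits_in_int(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

Code that depends on a particular range is safer when it tests INT_MAX or LONG_MAX than when it assumes a size.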
The following are several examples of type modifiers on the integral data types.
'--l;
The long qualifier can also be used with the floating type double to create a floating point number with higher precision or larger range (if such a type is available on the computer) than a double. As stated by the ANSI C specification: the size of a float is less than or equal to the size of a double, which is less than or equal to the size of a long double.
const
A value that does not change through the course of execution can be qualified with the const qualifier. For example,
By using this qualifier, the programmer is providing information that might enable an optimizing compiler to perform more powerful optimizations on the resulting code. All variables with a const qualifier must be explicitly initialized.
D.3.3 Storage Class
Memory objects in C can be of the static or automatic storage class. Objects of the automatic class are local to a block (such as a function) and lose their value once their block is completed. By default, local variables within a function are of the automatic class and are allocated on the run-time stack (see Section 14.3.1).
Objects of the static class retain their values throughout program execution. Global variables and other objects declared outside of all blocks are of the static class. Objects declared within a function can be qualified with the static qualifier to indicate that they are to be allocated with other static objects, allowing their value to persist across invocations of the function in which they are declared. For example,
-,—- -·
~- ‘~ c, (.
_j_ —
The value of the static variable will not be lost when the activation record of the function is popped off the stack. To enable this, the compiler will allocate a static local variable in the global data section. Every call of the function updates the value of the variable.
Unlike typical local variables of the automatic class, variables of the static class are initialized to zero. Variables of the automatic class must be initialized by the programmer.
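The behavior of a static local variable can be sketched with a function that counts its own invocations. The name count_calls is hypothetical and is used only for this illustration.

```c
/* Hypothetical function: counts how many times it has been called. */
int count_calls(void)
{
    static int calls;   /* static class: starts at zero, keeps its value */
    calls++;
    return calls;
}
```

If the static qualifier were removed, calls would be an uninitialized automatic variable and the function's result would be unpredictable.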
There is a special qualifier called register that can be applied to objects in the automatic class. This qualifier provides a hint to the compiler that the value is frequently accessed within the code and should be allocated in a register to potentially enhance performance. The compiler, however, treats this only as a suggestion and can override or ignore this specifier based on its own analysis.
Functions, as well as variables, can be qualified with the qualifier extern. This qualifier indicates that the function's or variable's storage is defined in another object module that will be linked together with the current module when the executable is constructed.
D.3.4 Derived Types
The derived types are extensions of the basic types provided by C. The derived types include pointers, arrays, structures, and unions. Structures and unions enable the programmer to create new types that are aggregations of other types.
Arrays
An array is a sequence of objects of a particular type that is allocated sequentially in memory. That is, if the first element of the array of type T is at memory location X, the next element will be at memory location X + sizeof(T), and so forth. Each element of the array is accessible using an integer index, starting with the index 0. That is, the first element of an array is numbered 0. The size of the array must be stated as a constant integral expression (it is not required to be a literal) when the array is declared.

    char string[100];   /* Declares array of 100 characters */
    int data[20];       /* Declares array of 20 integers */
To access a particular element within an array, an index is formed using an integral expression within square brackets,
    data[0]          /* Accesses first element of array data */
    data[i + 3]      /* The variable i must be an integer */
    string[x + y]    /* x and y must be integers */
The compiler is not required to check (nor is it required to generate code to check) whether the value of the index falls within the bounds of the array. The responsibility of ensuring proper access to the array is upon the programmer. For example, based on the previous declarations and array expressions, in the reference string[x + y], the value of x + y should be less than 100 (and not negative); otherwise the reference exceeds the bounds of the array string.
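Because the language does not check array bounds, a program that cannot otherwise guarantee a valid index has to test it explicitly. The following is one possible sketch; DATA_SIZE and store_checked are hypothetical names.

```c
#define DATA_SIZE 20

static int data[DATA_SIZE];   /* 20 integers, indices 0 through 19 */

/* Hypothetical helper: stores value at data[index] only if the index is
   within bounds. Returns 1 on success and 0 if the index was rejected. */
int store_checked(int index, int value)
{
    if (index < 0 || index >= DATA_SIZE) {
        return 0;             /* the language itself would not have caught this */
    }
    data[index] = value;
    return 1;
}
```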
Pointers
Pointers are objects that are addresses of other objects. Pointer types are declared by prefixing an identifier with an asterisk, *. The type of a pointer indicates the type of the object that the pointer points to. For example,

    int *v;   /* v points to an integer */

C allows a restricted set of operations to be used on pointer variables. Pointers can be manipulated in expressions, thus allowing "pointer arithmetic" to be performed. C allows assignment between pointers of the same type, or assignment of a pointer to 0. Assignment of a pointer to the constant value 0 causes the generation of a null pointer. Integer values can be added to or subtracted from a pointer value. Also, pointers of the same type can be compared (using the relational operators) or subtracted from one another, but this is meaningful only if the pointers involved point to elements of the same array. All other pointer manipulations are not explicitly allowed in C but can be done with the appropriate casting.
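The permitted pointer operations (adding an integer, comparing, and subtracting) can be seen together in a small sketch. Both function names are hypothetical; ptrdiff_t is the standard type for the result of subtracting two pointers.

```c
#include <stddef.h>

/* Hypothetical helper: sums n integers by walking a pointer through the array. */
int sum_with_pointer(const int *first, int n)
{
    const int *p;
    const int *end = first + n;       /* adding an integer to a pointer */
    int sum = 0;

    for (p = first; p < end; p++) {   /* comparing pointers into the same array */
        sum += *p;
    }
    return sum;
}

/* Hypothetical helper: number of elements between two pointers
   into the same array. */
ptrdiff_t elements_between(const int *low, const int *high)
{
    return high - low;                /* subtracting pointers of the same type */
}
```

Pointer arithmetic is scaled by the size of the pointed-to type, so p++ advances to the next int rather than to the next byte.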
Structures
Structures enable the programmer to specify an aggregate type. That is, a structure consists of member elements, each of which has its own type. The programmer can specify a structure using the following syntax. Notice that each member element has its own type.

    struct tag_id {
       type1 member1;
       type2 member2;
       ...
       typeN memberN;
    };

This structure has member elements named member1 of type type1, member2 of type2, up to memberN of typeN. Member elements can take on any basic or derived type, including other programmer-defined types.
The programmer can specify an optional tag, which in this case is tag_id. Using the tag, the programmer can declare structure variables, such as the variable x in the following declaration:

    struct tag_id x;

A structure is defined by its tag. Multiple structures can be declared in a program with the same member elements and member element identifiers; they are different if they have different tags.
Alternatively, variables can be declared along with the structure declaration, as shown in the following example. In this example, a variable is declared along with the structure. An array is then declared using the structure tag.
c._; l _L – ,_
–i– :-oc r-
    /* declares an array of structure type variables */
See Section 19.2 for more information on structures.
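Both styles of declaration, a variable declared along with the structure and an array declared later through the tag, might look like the following sketch. The tag point and the names origin, corners, and make_rectangle are hypothetical.

```c
/* Hypothetical structure with a tag, plus a variable declared with it. */
struct point {
    int x;
    int y;
} origin;                       /* declared with the structure; static class, so zeroed */

struct point corners[4];        /* array declared later using the tag */

/* Hypothetical helper: fills corners[] with a w-by-h rectangle at the origin. */
void make_rectangle(int w, int h)
{
    corners[0] = origin;        /* structures of the same type can be assigned */
    corners[1] = origin;  corners[1].x = w;
    corners[2] = origin;  corners[2].x = w;  corners[2].y = h;
    corners[3] = origin;  corners[3].y = h;
}
```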
Unions
Structures are containers that hold multiple objects of various types. Unions, on the other hand, are containers that hold a single object that can take on different predetermined types at various points in a program. For example, the following is the declaration of a union variable:
The union variable ultimately contains bits. These bits can be an integer, double, or character data type, depending on what the programmer decides to put there. For example, the variable will be treated as an integer or as a double-precision floating point value, depending on which member the expression names.
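A union of this kind might be sketched as follows. The union tag, member names, and function names are all hypothetical and were chosen only for this illustration.

```c
/* Hypothetical union: one storage area viewed as one of three types. */
union number {
    int integer;
    double real;
    char letter;
};

/* Hypothetical helper: stores a double, then an int, in the same union
   and returns the int. Only the member written last is read back. */
int last_member_written(void)
{
    union number n;
    n.real = 2.5;       /* the bits now represent a double */
    n.integer = 42;     /* the same storage now holds an int */
    return n.integer;
}

/* Returns 1 if the union is at least as large as its largest member. */
int union_holds_largest_member(void)
{
    return sizeof(union number) >= sizeof(double);
}
```

The program itself must keep track of which member was stored most recently; the union records nothing about it.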
Like the logical AND and logical OR operators, the conditional expression short-circuits the evaluation of expressionB or expressionC, depending on the state of expressionA. See Section D.5.4.
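Short-circuiting matters when one of the two branches would be unsafe to evaluate. In the sketch below (safe_divide is a hypothetical name), the division is never performed when the divisor is zero.

```c
/* Hypothetical helper: divides only when the divisor is nonzero.
   Because only one of the two branches is evaluated, the division
   is skipped entirely when divisor is 0. */
int safe_divide(int dividend, int divisor)
{
    return (divisor != 0) ? (dividend / divisor) : 0;
}
```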
D.5.8 Pointer, Array, and Structure Operators
This final batch of operators performs address-related operations for use with the derived data types.
Address Operator
The address operator is the &. It takes the address of its operand. The operand must be a memory object, such as a variable, array element, or structure member.
Dereference Operator
The complement of the address operator is the dereference operator, *. It returns the object to which the operand is pointing. For example, given the following code:

    int *p;
    int x = 5;
    p = &x;

the expression *p returns x. When *p appears on the left-hand side of an assignment operator, it is treated as an lvalue (see Section D.5.1). Otherwise *p evaluates to the value of x.
Array Reference
In C, an integral expression within square brackets, [ ], designates a subscripted array reference. The typical use of this operator is with an object declared as an array. The following code contains an example of an array reference on the array
Structure and Union References
C contains two operators for referring to member elements within a structure or union. The first is the dot, or period, which directly accesses the member element of a structure or union variable. The following is an example:
c-:-:1-·1::::-:t ~~-oint ~-~Ipc ‘.
Ir!\” .:nt:
p·; “~e L. 3: p::..xf.’ 1 .
The variable pixel is a structure variable, and its member elements are accessed using the dot operator.
The second means of accessing member elements of a structure is the arrow, or ->, operator. Here, a pointer to a structure or union can be dereferenced and a member element selected with a single operator. The following code demonstrates:

    ptr = &pixel;
    ptr->x = ptr->x + 1;

Here, the pointer variable ptr points to the structure variable pixel.
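The dot and arrow operators can be seen side by side in a complete sketch. The structure tag pixel_type and both function names are hypothetical.

```c
struct pixel_type {     /* hypothetical structure for this sketch */
    int x;
    int y;
};

/* Hypothetical helper: moves a pixel one step right through a pointer. */
void step_right(struct pixel_type *ptr)
{
    ptr->x = ptr->x + 1;      /* same as (*ptr).x = (*ptr).x + 1 */
}

/* Hypothetical helper: builds a pixel with the dot operator, then
   updates it through its address; returns the resulting x. */
int dot_then_arrow(void)
{
    struct pixel_type pixel;
    pixel.x = 3;              /* dot: direct access to a member */
    pixel.y = 4;
    step_right(&pixel);       /* & takes the address of pixel */
    return pixel.x;
}
```

The parentheses in (*ptr).x are required because the dot operator has higher precedence than the dereference operator, which is one reason the arrow form is preferred.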
D.5.9 sizeof
The sizeof operator returns the number of bytes required to store an object of the type specified. For example, sizeof(int) will return the number of bytes occupied by an integer. If the operand is an array, then sizeof will return the size of the array. The following is an example:

    int list[45];

    struct example_struct {
       int valueA;
       int valueB;
    };
    typedef struct example_struct Example;

    sizeA = sizeof(list);      /* 45 * sizeof(int) */
    sizeB = sizeof(Example);   /* Size of structure */
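A common use of sizeof on an array is to compute its element count rather than repeat the constant. The sketch below uses hypothetical names (struct sample, element_count, struct_size_covers_members).

```c
#include <stddef.h>

struct sample {          /* hypothetical structure */
    int a;
    double b;
};

/* Hypothetical helper: number of elements in a 45-element int array,
   computed from sizes rather than written as a constant. */
size_t element_count(void)
{
    int list[45];
    return sizeof(list) / sizeof(list[0]);
}

/* Returns 1 if the structure is at least as large as its members combined
   (padding may make it larger). */
int struct_size_covers_members(void)
{
    return sizeof(struct sample) >= sizeof(int) + sizeof(double);
}
```

This idiom works only where the array itself is visible. Inside a function that receives the array as a parameter, the parameter is a pointer, and sizeof yields the size of the pointer instead.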
Table D.4 Operator Precedence, from Highest to Lowest. Descriptions of Some Operators Are Provided in Parentheses

    Precedence Group   Associativity   Operators
    1 (highest)        l to r          ()  []  .  ->
    2                  r to l          ++  --  (postfix versions)
    3                  r to l          ++  --  (prefix versions)
    4                  r to l          * (indirection)  & (address of)  + (unary)  - (unary)  ~  !  sizeof
    5                  r to l          (type) (type cast)
    6                  l to r          * (multiplication)  /  %
    7                  l to r          + (addition)  -
    8                  l to r          <<  >>
    9                  l to r          <  >  <=  >=
    10                 l to r          ==  !=
    11                 l to r          &
    12                 l to r          ^
    13                 l to r          |
    14                 l to r          &&
    15                 l to r          ||
    16                 r to l          ?:
    17 (lowest)        r to l          =  +=  -=  *=  etc.

D.5.10 Order of Evaluation
The order of evaluation of an expression starts at the subexpression in the innermost parentheses, with the operator with the highest precedence, moving to the operator with the lowest precedence within the same subexpression. If two operators have the same precedence (for example, two occurrences of the same operator in one expression), then the associativity of the operators determines the order of evaluation, either from left to right or from right to left. The evaluation of the expression continues recursively from there.
Table D.4 provides the precedence and associativity of the C operators. The operators of highest precedence are listed at the top of the table, in lower numbered precedence groups.
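The effect of precedence and associativity can be checked with a few small expressions. The function names below are hypothetical helpers for this illustration.

```c
/* Hypothetical helpers illustrating precedence and associativity. */

/* * binds tighter than +, so this is 2 + (3 * 4). */
int precedence_example(void)
{
    return 2 + 3 * 4;
}

/* - associates left to right, so this is (10 - 4) - 3. */
int left_to_right_example(void)
{
    return 10 - 4 - 3;
}

/* = associates right to left, so b is assigned first, then a. */
int right_to_left_example(void)
{
    int a, b;
    a = b = 7;
    return a + b;
}
```

When in doubt, adding parentheses costs nothing at run time and makes the intended grouping explicit.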
D.5.11 Type Conversions
Consider the following expression involving the operator op.

    A op B

The resulting value of this expression will have a particular type associated with it. This resulting type depends on (1) the types of the operands A and B, and (2) the nature of the operator op.
If the types of A and B are the same and the operator can operate on that type, the result is the type defined by the operator.
When an expression contains variables that are a mixture of the basic types, C performs a set of standard arithmetic conversions of the operand values. In general, smaller types are converted into larger types, and integral types are converted into floating types. For example, if A is of type double and B is of type int, the result is of type double. Integral values, such as char, int, or an enumerated type, are converted to int (or unsigned int, depending on the implementation). The following are examples.
    /* This expression is an integer */
    /* This expression is an integer */
    /* This expression is a float */
    /* This expression is a float */
    /* This expression is a double */
    /* This is a double */
As in case (2) above, some operators require operands of a particular type or generate results of a particular type. For example, the modulus operator % only operates on integral values. Here integral type conversions are performed on the operands (e.g., char is converted to int). Floating point values are not allowed and will generate compilation errors.
If a floating point type is converted to an integral type (which does not happen with the usual type conversion, but can happen with casting as described in the next subsection), the fractional portion is discarded. If the resulting integer cannot be represented by the integral type, the result is undefined.
Casting
The programmer can explicitly control the type conversion process by type casting. A cast has the general form:

    (new-type) expression

Here the expression is converted into the new-type using the usual conversion rules described in the preceding paragraphs. Continuing with the previous example code:

    /* This results in conversion of double into an integer */
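The difference between an implicit conversion and an explicit cast can be sketched with three small functions. All three names are hypothetical.

```c
/* Hypothetical helper: the cast converts the sum to double, so the
   division that follows is a floating point division. */
double average_of_two(int a, int b)
{
    return (double)(a + b) / 2;
}

/* Hypothetical helper: a cast from double to int discards the fraction. */
int drop_fraction(double value)
{
    return (int)value;
}

/* Hypothetical helper: without a cast, int / int stays an int. */
int integer_average(int a, int b)
{
    return (a + b) / 2;
}
```

Note that the cast discards the fractional portion rather than rounding, so both positive and negative values move toward zero.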
D.6 Expressions and Statements
In C, the work performed by a program is described by the expressions and statements within the bodies of functions.
D.6.1 Expressions
An expression is any legal combination of constants, variables, operators, and function calls that evaluates to a value of a particular type. The order of evaluation is based on the precedence and associativity rules described in Section D.5.10.
The type of an expression is based on the individual elements of the expression, according to the C type promotion rules (see Section D.5.11). If all the elements of an expression are int types, then the expression is of int type. Following are several examples of expressions:

    a++ - c / 3
D.6.2 Statements
In C, simple statements are expressions terminated by a semicolon, ;. Typically, statements modify a variable or have some other side effect when the expression is evaluated. Once a statement has completed execution, the next statement in sequential order is executed. If the statement is the last statement in its function, then the function terminates.

    c = a * a + b * b;   /* Two simple statements */
    b = a++ - c / 3;
Related statements can be grouped together into a compound statement, or block, by surrounding them with curly braces, { }. Syntactically, the compound statement is the same as a simple statement, and they can be used interchangeably.

    {                        /* One compound statement */
       c = a * a + b * b;
       b = a++ - c / 3;
    }

D.7 Control
The control constructs in C enable the programmer to alter the sequential execution of statements with statements that execute conditionally or iteratively.
D.7.1 If
An if statement has the format

    if (expression)
       statement

If the expression, which can be of any basic, enumerated, or pointer type, evaluates to a nonzero value, then the statement, which can be a simple or compound statement, is executed.

    if (x < 0)
       a = b + c;   /* Executes if x is less than zero */

See Section 13.2.1 for more examples of if statements.
D.7.2 If-else
An i r_ – •0 bs· statement has the format
i::: (~_,xp1_:-2;:..::::~c,n:1 :::1_,1.t-1:-rcip:·,:~ ~
Tf the exprc.1sion. which can be of any basic, enumerated, or pointer type. eval- uates to a nonzero value. then statementI is executed. Otherwise. statement2 is executed. Both statement I and statement2 can be simple or compound statements.
/* Executes if x is less than zero */
/* Otherwise, this is executed. */
See Section 13.2.2 for more examples of if-else statements.
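The sketch below is one way to exercise both branches of an if-else. The function name combine and the two arithmetic statements are assumptions chosen for the illustration.

```c
/* Hypothetical helper, for illustration only. */
int combine(int x, int b, int c)
{
    int a;

    if (x < 0)
        a = b + c;   /* Executes if x is less than zero */
    else
        a = b - c;   /* Otherwise, this is executed */

    return a;
}
```

Exactly one of the two statements runs on each call: combine(-1, 5, 3) returns 8 and combine(0, 5, 3) returns 2.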
D.7.3 Switch
A switch statement has the following format:
switch (expression) {
case const-expr1:
    statement
    ...
case const-expr2:
    statement
    ...
case const-exprN:
    statement
    ...
}
A switch statement is composed of an expression, which must be of integral type (see Section D.3.1), followed by a compound statement (though it is not required to be compound, it almost always is). Within the compound statement exist one or more case labels, each with an associated constant integral expression, called const-expr1, const-expr2, and const-exprN in the preceding example. Within a switch, each case label must be different.
When a switch is encountered, the controlling expression is evaluated. If one of the case labels matches the value of expression, then control jumps to the statement that follows and proceeds from there.
The special case label default can be used to catch the situation where none of the other case labels match. If the default case is not present and none of the labels match the value of the controlling expression, then no statements within the switch are executed.
The following is an example of a code segment that uses a switch statement. The use of the break statement causes control to leave the switch.
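A minimal sketch of such a code segment is given here. The function name classify_key, the characters used as case labels, and the returned codes are all assumptions made for the example.

```c
/* Hypothetical helper; labels and return codes are illustrative. */
int classify_key(char key)
{
    int code;

    switch (key) {
    case '+':
        code = 1;
        break;       /* Leave the switch */
    case '-':
        code = 2;
        break;
    case 'x':        /* No break: control proceeds into the next case */
    case '*':
        code = 3;
        break;
    default:         /* Reached when no other case label matches */
        code = 0;
        break;
    }

    return code;
}
```

Because control proceeds from the matching label onward, 'x' and '*' share the same statements; without the break statements, every case would continue into the ones below it.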
D.7.5 For
A for statement has the following format:
for (initializer; term-expr; reinitializer) statement
The for statement is an iteration construct. The initializer, which is an expression, is evaluated only once, before the loop begins. The term-expr is an expression that is evaluated before each iteration of the loop. If the term-expr evaluates to nonzero, the loop progresses; otherwise the loop terminates and control passes to the statement following the loop. Each iteration of the loop consists of the execution of the statement, which makes up the body of the loop, and the evaluation of the reinitializer expression.
The following example is a for loop that iterates 100 times.
for (x = 0; x < 100; x++)
See Section 13.3.2 for more examples of for statements.
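To make the iteration count concrete, the sketch below wraps a loop of this shape in a function that counts how many times the body runs. The function name count_iterations and the counter variable are hypothetical.

```c
/* Hypothetical helper, for illustration only. */
int count_iterations(void)
{
    int x;
    int iterations = 0;

    /* Initializer runs once; the term-expr x < 100 is tested before
       each iteration; the reinitializer x++ runs after each body. */
    for (x = 0; x < 100; x++) {
        iterations++;
    }

    return iterations;   /* 100: the body ran for x = 0 through 99 */
}
```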
D.7.6 Do-while
A do-while statement has the format
do statement while (expression);
The do-while statement is an iteration construct similar to the while statement. When a do-while is first encountered, the statement that makes up the loop body is executed first, then the expression is evaluated to determine whether to execute another iteration. If it is nonzero, then another iteration is executed (in other words, statement is executed again). In this manner, a do-while always executes its loop body at least once.
The following do-while loop iterates 100 times.
See Section 13.3.3 for more examples of do-while statements.
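One possible form of such a loop is sketched below, wrapped in a function so that the number of iterations can be observed. The function name count_do_while and its start parameter are assumptions; the parameter is there to show that the body runs once even when the expression is false from the outset.

```c
/* Hypothetical helper, for illustration only. */
int count_do_while(int start)
{
    int x = start;
    int iterations = 0;

    do {
        iterations++;      /* The body executes before the test */
        x++;
    } while (x < 100);     /* Evaluated after each execution of the body */

    return iterations;
}
```

Starting from 0 the body runs 100 times; starting from 500 it still runs once, since the expression is not evaluated until the body has executed.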
D.7.7 Break
A break statement has the format:
break;
The break statement can only be used in an iteration statement or in a switch statement. It passes control out of the smallest statement containing it to the statement immediately following. Typically, break is used to exit a loop before the terminating condition is encountered.
In the following example, the execution of the break statement causes control to pass out of the for loop.
See Section 13.5.2 for more examples of break statements.
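As a sketch of break leaving a loop early, the function below searches for the smallest divisor of n greater than 1 and stops as soon as one is found. The function name smallest_divisor and the search itself are assumptions chosen for illustration.

```c
/* Hypothetical helper, for illustration only. */
int smallest_divisor(int n)
{
    int d;
    int found = n;          /* If nothing smaller divides n, report n */

    for (d = 2; d < n; d++) {
        if (n % d == 0) {
            found = d;
            break;          /* Exit the for loop before d reaches n */
        }
    }

    return found;           /* Control arrives here after the break */
}
```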
D.7.8 continue
A continue statement has the following format:
continue;
The continue statement can be used only in an iteration statement. It prematurely terminates the execution of the loop body. That is, it terminates the current iteration of the loop. The looping expression is evaluated to determine whether another iteration should be performed. In a for loop, the reinitializer is also evaluated.
If the continue statement is executed, then x is incremented, and the reinitializer executed, and the loop expression evaluated to determine if another iteration should be executed.
See Section 13.5.2 for more examples of continue statements.
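The behavior of continue inside a for loop can be sketched as follows. The function name sum_of_odds_below and the even/odd test are assumptions made for the example.

```c
/* Hypothetical helper, for illustration only. */
int sum_of_odds_below(int limit)
{
    int x;
    int sum = 0;

    for (x = 0; x < limit; x++) {
        if (x % 2 == 0)
            continue;   /* Skip the rest of the body; x++ is still evaluated */
        sum += x;       /* Reached only for odd x */
    }

    return sum;
}
```

For a limit of 10, the statement after the continue runs only for 1, 3, 5, 7, and 9, so the function returns 25.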
D.7.9 return
A return statement has the format
return expression;
The return statement causes control to return to the current caller function, that is, the function that called the function that contains the return statement. Also, after the last statement of a function is executed, an implicit return is made to the caller.
The expression that follows the return is the return value generated by the function. It is converted to the return type of the function. If a function returns a value, and yet no return statement within the function explicitly generates a return value, then the return value is undefined.
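Two short sketches illustrate these points. The function names absolute_value and truncate_to_int are hypothetical; the second relies on the conversion of the returned expression to the function's return type.

```c
/* Hypothetical helpers, for illustration only. */

/* Control returns to the caller from whichever return executes first. */
int absolute_value(int x)
{
    if (x < 0)
        return -x;
    return x;
}

/* The double expression is converted to the int return type,
   which discards the fractional part. */
int truncate_to_int(double d)
{
    return d;
}
```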
D.8 The C Preprocessor
The C programming language includes a preprocessing step that modifies, in a programmer-controlled manner, the source code presented to the compiler. The most frequently used features of the C preprocessor are its macro substitution facility (#define), which replaces a sequence of source text with another sequence, and the file inclusion facility (#include), which includes the contents of a file into the source text. Both of these are described in the following subsections.
None of the preprocessor directives are required to end with a semicolon. Since #define and #include are preprocessor directives and not C statements, they are not required to be terminated by semicolons.
D.8.1 Macro Substitution
The #define preprocessor directive instructs the C preprocessor to replace occurrences of one character sequence with another. Consider the following example:
#define A B
Here, any token that matches A will be replaced by B. That is, the macro A gets substituted with B. The character A must appear as an individual sequence, i.e., the A in APPLE will not be substituted, and not appear in quoted strings, i.e., neither will "A".
The replacement text spans until the end of the line. If a longer sequence is required, the backslash character, \, can be used to continue to the next line.
Macros can also take arguments. They are specified in parentheses immediately after the text to be replaced. For example:
#define REMAINDER(X, Y) ((X) % (Y))
Here, every occurrence of the macro REMAINDER in the source code will be accompanied by two values, as in the following example.
- ·'
The macro REMAINDER will be replaced by the preprocessor with the replacement text provided in the #define, and the two arguments X and Y will be substituted with the two arguments that appear in the source code. The previous code will be modified to the following after preprocessing:
Notice that the parentheses surrounding X and Y in the macro definition were required. Without them, the macro REMAINDER would have calculated the wrong value.
While the REMAINDER macro appears to be similar to a function call, notice that it incurs none of the function call overhead associated with regular functions.
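The effect of the parentheses can be seen by comparing a two-argument remainder macro with a version that omits them. In the sketch below, the name BAD_REMAINDER, the two wrapper functions, and the argument b + 15 are assumptions introduced only for the comparison.

```c
/* Parameters parenthesized: each argument is evaluated as a unit. */
#define REMAINDER(X, Y) ((X) % (Y))

/* Hypothetical variant without inner parentheses, for comparison. */
#define BAD_REMAINDER(X, Y) (X % Y)

/* Expands to ((a) % (b + 15)). */
int remainder_demo(int a, int b)
{
    return REMAINDER(a, b + 15);
}

/* Expands to (a % b + 15), which the precedence rules
   evaluate as (a % b) + 15. */
int bad_remainder_demo(int a, int b)
{
    return BAD_REMAINDER(a, b + 15);
}
```

With a equal to 40 and b equal to 5, the first function computes 40 % 20, which is 0, while the second computes (40 % 5) + 15, which is 15.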
D.8.2 File Inclusion
The #include directive instructs the preprocessor to insert the contents of a file into the source file. Typically, the #include directive is used to attach header files to C source files. C header files typically contain #defines and declarations that are useful among multiple source files.
There are two variations of the #include directive:
#include <stdio.h>
#include "program.h"
The first variation uses angle brackets, < >, around the filename. This tells the preprocessor that the header file can be found in a predefined directory, usually determined by the configuration of the system and which contains many system-related and library-related header files, such as stdio.h. The second variation, using double quotes, " ", around the filename, instructs the preprocessor that the header file can be found in the same directory as the C source file.
D.9 Some Standard Library Functions
The ANSI C standard library contains over 150 functions that perform a variety of useful tasks (for example, I/O and dynamic memory allocation) on behalf of your program. Every installation of ANSI C will have these functions available, so even if you make use of these functions, your program will still be portable from one ANSI C platform to another. In this section, we will describe some useful standard library functions.
D.9.1 I/O Functions
The <stdio.h> header file must be included in any source file that contains calls to the standard I/O functions. Following is a small sample of these functions.
getchar
This function has the following declaration:
int getchar(void);
The function getchar reads the next character from the standard input device, or stdin. The value of this character is returned (as an integer) as the return value. The behavior of getchar is very similar to the LC-3 input TRAP (except no input banner is displayed on the screen).
Most computer systems will implement getchar using buffered I/O. This means that keystrokes (assuming standard input is coming from the keyboard) will be buffered by the operating system until the Enter key is pressed. Once Enter is pressed, the entire line of characters is added to the standard input stream.
putchar
This function has the following declaration:
int putchar(int c);
The function putchar takes an integer value representing an ASCII character and puts the character to the standard output stream. This is similar to the LC-3 TRAP OUT.
If the standard output stream is the monitor, the character will appear on the screen. However, since many systems buffer the output stream, the character may not appear until the system's output buffer is flushed, which is usually done once a newline appears in the output stream.
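As a brief sketch of putchar in use, the function below writes a string one character at a time and finishes with a newline, which on many systems is also what causes buffered output to appear. The function name put_line and its return value (the number of characters written before the newline) are assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only. */
int put_line(const char *text)
{
    int count = 0;

    while (*text != '\0') {
        putchar(*text);   /* Put one character to the standard output stream */
        text++;
        count++;
    }
    putchar('\n');        /* A newline usually flushes a line-buffered stream */

    return count;
}
```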
scanf
This function has the following declaration:
int scanf(const char *format, ...);
The function scanf is passed a format string (which is passed as a pointer to the initial character) and a list of pointers. The format string contains format specifications that control how scanf will interpret fields in the input stream. For example, the specification %d causes scanf to interpret the next sequence of non-white space characters as a decimal number. This decimal is converted from ASCII into an integer value and assigned to the variable pointed to by the next pointer in the parameter list. Table D.5 contains a listing of the possible specifications for use with scanf. The number of pointers that follow the format string in the parameter list should correspond to the number of format specifications in the format string. The value returned by scanf corresponds to the number of variables that were successfully assigned.
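The relationship between format specifications, pointers, and the return value can be sketched as follows. To keep the example independent of the keyboard, it uses sscanf, the standard library sibling of scanf that reads from a string instead of the standard input stream; the format string and return value follow the same conventions. The function name parse_point is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parses two decimal numbers from text.
   Two %d specifications, so two pointers follow the format string. */
int parse_point(const char *text, int *x, int *y)
{
    /* Returns the number of variables successfully assigned. */
    return sscanf(text, "%d %d", x, y);
}
```

Given the text "12 -7", both variables are assigned and the return value is 2. Given "12 abc", only the first conversion succeeds, so the return value is 1 and the second variable is left unchanged.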
printf
ASCII Dec  Hex   ASCII Dec  Hex   ASCII Dec  Hex   ASCII Dec  Hex
nul     0  00    sp     32  20    @      64  40    `      96  60
soh     1  01    !      33  21    A      65  41    a      97  61
stx     2  02    "      34  22    B      66  42    b      98  62
etx     3  03    #      35  23    C      67  43    c      99  63
eot     4  04    $      36  24    D      68  44    d     100  64
enq     5  05    %      37  25    E      69  45    e     101  65
ack     6  06    &      38  26    F      70  46    f     102  66
bel     7  07    '      39  27    G      71  47    g     103  67
bs      8  08    (      40  28    H      72  48    h     104  68
ht      9  09    )      41  29    I      73  49    i     105  69
lf     10  0A    *      42  2A    J      74  4A    j     106  6A
vt     11  0B    +      43  2B    K      75  4B    k     107  6B
ff     12  0C    ,      44  2C    L      76  4C    l     108  6C
cr     13  0D    -      45  2D    M      77  4D    m     109  6D
so     14  0E    .      46  2E    N      78  4E    n     110  6E
si     15  0F    /      47  2F    O      79  4F    o     111  6F
dle    16  10    0      48  30    P      80  50    p     112  70
dc1    17  11    1      49  31    Q      81  51    q     113  71
dc2    18  12    2      50  32    R      82  52    r     114  72
dc3    19  13    3      51  33    S      83  53    s     115  73
dc4    20  14    4      52  34    T      84  54    t     116  74
nak    21  15    5      53  35    U      85  55    u     117  75
syn    22  16    6      54  36    V      86  56    v     118  76
etb    23  17    7      55  37    W      87  57    w     119  77
can    24  18    8      56  38    X      88  58    x     120  78
em     25  19    9      57  39    Y      89  59    y     121  79
sub    26  1A    :      58  3A    Z      90  5A    z     122  7A
esc    27  1B    ;      59  3B    [      91  5B    {     123  7B
fs     28  1C    <      60  3C    \      92  5C    |     124  7C
gs     29  1D    =      61  3D    ]      93  5D    }     125  7D
rs     30  1E    >      62  3E    ^      94  5E    ~     126  7E
us     31  1F    ?      63  3F    _      95  5F    del   127  7F
E.3 Powers of 2
Amount   Decimal Conversion   Common Abbreviation
2^0      1
2^1      2
2^2      4
2^3      8
2^4      16
2^5      32
2^6      64
2^7      128
2^8      256
2^9      512
2^10     1,024                1K
2^11     2,048                2K
2^12     4,096                4K
2^13     8,192                8K
2^14     16,384               16K
2^15     32,768               32K
2^16     65,536               64K
2^17     131,072              128K
2^18     262,144              256K
2^19     524,288              512K
2^20     1,048,576            1M
2^30     1,073,741,824        1G
2^32     4,294,967,296        4G
Solutions to selected exercises can be found on our website:
http://www.mhhe.com/patt2
Solutions to Selected Exercises