ADVANCED COMPUTER ARCHITECTURE BOOK
The result of great architecture, whether in computer design, building design or textbook design, is to take the customer's requirements and desires and return a design that causes that customer to say, "Wow, I didn't know that was possible."

[Table-of-contents fragments appeared here, listing sections on performance and price-performance, ILP concepts and challenges, hardware versus software speculation, thread-level parallelism, multiple-issue processors, virtual memory and virtual machines, the design of memory hierarchies, and the role of compilers.]

Through four editions of this book, our goal has been to describe the basic principles underlying what will be tomorrow's technological developments.
Our excitement about the opportunities in computer architecture has not abated, and we echo what we said about the field in the first edition: It's a discipline of keen intellectual interest, requiring the balance of marketplace forces to cost-performance-power, leading to glorious failures and some notable successes.
Our primary objective in writing our first book was to change the way people learn and think about computer architecture. We feel this goal is still valid and important. The field is changing daily and must be studied with real examples and measurements on real computers, rather than simply as a collection of definitions and designs that will never need to be realized.
We offer an enthusiastic welcome to anyone who came along with us in the past, as well as to those who are joining us now. Either way, we can promise the same quantitative approach to, and analysis of, real systems.
As with earlier versions, we have strived to produce a new edition that will continue to be as relevant for professional engineers and architects as it is for those involved in advanced computer architecture and design courses.
As much as its predecessors, this edition aims to demystify computer architecture through an emphasis on cost-performance-power trade-offs and good engineering design. We believe that the field has continued to mature and move toward the rigorous quantitative foundation of long-established scientific and engineering disciplines. The fourth edition of Computer Architecture: A Quantitative Approach may be the most significant since the first edition. Shortly before we started this revision, Intel announced that it was joining IBM and Sun in relying on multiple processors or cores per chip for high-performance designs.
As the first figure in the book documents, after 16 years of doubling performance every 18 months, single-processor performance improvement has slowed dramatically. This fork in the computer architecture road means that for the first time in history, no one is building a much faster sequential processor. If you want your program to run significantly faster, say, to justify the addition of new features, you're going to have to parallelize your program.
Hence, after three editions focused primarily on higher performance by exploiting instruction-level parallelism (ILP), an equal focus of this edition is thread-level parallelism (TLP) and data-level parallelism (DLP).
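The parallelization point above can be made quantitative with Amdahl's law, a staple of the book's quantitative approach. Below is a minimal Python sketch; the "80% parallel" figure is an invented example, not a number from the text.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Overall speedup when only part of a program can be parallelized.

    Amdahl's law: speedup = 1 / ((1 - f) + f / n), where f is the
    fraction of execution time that benefits from n processors.
    """
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

# Even with 32 hardware threads, a program that is only 80% parallel
# speeds up far less than 32x:
print(round(amdahl_speedup(0.80, 32), 2))  # → 4.44
```

The serial fraction dominates quickly, which is exactly why the shift to TLP and DLP forces programmers, not just architects, to think about parallelism.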
This historic shift led us to change the order of the chapters. The changing technology has also motivated us to move some of the content from later chapters into the first chapter.
Because technologists predict much higher hard and soft error rates as the industry moves to semiconductor processes with feature sizes of 65 nm or smaller, we decided to move the basics of dependability from Chapter 7 in the third edition into Chapter 1. As power has become the dominant factor in determining how much you can place on a chip, we also beefed up the coverage of power in Chapter 1.
Of course, the content and examples in all chapters were updated, as we discuss below. In addition to technological sea changes that have shifted the contents of this edition, we have taken a new approach to the exercises in this edition.
It is surprisingly difficult and time-consuming to create interesting, accurate, and unambiguous exercises that evenly test the material throughout a chapter. Alas, the Web has reduced the half-life of exercises to a few months. Rather than working out an assignment, a student can search the Web to find answers not long after a book is published.
Hence, a tremendous amount of hard work quickly becomes unusable, and instructors are denied the opportunity to test what students have learned. To help mitigate this problem, in this edition we are trying two new ideas. First, we recruited experts from academia and industry on each topic to write the exercises. This means some of the best people in each field are helping us to create interesting ways to explore the key concepts in each chapter and test the reader's understanding of that material.
Second, each group of exercises is organized around a set of case studies. Our hope is that the quantitative example in each case study will remain interesting over the years, robust and detailed enough to allow instructors the opportunity to easily create their own new exercises, should they choose to do so.
Key, however, is that each year we will continue to release new exercise sets for each of the case studies. These new exercises will have critical changes in some parameters so that answers to old exercises will no longer apply. Another significant change is that we followed the lead of the third edition of Computer Organization and Design (COD) by slimming the text to include the material that almost all readers will want to see, and moving the appendices that only some readers will want onto a CD.
There were many reasons for this change. Students complained about the size of the book, which had expanded with each edition, in both chapter pages and appendix pages, on paper and online. At this rate, the fourth edition would have been larger still, both on paper and online!
Similarly, instructors were concerned about having too much material to cover in a single course. As was the case for COD, by including a CD with material moved out of the text, readers could have quick access to all the material, regardless of their ability to access Elsevier's Web site. Hence, the current edition's appendices will always be available to the reader even after future editions appear.
This flexibility allowed us to move review material on pipelining, instruction sets, and memory hierarchy from the chapters and into Appendices A, B, and C.
The advantage to instructors and readers is that they can go over the review material much more quickly and then spend more time on the advanced topics in Chapters 2, 3, and 5. It also allowed us to move the discussion of some topics that are important but are not core course topics into appendices on the CD.
In this edition we have 6 chapters, none of which is longer than 80 pages, while in the last edition we had 8 chapters, with the longest chapter weighing in at considerably more than that. This package of a slimmer core print text plus a CD is far less expensive to manufacture than the previous editions, allowing our publisher to significantly lower the list price of the book.
With this pricing scheme, there is no need for a separate international student edition for European readers.
Yet another major change from the last edition is that we have moved the embedded material introduced in the third edition into its own appendix, Appendix D. We felt that the embedded material didn't always fit with the quantitative evaluation of the rest of the material, plus it extended the length of many chapters that were already running long. We believe there are also pedagogic advantages in having all the embedded information in a single appendix.
This edition continues the tradition of using real-world examples to demonstrate the ideas, and the "Putting It All Together" sections are brand new; in fact, some were announced after our book was sent to the printer. As before, we have taken a conservative approach to topic selection, for there are many more interesting ideas in the field than can reasonably be covered in a treatment of basic principles.
We have steered away from a comprehensive survey of every architecture a reader might encounter. Instead, our presentation focuses on core concepts likely to be found in any new machine. The key criterion remains that of selecting ideas that have been examined and utilized successfully enough to permit their discussion in quantitative terms. Our intent has always been to focus on material that is not available in equivalent form from other sources, so we continue to emphasize advanced content wherever possible.
Indeed, there are several systems here whose descriptions cannot be found in the literature. Readers interested strictly in a more basic introduction to computer architecture should read Computer Organization and Design: The Hardware/Software Interface. Chapter 1 has been beefed up in this edition.
It includes formulas for static power, dynamic power, integrated circuit costs, reliability, and availability. We go into more depth than prior editions on the use of the geometric mean and the geometric standard deviation to capture the variability of the mean. Our hope is that these topics can be used through the rest of the book. In addition to the classic quantitative principles of computer design and performance measurement, the benchmark section has been upgraded to use the new SPEC suite.
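Two of the Chapter 1 quantities mentioned here are easy to sketch in Python: the dynamic-power relation (P = a · C · V² · f) and the geometric mean and geometric standard deviation used to summarize benchmark ratios. The SPECRatio numbers below are invented for illustration.

```python
import math

def dynamic_power(capacitive_load, voltage, frequency, activity=1.0):
    """Dynamic power of switching CMOS logic: P = a * C * V^2 * f."""
    return activity * capacitive_load * voltage**2 * frequency

def geometric_mean(ratios):
    """Geometric mean, as SPEC uses to summarize benchmark ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

def geometric_stddev(ratios):
    """Geometric standard deviation: exp of the stddev of the log ratios."""
    logs = [math.log(r) for r in ratios]
    mean = sum(logs) / len(logs)
    var = sum((x - mean) ** 2 for x in logs) / len(logs)
    return math.exp(math.sqrt(var))

# Hypothetical SPECRatios for one machine:
ratios = [20.0, 10.0, 40.0]
print(round(geometric_mean(ratios), 1))  # → 20.0

# The quadratic voltage term is why lowering V is so effective:
# halving the voltage cuts dynamic power by 4x.
print(dynamic_power(1e-9, 1.0, 2e9) / dynamic_power(1e-9, 0.5, 2e9))  # → 4.0
```

Because the geometric mean works on ratios of log values, it is independent of which machine is used as the reference, which is the property that makes it suitable for SPEC-style summaries.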
Our view is that the instruction set architecture is playing less of a role today than it once did, so we moved this material to Appendix B. It still uses the MIPS64 architecture.
Chapters 2 and 3 cover the exploitation of instruction-level parallelism in high-performance processors, including superscalar execution, branch prediction, speculation, dynamic scheduling, and the relevant compiler technology. As mentioned earlier, Appendix A is a review of pipelining in case you need it. Chapter 3 surveys the limits of ILP. New to this edition is a quantitative evaluation of multithreading. While the last edition contained a great deal on Itanium, we moved much of this material to Appendix G, indicating our view that this architecture has not lived up to the early claims.
Given the switch in the field from exploiting only ILP to an equal focus on thread- and data-level parallelism, we moved multiprocessor systems up to Chapter 4, which focuses on shared-memory architectures. The chapter begins with the performance of such an architecture. It then explores symmetric and distributed-memory architectures, examining both organizational principles and performance. Topics in synchronization and memory consistency models are also included.
The example is the Sun T1 ("Niagara"), a radical design for a commercial product. It reverted to a single-instruction-issue, 6-stage pipeline microarchitecture. It put 8 of these cores on a single chip, and each supports 4 threads. Hence, software sees 32 threads on this single, low-power chip. As mentioned earlier, Appendix C contains an introductory review of cache principles, which is available in case you need it.
This shift allows Chapter 5 to start with 11 advanced optimizations of caches. The chapter includes a new section on virtual machines, which offer advantages in protection, software management, and hardware management. The example is the AMD Opteron, giving both its cache hierarchy and the virtual memory scheme for its recently expanded 64-bit addresses. Chapter 6, "Storage Systems," has an expanded discussion of reliability and availability, a tutorial on RAID with a description of RAID 6 schemes, and rarely found failure statistics of real systems.
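Cache optimizations like those in Chapter 5 are conventionally evaluated with average memory access time (AMAT). A minimal sketch, with invented timing parameters rather than figures from the text:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Two hypothetical level-1 caches (all times in clock cycles).
# A larger cache misses less often but takes longer to hit:
small_fast = amat(hit_time=1, miss_rate=0.05, miss_penalty=20)
large_slow = amat(hit_time=2, miss_rate=0.03, miss_penalty=20)
print(round(small_fast, 2), round(large_slow, 2))  # → 2.0 2.6

# The formula composes across a hierarchy: the L1 miss penalty is
# itself the AMAT of the L2 cache backed by main memory.
two_level = amat(1, 0.05, amat(10, 0.2, 100))
print(round(two_level, 2))  # → 2.5
```

This trade-off is why "bigger cache" is not automatically an optimization; each of the book's 11 optimizations targets one of the three terms.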
Rather than go through a series of steps to build a hypothetical cluster as in the last edition, we evaluate the cost, performance, and reliability of a real cluster. This brings us to Appendices A through L. As mentioned earlier, Appendices A and C are tutorials on basic pipelining and caching concepts. Readers relatively new to pipelining should read Appendix A before Chapters 2 and 3, and those new to caching should read Appendix C before Chapter 5.
Appendix E, on networks, has been extensively revised by Timothy M. Pinkston and Jose Duato.
Appendix F, updated by Krste Asanovic, includes a description of vector processors. We think these two appendices are some of the best material we know of on each topic. Appendix H describes parallel processing applications and coherence protocols for larger-scale, shared-memory multiprocessing.
Appendix I, by David Goldberg, describes computer arithmetic. Appendix K collects the "Historical Perspective and References" from each chapter of the third edition into a single appendix. It attempts to give proper credit for the ideas in each chapter and a sense of the history surrounding the inventions. We like to think of this as presenting the human drama of computer design. It also supplies references that the student of architecture may want to pursue.
If you have time, we recommend reading some of the classic papers in the field that are mentioned in these sections. It is both enjoyable and educational.
Appendix L is available at textbooks. There is no single best order in which to approach these chapters and appendices, except that all readers should start with Chapter 1. If you don't want to read everything, here are some suggested sequences.
Appendix D can be read at any time, but it might work best if read after the ISA and cache sequences. Appendix I can be read whenever arithmetic moves you. The material we have selected has been stretched upon a consistent framework that is followed in each chapter. We start by explaining the ideas of a chapter.
These ideas are followed by a "Crosscutting Issues" section, a feature that shows how the ideas covered in one chapter interact with those given in other chapters. This is followed by a "Putting It All Together" section that ties these ideas together by showing how they are used in a real machine.
Next in the sequence is "Fallacies and Pitfalls," which lets readers learn from the mistakes of others. We show examples of common misunderstandings and architectural traps that are difficult to avoid even when you know they are lying in wait for you. The "Fallacies and Pitfalls" sections are among the most popular sections of the book.
Each chapter ends with a "Concluding Remarks" section. Each chapter ends with case studies and accompanying exercises. Authored by experts in industry and academia, the case studies explore key chapter concepts and verify understanding through increasingly challenging exercises.
Instructors should find the case studies sufficiently detailed and robust to allow them to create their own additional exercises. We hope this helps readers to avoid exercises for which they haven't read the corresponding section, in addition to providing the source for review. Note that we provide solutions to the case study exercises. Exercises are rated to give the reader a sense of the amount of time required to complete each one. A second set of alternative case study exercises is available for instructors who register at textbooks.
This second set will be revised every summer, so that early every fall, instructors can download a new set of exercises and solutions to accompany the case studies in the book. Additional resources are available at textbooks.
The instructor site is accessible to adopters who register at textbooks. New materials and links to other resources available on the Web will be added on a regular basis. Finally, it is possible to make money while reading this book. Talk about cost-performance! If you read the Acknowledgments that follow, you will see that we went to great lengths to correct mistakes.
Since a book goes through many printings, we have the opportunity to make even more corrections. If you uncover any remaining resilient bugs, please contact the publisher by electronic mail (ca4bugs@mkp.com).
These include: data processing other than the CPU, such as direct memory access (DMA); and other issues such as virtualization, multiprocessing, and software features. There are other types of computer architecture, such as pin architecture: the hardware functions that a microprocessor should provide to a hardware platform, including, for example, the messages the processor should emit so that external caches can be invalidated (emptied). Pin architecture functions are more flexible than ISA functions because external hardware can adapt to new encodings, or change from a pin to a message.
The term "architecture" fits, because the functions must be provided for compatible systems, even if the detailed method changes.
Definition
The purpose is to design a computer that maximizes performance while keeping power consumption in check, keeps costs low relative to the amount of expected performance, and is also very reliable. Many aspects must be considered for this, including instruction set design, functional organization, logic design, and implementation. The implementation involves integrated circuit design, packaging, power, and cooling. Optimization of the design requires familiarity with topics ranging from compilers and operating systems to logic design and packaging.
Main article: Instruction set architecture
An instruction set architecture (ISA) is the interface between the computer's software and hardware and also can be viewed as the programmer's view of the machine.
A processor only understands instructions encoded in some numerical fashion, usually as binary numbers. Programmers, however, typically write in higher-level languages; software tools such as compilers translate those high-level languages into instructions that the processor can understand.
Besides instructions, the ISA defines items in the computer that are available to a program—e.g., data types, registers, addressing modes, and memory. Instructions locate these available items with register indexes (or names) and memory addressing modes.
The ISA of a computer is usually described in a small instruction manual, which describes how the instructions are encoded. Also, it may define short, vaguely mnemonic names for the instructions.
The names can be recognized by a software development tool called an assembler. An assembler is a computer program that translates a human-readable form of the ISA into a computer-readable form. Disassemblers are also widely available, usually in debuggers and software programs to isolate and correct malfunctions in binary computer programs.
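The assembler/disassembler relationship can be sketched with a toy ISA. Everything below — the mnemonics, the register fields, the 32-bit encoding — is invented for illustration and does not correspond to any real machine.

```python
# A toy 3-operand ISA: one 8-bit opcode followed by three 8-bit
# register fields, packed into a single 32-bit instruction word.
OPCODES = {"add": 0x01, "sub": 0x02, "ld": 0x03}
MNEMONICS = {v: k for k, v in OPCODES.items()}

def assemble(line):
    """Translate one human-readable instruction into a 32-bit word."""
    mnemonic, *operands = line.replace(",", " ").split()
    regs = [int(op.lstrip("r")) for op in operands]
    word = OPCODES[mnemonic] << 24
    for shift, reg in zip((16, 8, 0), regs):
        word |= reg << shift
    return word

def disassemble(word):
    """Recover the human-readable form from an encoded word."""
    mnemonic = MNEMONICS[(word >> 24) & 0xFF]
    regs = [(word >> s) & 0xFF for s in (16, 8, 0)]
    return f"{mnemonic} r{regs[0]}, r{regs[1]}, r{regs[2]}"

word = assemble("add r1, r2, r3")
print(hex(word))          # → 0x1010203
print(disassemble(word))  # → add r1, r2, r3
```

Real assemblers add labels, pseudo-instructions, and relocation, but the core job is exactly this round trip between names and numbers.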
ISAs vary in quality and completeness. A good ISA compromises between programmer convenience (how easy the code is to understand), size of the code (how much code is required to do a specific action), cost of the computer to interpret the instructions (more complexity means more hardware needed to decode and execute the instructions), and speed of the computer (with more complex decoding hardware comes longer decode time).
Memory organization defines how instructions interact with the memory, and how memory interacts with itself. During design, emulation software (emulators) can run programs written in a proposed instruction set.
Modern emulators can measure size, cost, and speed to determine if a particular ISA is meeting its goals.
Main article: Microarchitecture
Computer organization helps optimize performance-based products.
For example, software engineers need to know the processing power of processors. They may need to optimize software in order to gain the most performance for the lowest price. This can require quite detailed analysis of the computer's organization. For example, in an SD card, the designers might need to arrange the card so that the most data can be processed in the fastest possible way.
Computer organization also helps plan the selection of a processor for a particular project. Multimedia projects may need very rapid data access, while virtual machines may need fast interrupts. Sometimes certain tasks need additional components as well. For example, a computer capable of running a virtual machine needs virtual memory hardware so that the memory of different virtual computers can be kept separated. Computer organization and features also affect power consumption and processor cost.
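The separation that virtual memory hardware provides can be sketched with a toy page table. The page size and the mappings below are invented for illustration; real hardware does this translation in a memory-management unit, not in software.

```python
PAGE_SIZE = 4096  # bytes; a common but here arbitrary choice

def translate(page_table, virtual_address):
    """Map a virtual address to a physical one via a per-process page table.

    Each process (or virtual machine) gets its own table, so identical
    virtual addresses refer to different physical memory.
    """
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    if vpn not in page_table:
        raise MemoryError("page fault: no mapping for this page")
    return page_table[vpn] * PAGE_SIZE + offset

vm_a = {0: 7}   # VM A: virtual page 0 -> physical frame 7
vm_b = {0: 42}  # VM B: virtual page 0 -> physical frame 42

# The same virtual address lands in different physical frames,
# so neither virtual machine can touch the other's memory:
print(translate(vm_a, 0x10))  # → 28688
print(translate(vm_b, 0x10))  # → 172048
```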
Main article: Implementation Once an instruction set and micro-architecture are designed, a practical machine must be developed.
This design process is called the implementation. Implementation is usually not considered architectural design, but rather hardware design engineering. Implementation can be further broken down into several steps:
Logic implementation designs the circuits required at a logic-gate level.
Circuit implementation does transistor-level designs of basic elements (gates, multiplexers, latches, etc.).
Physical implementation draws the physical circuits: the different circuit components are placed in a chip floorplan or on a board, and the wires connecting them are created.
Design validation tests the computer as a whole to see if it works in all situations and all timings. Once the design validation process starts, the design at the logic level is tested using logic emulators. However, this is usually too slow to run realistic tests. Most hobby projects stop at this stage. The final step is to test prototype integrated circuits.
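The logic-implementation and design-validation steps above can be illustrated in miniature: a gate-level one-bit full adder, validated exhaustively against its arithmetic specification. This is a simplified sketch of the idea, not a real design flow.

```python
# Gate-level "logic implementation" of a 1-bit full adder,
# built only from primitive gate functions.
def xor(a, b): return a ^ b
def and_(a, b): return a & b
def or_(a, b): return a | b

def full_adder(a, b, carry_in):
    """Sum and carry-out of three input bits, expressed as gates."""
    s1 = xor(a, b)
    total = xor(s1, carry_in)
    carry_out = or_(and_(a, b), and_(s1, carry_in))
    return total, carry_out

# "Design validation" in miniature: check the circuit against its
# specification (binary addition) for every possible input.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            total, carry = full_adder(a, b, c)
            assert carry * 2 + total == a + b + c
print("full adder validated for all 8 input combinations")
```

Exhaustive checking works for 3 input bits; for a real processor the input space is astronomically larger, which is why validation relies on logic emulators, directed tests, and formal methods instead.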
Shifts in market demand
Increases in clock frequency have grown more slowly over the past few years, compared to power reduction improvements.