Today, microprocessors and memories are made on distinct manufacturing lines, yielding 10M-transistor microprocessors and 256M-transistor DRAMs. Plants to manufacture these chips cost billions of dollars.
One of the biggest performance challenges in computer systems today is the speed mismatch between microprocessors and memory. To address this challenge, processor designers now typically devote a large fraction of a chip's transistors and area to large SRAM caches.
We predict that over the next decade processors and memory will be merged onto a single chip. Not only will this narrow or altogether remove the processor-memory performance gap, it will also provide an ideal building block for parallel processing, amortize the costs of fabrication lines, and make better use of the phenomenal number of transistors that can be placed on a single chip. Let's dub it an "IRAM", for Intelligent RAM, since most of the transistors on this merged chip will be devoted to memory.
A single gigabit IRAM should have an internal memory bandwidth of nearly 1000 gigabits per second (32K bits in 50 ns), a hundredfold increase over the fastest computers today. Hence the fastest programs will keep most memory accesses within a single IRAM, rewarding compact representations of code and data.
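As a rough check on the parenthetical figures, the short Python sketch below divides the bits moved per access by the access time. It assumes that "32K" means 32 x 1024 bits and that one such access completes every 50 ns; the constant and function names are illustrative only and are not part of the IRAM project.

```python
# Figures taken from the parenthetical in the text: 32K bits per 50 ns access.
BITS_PER_ACCESS = 32 * 1024   # assumption: "32K" means 32 * 1024 bits
ACCESS_TIME_NS = 50           # assumption: one internal access every 50 ns


def bandwidth_gbit_per_s(bits: int, time_ns: float) -> float:
    """Return bandwidth in gigabits per second.

    One bit per nanosecond is 1e9 bits per second, i.e. 1 Gbit/s,
    so dividing bits by nanoseconds gives Gbit/s directly.
    """
    return bits / time_ns


iram_bandwidth = bandwidth_gbit_per_s(BITS_PER_ACCESS, ACCESS_TIME_NS)
print(f"{iram_bandwidth:.2f} Gbit/s")  # prints "655.36 Gbit/s"
```

Under these assumptions the result is about 655 gigabits per second, the same order of magnitude as the rounded figure of nearly 1000 quoted above.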
The initial efforts of the IRAM project were undertaken during the Spring 1996 offering of CS 294-4 at UC Berkeley. This advanced graduate course, led by Prof. David Patterson, re-examined hardware and software designs that are based on the traditional separation of memory and processor. The course web page contains a considerable amount of useful information, including copies of slides from many guest speakers as well as the results of three sets of projects performed by more than a dozen graduate students.
An earlier discussion that helped lead to the development of this course can be found in the article "Microprocessors in 2020" by Dave Patterson, in the September 1995 issue of Scientific American (pages 48-51).
IRAM's large improvement in memory system bandwidth could help reconfigurable systems achieve their full performance potential. Reconfigurable systems offer improved performance by adapting processing capabilities to application-specific needs. But as the processing portion of an application gets faster, a conventional memory system becomes more and more of a drag on performance. Memory bandwidth is also the bottleneck to rapid reprogramming of the reconfigurable elements. For these reasons, the IRAM group is working closely with the BRASS (Berkeley Reconfigurable Architecture, Systems and Software) group, headed by Prof. John Wawrzynek and André DeHon.