Memory systems for conventional large-scale computers provide only limited bytes/s of data bandwidth when compared to their flop/s of instruction execution rate. The resulting bottleneck limits the bytes/flop that a processor may access from the full memory footprint of a machine and can hinder overall performance. This paper discusses physical and functional views of memory hierarchies and examines existing ratios of bandwidth to execution rate versus memory capacity (or bytes/flop versus capacity) found in a number of large-scale computers. The paper then explores a set of technologies (Proximity Communication, low-power on-chip networks, dense optical communication, and Sea-of-Anything interconnect) that can flatten this bandwidth hierarchy to relieve the memory bottleneck in a large-scale computer that we call "Hero."
Citation:
Robert Drost, Craig Forrest, Bruce Guenin, Ron Ho, Ashok V. Krishnamoorthy, Danny Cohen, John E. Cunningham, Bernard Tourancheau, Arthur Zingher, Alex Chow, Gary Lauterbach, Ivan Sutherland, "Challenges in Building a Flat-Bandwidth Memory Hierarchy for a Large-Scale Computer with Proximity Communication," 13th Symposium on High Performance Interconnects (HOTI'05), pp. 13-22, 2005.