Modern cluster interconnection networks rely on processing on the network interface to deliver higher bandwidth and lower latency than would otherwise be achievable. These processors are relatively slow, but they provide adequate capabilities to accelerate some portion of the protocol stack in a cluster computing environment. This offload capability is conceptually appealing, but the standard evaluation of NIC-based protocol implementations relies on simplistic microbenchmarks that create idealized usage scenarios. In this paper, we evaluate characteristics of MPI usage scenarios using application benchmarks to help define the parameter space that protocol offload implementations should target. Specifically, we analyze characteristics that we expect to have an impact on NIC resource allocation and management strategies, including the length of the MPI posted receive and unexpected message queues, the number of entries in these queues that are examined for a typical operation, and the number of unexpected and expected messages.
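To make the measured quantities concrete, the following is a minimal illustrative sketch of MPI-style message matching with a posted receive queue and an unexpected message queue, instrumented to record queue lengths, entries examined per operation, and expected versus unexpected message counts. It is not the instrumentation used in the paper: the names `MatchQueues`, `Envelope`, `post_receive`, and `message_arrives` are hypothetical, the wildcard `ANY` stands in for MPI_ANY_SOURCE and MPI_ANY_TAG, and communicators are ignored for brevity.

```python
from collections import namedtuple

ANY = -1  # hypothetical stand-in for the MPI_ANY_SOURCE / MPI_ANY_TAG wildcards

# Simplified matching envelope (assumption: communicator/context is omitted).
Envelope = namedtuple("Envelope", ["source", "tag"])


def matches(recv, msg):
    """A posted receive matches a message if each field is equal or a wildcard."""
    return recv.source in (ANY, msg.source) and recv.tag in (ANY, msg.tag)


class MatchQueues:
    """Two linear queues searched in order, with simple usage counters."""

    def __init__(self):
        self.posted = []      # receives posted before their message arrived
        self.unexpected = []  # messages that arrived before a matching receive
        self.stats = {
            "expected": 0,             # messages that found a posted receive
            "unexpected": 0,           # messages that had to be queued
            "posted_examined": [],     # entries examined per incoming message
            "unexpected_examined": [], # entries examined per posted receive
            "max_posted_len": 0,
            "max_unexpected_len": 0,
        }

    def post_receive(self, source, tag):
        """Search the unexpected queue first; queue the receive if nothing matches."""
        recv = Envelope(source, tag)
        for i, msg in enumerate(self.unexpected):
            if matches(recv, msg):
                self.stats["unexpected_examined"].append(i + 1)
                del self.unexpected[i]
                return msg
        self.stats["unexpected_examined"].append(len(self.unexpected))
        self.posted.append(recv)
        self.stats["max_posted_len"] = max(
            self.stats["max_posted_len"], len(self.posted))
        return None

    def message_arrives(self, source, tag):
        """Search the posted queue; queue the message as unexpected if nothing matches."""
        msg = Envelope(source, tag)
        for i, recv in enumerate(self.posted):
            if matches(recv, msg):
                self.stats["posted_examined"].append(i + 1)
                del self.posted[i]
                self.stats["expected"] += 1
                return recv
        self.stats["posted_examined"].append(len(self.posted))
        self.unexpected.append(msg)
        self.stats["unexpected"] += 1
        self.stats["max_unexpected_len"] = max(
            self.stats["max_unexpected_len"], len(self.unexpected))
        return None


q = MatchQueues()
q.message_arrives(1, 7)            # no receive posted yet: queued as unexpected
q.post_receive(0, 5)               # no matching message: queued as posted
early = q.post_receive(ANY, 7)     # wildcard source matches the queued message
q.message_arrives(0, 5)            # matches the posted receive: expected message
```

In this sketch, both queues are searched linearly from the head, so the recorded search depths grow with queue length. That is the kind of cost that would matter for a slow NIC processor with limited memory, under the assumption that an offloaded implementation uses a similar list-based structure.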
Citation:
Ron Brightwell and Keith D. Underwood, "An Analysis of NIC Resource Usage for Offloading MPI," 18th International Parallel and Distributed Processing Symposium (IPDPS'04), Workshop 8, vol. 9, p. 183a, 2004.