This paper evaluates the behavior of a commercial off-the-shelf (COTS) database management system (DBMS) in the presence of transient faults. Database applications have traditionally had strong fault-tolerance requirements, concerning both data integrity and availability. While most commercially available DBMSs provide support for data recovery and fault tolerance, very little is known about the impact of transient faults on a COTS database system. This experimental study uses a strictly off-the-shelf target system (an Oracle 7.3 server running on a Wintel platform), combined with a TPC-A based workload and a software-implemented fault injection tool, XceptionNT. A non-negligible share of the injected faults (13%) led to the database server hanging or terminating prematurely. However, the results also show that COTS DBMS products behave reasonably with respect to data integrity: none of the injected faults affected end-user data.
Citation:
Diamantino Costa, Henrique Madeira, "Experimental Assessment of COTS DBMS Robustness under Transient Faults," Sixth Pacific Rim International Symposium on Dependable Computing (PRDC'99), p. 201, 1999