        Real-time Scheduling with Fault-tolerance in Heterogeneous Distributed Systems
 
Xiao Qin, Zongfen Han, and Liping Pang
Huazhong University of Science and Technology, Wuhan, P.R. China
xqin@cse.unl.edu

Abstract: Due to the wide availability of high-speed network connectivity and powerful PCs, heterogeneous distributed systems are becoming powerful and cost-effective computing platforms. Heterogeneous systems are increasingly being used for the management and control of a variety of applications such as digital signal processing, high-definition television, medical imaging, seismic and weather prediction systems, and other computation- and data-intensive applications. A number of real-time fault-tolerant scheduling algorithms have been intensively studied in the literature. These algorithms, however, are devised for homogeneous distributed systems comprising identical processors. Scheduling plays an important role in achieving high performance in heterogeneous systems, which consist of processors with differing computational power. This paper presents two real-time scheduling algorithms, RTFTNO and RTFTRC, which aim to map tasks onto processors and order their execution in such a way that the real-time requirements of the tasks are satisfied and a minimum schedule length, when attainable, is obtained. Reliability cost, an important performance metric, is introduced into our study to measure the reliability of heterogeneous systems. The RTFTRC algorithm allocates each task to the processor that leads to the minimum reliability cost. This scheme is able to enhance the reliability of the system without any extra hardware. The RTFTNO algorithm, in contrast, does not take reliability cost into account. The simulation results indicate that, under the same workload, the reliability cost generated by RTFTRC is significantly less than that generated by RTFTNO. In addition, the results show that the schedule length generated by RTFTRC is shorter than that generated by RTFTNO and that RTFTRC therefore has a lower percentage of missing deadlines (PMD) than RTFTNO.
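The allocation rule stated for RTFTRC (assign each task to the processor that yields the minimum reliability cost) can be illustrated with a minimal Python sketch. The abstract does not define reliability cost, so the sketch assumes it is the product of a processor's failure rate and the task's execution time on that processor. The function name pick_processor, its parameters, and the simple deadline check are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def pick_processor(
    exec_times: Sequence[float],     # execution time of the task on each processor
    failure_rates: Sequence[float],  # assumed per-processor failure rate
    ready_times: Sequence[float],    # earliest time each processor is free
    deadline: float,                 # deadline of the task
) -> Optional[int]:
    """Return the index of the feasible processor with the lowest
    reliability cost, or None if no processor can meet the deadline.

    Assumption: reliability cost = failure rate * execution time.
    """
    best_index = None
    best_cost = float("inf")
    for j, (t, lam, ready) in enumerate(zip(exec_times, failure_rates, ready_times)):
        # Skip processors on which the task would finish after its deadline.
        if ready + t > deadline:
            continue
        cost = lam * t
        if cost < best_cost:
            best_index, best_cost = j, cost
    return best_index


if __name__ == "__main__":
    choice = pick_processor(
        exec_times=[4.0, 2.0, 3.0],
        failure_rates=[1e-3, 5e-3, 1e-3],
        ready_times=[0.0, 0.0, 10.0],
        deadline=8.0,
    )
    print(choice)
```

In this example the third processor has the lowest cost but cannot meet the deadline, so the first processor is chosen even though the second one is faster. A scheduler that ignores reliability cost, as the abstract says RTFTNO does, would need a different selection criterion, which the abstract does not specify.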

Key words: fault tolerance, real-time scheduling, heterogeneous distributed systems, simulation experiments, performance evaluation

Chinese Journal of Computers, Vol. 25, No. 1, January 2002.