Fault Localization and Recovery in Crossbar ATM Switches

Minseok OH

Publication
IEICE TRANSACTIONS on Communications   Vol.E88-B    No.7    pp.2908-2917
Publication Date: 2005/07/01
DOI: 10.1093/ietcom/e88-b.7.2908
Print ISSN: 0916-8516
Type of Manuscript: PAPER
Category: Network Management/Operation
Keyword: fault management, multichannel switches, crossbar switches




Summary: 
The multichannel switch is an architecture widely used for ATM (Asynchronous Transfer Mode). It is known that fault tolerance can be incorporated into the multichannel crossbar switching fabric. For example, if a link belonging to a multichannel group fails, the remaining links can take over some of the traffic of the failed link. A fault in a switching element, on the other hand, can lead to erroneous routing and sequencing in the multichannel switch. We investigate several fault localization algorithms for multichannel crossbar ATM switches with a view to early fault recovery. The optimal algorithm gives the best performance in terms of time to localization but is computationally complex, which makes it difficult to operate in real time. We develop an online algorithm that is computationally more efficient than the optimal one and evaluate its performance through simulation. The simulation results show that the performance of the online algorithm is only slightly suboptimal for both random and bursty traffic. There are cases in which the proposed online algorithm cannot pinpoint a single fault. We explain the causes and enumerate those cases. Finally, we describe a fault recovery algorithm that uses the information provided by the fault localization algorithm. The fault recovery algorithm adds extra rows and columns so that cells can detour around the faulty element.
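
The link-failure case mentioned in the summary can be illustrated with a minimal Python sketch that spreads cells over the healthy links of one multichannel group. This is not the paper's algorithm: the function name `assign_cells`, the round-robin policy, and the link and cell labels are all assumptions made purely for illustration.

```python
from itertools import cycle


def assign_cells(cells, links, failed=frozenset()):
    """Spread cells over the healthy links of one multichannel group.

    Round-robin is an assumed policy, chosen only to keep the sketch short.
    """
    healthy = [link for link in links if link not in failed]
    if not healthy:
        raise RuntimeError("no healthy link left in the group")
    # Pair each cell with the next healthy link, wrapping around the group.
    return {cell: link for cell, link in zip(cells, cycle(healthy))}


# Hypothetical group of three links carrying four cells.
group = ["link0", "link1", "link2"]
cells = ["c0", "c1", "c2", "c3"]

before = assign_cells(cells, group)
after = assign_cells(cells, group, failed={"link1"})
```

With `link1` marked as failed, the cells it would have carried are placed on `link0` and `link2`, so the group keeps forwarding traffic at reduced capacity. A faulty switching element is a different matter and is what the localization and recovery algorithms in the paper address.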