
Comp.os.research: Frequently answered questions [3/3: l/m 13 Aug 1996]
Section - [1.5.5] Fault tolerance



From: Distributed systems

Most DSM systems either ignore fault tolerance or maintain that it is
an operating system issue that should be handled by the underlying
system.  In practice, however, a DSM system is likely to strongly
affect the fault tolerance of the system as a whole.  For example,
when several sites share access to a set of data, the failure of any
one of them could lead to the failure of all the connected sites (or,
at least, of some of the processes on each site).  DSM also presents
an unusual failure handling problem: it is fairly easy to see how to
handle a failed message or RPC, but how do you handle a failed page
fault?
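
One possible answer, sketched here purely as an illustration, is to
turn an unsatisfiable page fault into an exception raised in the
faulting process.  Nothing below comes from a real DSM system: the
names handle_page_fault, PageUnavailable, owners and fetch are all
hypothetical, and fetch is assumed to raise ConnectionError or
TimeoutError when a site cannot be reached.

```python
class PageUnavailable(Exception):
    """Raised when no candidate site could supply the faulting page."""


def handle_page_fault(page_no, owners, fetch, retries=2):
    """Return the contents of page_no, trying each candidate owner in turn.

    fetch(site, page_no) is an assumed transport hook: it returns the page
    as bytes, or raises ConnectionError/TimeoutError if the site is down.
    """
    failures = []
    for site in owners:
        for attempt in range(retries):
            try:
                return fetch(site, page_no)
            except (ConnectionError, TimeoutError) as exc:
                # Remember the failure and move on to the next attempt.
                failures.append((site, attempt, repr(exc)))
    # Unlike a failed RPC, the fault came from an ordinary load or store,
    # so there is no return value to carry an error code.  This sketch
    # surfaces the failure as an exception in the faulting process.
    raise PageUnavailable(
        f"page {page_no}: {len(failures)} failed fetch attempts"
    )
```

Whether the exception should kill the process, abort an enclosing
transaction, or be delivered to a user-level handler is exactly the
design question raised above; the sketch does not settle it.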

The original Clouds system provided recoverability by shadowing
segments and by using a transactional system based on commits.  The
recovery system was not really integrated with the DSM system; it was
implemented only at the segment storage site.  More recently, in
order to maintain a consistent view of data when one transaction is
active at multiple nodes, the Clouds designers have been forced to
integrate the transaction system with the DSM support system.
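
The general idea of shadowing with commits can be sketched as follows.
This is not the Clouds implementation, whose details are not given
here; the class ShadowedSegment and its methods are hypothetical
names, and the sketch covers a single segment at a single storage
site, with no distribution or stable storage.

```python
class ShadowedSegment:
    """A segment whose updates go to a shadow copy until commit."""

    def __init__(self, size):
        self._committed = bytearray(size)  # last committed contents
        self._shadow = None                # working copy, if a transaction is active

    def begin(self):
        if self._shadow is not None:
            raise RuntimeError("transaction already active")
        # Copy the committed state; all writes go to the copy.
        self._shadow = bytearray(self._committed)

    def write(self, offset, data):
        if self._shadow is None:
            raise RuntimeError("no active transaction")
        if offset < 0 or offset + len(data) > len(self._shadow):
            raise ValueError("write outside segment")
        self._shadow[offset:offset + len(data)] = data

    def read(self, offset, length):
        # Inside a transaction, reads see the transaction's own writes.
        src = self._shadow if self._shadow is not None else self._committed
        return bytes(src[offset:offset + length])

    def commit(self):
        if self._shadow is None:
            raise RuntimeError("no active transaction")
        # Replace the committed version with the shadow in one step.
        self._committed, self._shadow = self._shadow, None

    def abort(self):
        # Discard the shadow; the committed version was never touched.
        self._shadow = None
```

Because the committed copy is never modified in place, a failure
before commit leaves it intact.  The sketch also shows why recovery
confined to the storage site is not enough once one transaction is
active at several nodes: each node would need to see the same shadow.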

Send corrections/additions to the FAQ Maintainer:
os-faq@cse.ucsc.edu





Last Update March 27 2014 @ 02:12 PM