Entries |
Document | Title | Date |
20080201602 | Method and apparatus for transactional fault tolerance in a client-server system - A method and apparatus for transactional fault tolerance in a client-server system are described. In one example, output data generated by execution of a service on a primary server during a current epoch between a first checkpoint and a second checkpoint is buffered. A copy of an execution context of the primary server is established on a secondary server in response to the second checkpoint. The output data as buffered is released from the primary server in response to establishment of the copy of the execution context on the secondary server. | 08-21-2008 |
20080209258 | Disaster Recovery Architecture - A method and system for disaster recovery in a packet-based network. The network includes a production site and a recovery site coupled together by the packet-based network. Mirroring software on the production site keeps the recovery site up to date to the last transaction occurring on the production site. A recovery control server polls the production site in order to detect a disaster condition or other failure. Upon detection of a problem at the production site, the recovery control server reconfigures the network so that attempts to access the production site are routed to the recovery site. | 08-28-2008 |
20080215910 | High-Availability Networking with Intelligent Failover - Methods and systems for maintaining high-availability in a computer network using intelligent failover are presented. In a network switch running an OSI model layer-2 or higher protocol on its external links, the protocol state information is monitored to determine the failover status of the link, avoiding falsely identifying external link failures due to link flapping. One such protocol is the spanning tree protocol. Additionally, flexibility in failover is provided using configurable triggers to define external failure events. The triggers initiate a link drop of one or more internal links of the network switch in response to an external failure event. The link drops, in turn, initiate failover of an attached computing device to a redundant link through a network interface teaming/failover arrangement whereby the computing device switches to an alternative network interface accessing the network through a redundant path. Failover can be selective depending upon VLAN and trunking configurations. | 09-04-2008 |
20080229142 | SELF-SERVICE RECOVERY OF APPLICATION DATA - Self-service recovery of application data. A list of recoverable objects for the application is generated in response to the receipt of a request for an application recovery from a user. The list of recoverable objects for the application is sent to the user. A selected recoverable object from the user is received. In response, the execution of a recovery job on the backup and restore application is initiated for the selected recoverable object. | 09-18-2008 |
20080235533 | Fall over method through disk take over and computer system having failover function - When a primary server executing a task fails in a computer system where a plurality of servers are connected to an external disk device via a network and the servers boot an operating system from the external disk device, task processing is taken over from the primary server to a server that is not executing a task in accordance with the following method. The method for taking over a task includes the steps of detecting that the primary server fails; searching the computer system for a server that has the same hardware configuration as that of the primary server and that is not running a task; enabling the server, searched for as a result of the search, to access the external disk device; and booting the server from the external disk device. | 09-25-2008 |
20080244307 | Method to avoid continuous application failovers in a cluster - A method and mechanism for failing over applications in a clustered computing system is provided. In an embodiment, the methodology is implemented by a high-availability failover mechanism. Upon detecting a failure of an application that is currently designated to be executing on a particular node of the system, the mechanism may attempt to failover the application onto a different node. The mechanism keeps track of a number of nodes on which a failover of the application is attempted. Then, based on one or more factors including the number of nodes on which a failover of the application is attempted, the mechanism may cease to attempt to failover the application onto a node of the system. | 10-02-2008 |
20080250265 | SYSTEMS AND METHODS FOR PREDICTIVE FAILURE MANAGEMENT - A system and method for using continuous failure predictions for proactive failure management in distributed cluster systems includes a sampling subsystem configured to continuously monitor and collect operation states of different system components. An analysis subsystem is configured to build classification models to perform on-line failure predictions. A failure prevention subsystem is configured to take preventive actions on failing components based on failure warnings generated by the analysis subsystem. | 10-09-2008 |
20080250266 | LOGICAL PARTITIONING OF A PHYSICAL DEVICE - In one embodiment, an indication of a fault condition is received relating to a first service running on a physical device in a computer network. The first service is associated with a first virtual device context defined on the physical device. Then, the first service is disabled without affecting operation of a second service on the physical device. The second service is associated with a second virtual device context defined on the physical device. In another embodiment, a first virtual device context is created on a physical device in a computer network. Then, a second virtual device context is created on the physical device. The first virtual device context may then be managed independently of the second virtual device context such that resources assigned to a virtual device context are managed without affecting management of another virtual device context. | 10-09-2008 |
20080250267 | Method and system for coordinated multiple cluster failover - Hyperclusters are a cluster of clusters. Each cluster has associated with it one or more resource groups, and independent node failures within the clusters are handled by platform specific clustering software. The management of coordinated failovers across dependent or independent resources running on heterogeneous platforms is contemplated. A hypercluster manager running on all of the nodes in a cluster communicates with platform specific clustering software regarding any failure conditions, and utilizing a rule-based decision making system, determines actions to take on the node. A plug-in extends exit points definable in non-hypercluster clustering technologies. The failure notification is passed to other affected resource groups in the hypercluster. | 10-09-2008 |
20080256384 | Mechanism for Recovery from Site Failure in a Stream Processing System - A failure recovery framework to be used in cooperative data stream processing is provided that can be used in a large-scale stream data analysis environment. Failure recovery supports a plurality of independent distributed sites, each having its own local administration and goals. The distributed sites cooperate in an inter-site back-up mechanism to provide for system recovery from a variety of failures within the system. Failure recovery is both automatic and timely through cooperation among sites. Back-up sites associated with a given primary site are identified. These sites are used to identify failures within the primary site including failures of applications running on the nodes of the primary site. The failed applications are reinstated on one or more nodes within the back-up sites using job management instances local to the back-up sites in combination with previously stored state information and data values for the failed applications. In addition to inter-site mechanisms, each one of the plurality of sites employs an intra-site back-up mechanism to handle failure recoveries within the site. | 10-16-2008 |
20080256385 | OPTIMIZATION OF PORT LINK RECOVERY - Provided are techniques for determining a link speed. When a link between two computing devices is operational, a link speed for use in communicating across the link is stored and a remembered indicator is set to TRUE. After any event occurs that causes the link to become inoperative, in response to determining that the remembered indicator is TRUE, the stored link speed is used when attempting to make the link become operational. | 10-16-2008 |
20080263386 | DYNAMICALLY REROUTING NODE TRAFFIC ON A MASSIVELY PARALLEL COMPUTER SYSTEM USING HINT BITS - A method and apparatus for dynamically rerouting node processes on the compute nodes of a massively parallel computer system using hint bits to route around failed nodes or congested networks without restarting applications executing on the system. When a node has a failure or there are indications that it may fail, the application software on the system is suspended while the data on the failed node is moved to a backup node. The torus network traffic is routed around the failed node and traffic for the failed node is rerouted to the backup node. The application can then resume operation without restarting from the beginning. | 10-23-2008 |
20080263387 | FAULT RECOVERY ON A PARALLEL COMPUTER SYSTEM WITH A TORUS NETWORK - An apparatus and method for overcoming a torus network failure in a parallel computer system. A mesh routing mechanism in the service node of the computer system configures the nodes from a torus to a mesh network when a failure occurs in the torus network. The mesh routing mechanism takes advantage of cutoff registers in each node to route node to node data transfers around the faulty node or network connection. | 10-23-2008 |
20080263388 | METHOD AND APPARATUS FOR MANAGING CUSTOMER TOPOLOGIES - A method and apparatus for managing customer topologies on packet networks are disclosed. For example, the method creates at least two event correlation instances for at least one customer topology, where a first event correlation instance resides in a primary availability management server, and a second event correlation instance resides in a secondary availability management server. The method also creates a test node for the first event correlation instance, where the test node provides at least one test message. The method then receives at least one response generated by the first event correlation instance that is responsive to the at least one test message, where the at least one response is received by the second event correlation instance. The method then performs a fail-over to the second event correlation instance from the first event correlation instance if a failure is detected from the at least one response. | 10-23-2008 |
20080263389 | SYSTEM FOR MONITORING ENUM PERFORMANCE - A system for monitoring tElephone NUmber Mapping (ENUM) performance is disclosed. A system that incorporates teachings of the present disclosure may include, for example, an ENUM system having a subsystem to monitor one or more operations of the ENUM system, and generate a fault notice responsive to detecting one or more faults in the operations monitored. Additional embodiments are disclosed. | 10-23-2008 |
20080263390 | Cluster system and failover method for cluster system - Even when a large number of guest OSs exist, a failover method meeting the high availability needed by the guest OSs is provided for each guest OS. In the event of a physical or logical change of a system, or a change of operation states, smooth failover can be realized by preventing the consumption of resources by excessive failover methods, and the occurrence of system failure due to an inadequate failover method. In a server virtualization environment, in a cluster configuration having failover methods based on hot standby and cold standby, a suitable cluster configuration is realized by selecting a failover method meeting high availability requirements that specify performance during failover of applications on the guest OSs. Failure monitoring is realized by quantitative heartbeats. | 10-23-2008 |
20080270822 | DATA REPLICA SELECTOR - There is provided a method and system for replicating data at another location. The system includes a source node that contains data in a data storage area. The source node is coupled to a network of potential replication nodes. The processor determines at least two eligible nodes in the network of nodes and determines the communication cost associated with each of the eligible nodes. The processor also determines a probability of a concurrent failure of the source node and each of the eligible nodes, and selects at least one of the eligible nodes for replication of the data located on the source node. The selection is based on the determined communication costs and probability of concurrent failure. | 10-30-2008 |
20080276117 | END-TO-END TRANSACTIONAL PROTECTION FOR REQUESTS IN A WEB APPLICATION - Various embodiments of a system and method for processing a request in a distributed software application are disclosed. In response to a client request, one or more server computers may modify a plurality of different portions of state information. The system may operate to ensure that the portions of state information are all modified atomically. The system may also operate to provide transparent connection failover functionality for the network connection between the client computer and the one or more server computers. | 11-06-2008 |
20080276118 | AUTONOMICALLY ADJUSTING CONFIGURATION PARAMETERS FOR A SERVER WHEN A DIFFERENT SERVER FAILS - A load balancer detects a server failure, and sends a failure notification message to the remaining servers. In response, one or more of the remaining servers may autonomically adjust their configuration parameters, thereby allowing the remaining servers to better handle the increased load caused by the server failure. One or more of the servers may also include a performance measurement mechanism that measures performance before and after an autonomic adjustment of the configuration parameters to determine whether and how much the autonomic adjustments improved the system performance. In this manner server computer systems may autonomically compensate for the failure of another server computer system that was sharing the workload. | 11-06-2008 |
20080276119 | AUTONOMICALLY ADJUSTING CONFIGURATION PARAMETERS FOR A SERVER WHEN A DIFFERENT SERVER FAILS - A load balancer detects a server failure, and sends a failure notification message to the remaining servers. In response, one or more of the remaining servers may autonomically adjust their configuration parameters, thereby allowing the remaining servers to better handle the increased load caused by the server failure. One or more of the servers may also include a performance measurement mechanism that measures performance before and after an autonomic adjustment of the configuration parameters to determine whether and how much the autonomic adjustments improved the system performance. In this manner server computer systems may autonomically compensate for the failure of another server computer system that was sharing the workload. | 11-06-2008 |
20080276120 | Volume and failure management method on a network having a storage device - A SAN manager acquires configuration information from devices constituting a SAN and produces a corresponding relationship between a host computer and a virtual volume (virtual volume mapping) and a corresponding relationship between the host computer and a real volume (real volume mapping). Based on those pieces of mapping information, the SAN manager outputs a corresponding relationship between virtual and real volumes. Meanwhile, the failure notification messages received from the in-SAN devices are interpreted to detect and output an influence of the failure upon the access to a real or virtual volume. Furthermore, when receiving a plurality of failure notifications from the devices connected to the SAN, the plurality of failure notifications are outputted with an association based on the corresponding relationship between real and virtual volumes. | 11-06-2008 |
20080288811 | Multi-node replication systems, devices and methods - Replication techniques are presented. According to an embodiment of a method, a node of a replicated storage network is assigned to be an owner of a data block to issue write memory block commands. The network includes at least two nodes including the node assigned to be the owner. If a read memory block command is received to read the data block, a read_lock is issued for the data block, the data block is read, and the read_lock for the data block is released. If a write memory block command is received to write new data to the data block, a write_lock is issued for the data block, the data block is written and a version associated with the data block is incremented, and the write_lock for the data block is released. | 11-20-2008 |
20080288812 | CLUSTER SYSTEM AND AN ERROR RECOVERY METHOD THEREOF - A cluster system including: a transmission side server cluster consisting of a plurality of computers; a receiving side server cluster consisting of a plurality of computers; and a network that interconnects both the transmission side server cluster and the receiving side server cluster, wherein an active-transmission computer which is included in the transmission side server cluster selects a standby-transmission computer from the computers in the transmission side server cluster, based on load information, and transmits a backup-copy of a message to the standby-transmission computer when the active-transmission computer transmits the message to a computer in the receiving side server cluster, the standby-transmission computer handling the message as a backup in the event of a fault in the active-transmission computer. | 11-20-2008 |
20080301488 | INTELLIGENT CONFIGURATION FOR RESTARTING FAILED APPLICATION SERVER INSTANCES - An improved solution for intelligent configuration for restarting failed application server instances is provided. In an embodiment of the invention, a method for restarting a failed application server instance includes: receiving a notice of a failure of an application server instance; obtaining a cause of the failure; automatically applying at least one configuration change to the application server instance based on the cause; and recovering the application server instance. | 12-04-2008 |
20080301489 | MULTI-AGENT HOT-STANDBY SYSTEM AND FAILOVER METHOD FOR THE SAME - The present invention discloses a multi-agent hot-standby system and a failover method for the same, which utilize a plurality of cascaded standby servers to monitor and detect a plurality of application servers, wherein a standby server is parallel connected with all the application servers, and the cascaded standby servers monitor each other. When one application server malfunctions and sends an abnormal heartbeat signal to the standby server directly connected thereto, the standby server immediately replaces the malfunctioning application server. At the same time, another standby server cascaded to the original standby server immediately replaces the original standby server and takes over detecting and monitoring all the application servers. Thereby, the multi-agent hot-standby system and the failover method for the same of the present invention can keep the programs and tasks executed on the application servers free from interruption. Further, the present invention can enable a server system to tolerate more faults with fewer standby servers used. | 12-04-2008 |
20080301490 | QUORUM-BASED POWER-DOWN OF UNRESPONSIVE SERVERS IN A COMPUTER CLUSTER - A quorum-based server power-down mechanism allows a manager in a computer cluster to power-down unresponsive servers in a manner that assures that an unresponsive server does not become responsive again. In order for a manager in a cluster to power down servers in the cluster, the cluster must have quorum, meaning that a majority of the computers in the cluster must be responsive. If the cluster has quorum, and if the manager server did not fail, the manager causes the failed server(s) to be powered down. If the manager server did fail, the new manager causes all unresponsive servers in the cluster to be powered down. If the power-down is successful, the resources on the failed server(s) may be failed over to other servers in the cluster that were not powered down. If the power-down is not successful, the cluster is disabled. | 12-04-2008 |
20080301491 | QUORUM-BASED POWER-DOWN OF UNRESPONSIVE SERVERS IN A COMPUTER CLUSTER - A quorum-based server power-down mechanism allows a manager in a computer cluster to power-down unresponsive servers in a manner that assures that an unresponsive server does not become responsive again. In order for a manager in a cluster to power down servers in the cluster, the cluster must have quorum, meaning that a majority of the computers in the cluster must be responsive. If the cluster has quorum, and if the manager server did not fail, the manager causes the failed server(s) to be powered down. If the manager server did fail, the new manager causes all unresponsive servers in the cluster to be powered down. If the power-down is successful, the resources on the failed server(s) may be failed over to other servers in the cluster that were not powered down. If the power-down is not successful, the cluster is disabled. | 12-04-2008 |
20080307250 | MANAGING NETWORK ERRORS COMMUNICATED IN A MESSAGE TRANSACTION WITH ERROR INFORMATION USING A TROUBLESHOOTING AGENT - A method, system, and program for managing network errors communicated in a message transaction with error information using a troubleshooting agent. A network facilitates message transactions between a requester and a responder for facilitating web services. When a non-application specific error occurs in relation to a particular message transaction, such as a network error, a protocol layer assigns an error code and either the requester or responder encodes the error code in the body of an envelope added to the particular message transaction. In particular, the message transaction is an XML message with a Simple Object Access Protocol (SOAP) envelope encoded with the error code to which the XML message is then attached. The error encoded message transaction is forwarded to a troubleshooting agent. The troubleshooting agent facilitates resolution of the non-application specific error and returns a descriptive message indicating the resolution of the non-application specific error to at least one of the requester and the responder. | 12-11-2008 |
20080313491 | METHOD AND SYSTEM FOR PROVIDING CUSTOMER CONTROLLED NOTIFICATIONS IN A MANAGED NETWORK SERVICES SYSTEM - An approach for supporting automated fault isolation and recovery is provided. A notification configuration option is transmitted to a browser interface utilized by a user associated with a customer network that is monitored by a service provider, wherein the user selects the notification configuration option to input notification information. The notification information is received, via the browser interface, from the customer. A notification message is received from a platform configured to create a workflow event in response to an alarm indicative of a fault within the customer network, wherein isolation and recovery of the fault is performed according to the workflow event, the notification message including information about the customer network during the fault isolation and recovery process. The notification message is transmitted in accordance with the stored notification information. | 12-18-2008 |
20090006884 | AUTOMATICALLY MANAGING SYSTEM DOWNTIME IN A COMPUTER NETWORK - Embodiments are provided to automatically manage system downtime in a computer network. In one embodiment, an event is created in an application server to schedule a system downtime period for a web server. When the scheduled downtime occurs, the web server is automatically removed from the network and a downtime notification message is automatically communicated indicating that the web server is offline. In another embodiment, events may be created to schedule downtime for web-based applications, including websites. Prior to the scheduled downtime, requests to a web-based application may be automatically stopped and redirected to a specified location. In another embodiment, the operation of web servers is automatically monitored to detect the presence of a fault condition and, if a fault condition is present, then a determination may be made that the affected web servers are down and requests to the down web servers are automatically redirected to an alternate server. | 01-01-2009 |
20090006885 | Heartbeat distribution that facilitates recovery in the event of a server failure during a user dialog - An exemplary method facilitates automatic recovery upon failure of a server in a network responsible for replying to user requests. Periodic heartbeat information is generated by a first group of servers responsible for replying to user requests. The heartbeat information provides an indication of the current operational functionality of the first group of servers. A second group of servers determines that one of the first servers has failed based on the periodic heartbeat information. The second group of servers is disposed in communication channels between users and the first group of servers. One of the second group of servers receives a message containing a request from a first user having the one of the first group of servers as a destination. One of the second group of servers determines that the message is part of an ongoing dialog of messages between the first user and the one of the first group of servers. Stored dialog information contained in previous communications between the first user and the one of the first group of servers associated with the ongoing dialog is retrieved. Another message is transmitted from the one of the second group of servers to another of the first group of servers. That message includes the request contained in the original message and the retrieved dialog information. This enables the other server to process the request based on the retrieved dialog information without requiring the first user to retransmit previously transmitted information that was part of the dialog information. | 01-01-2009 |
20090013210 | Systems, devices, agents and methods for monitoring and automatic reboot and restoration of computers, local area networks, wireless access points, modems and other hardware - An embodiment of the invention is a client on a local area network that periodically and automatically evaluates its physical connectivity with the local area network, exercises local-network services such as DHCP, and verifies Internet connectivity and function by pinging one or more numerically specified IP addresses and by pinging one or more IP addresses specified by an FQDN (Fully Qualified Domain Name) known to the assigned DNS servers. An embodiment of the invention may include a plurality of client elements monitoring one or more networks. Functionality according to embodiments of the invention can send notices, automatically initiate action, and otherwise assist in, among other things, remote monitoring and administration of networks, and particularly wireless networks. | 01-08-2009 |
20090024868 | Business continuation policy for server consolidation environment - A method, computer program product and system that establishes and maintains a business continuity policy in a server consolidation environment. Business continuity is ensured by enabling high availability of applications. When an application is started, restarted upon failure, or moved due to an overload situation, a system is selected best fulfilling the requirements for running the application. These requirements can include application requirements, such as an amount of available capacity to handle the load that will be placed on the system by the application. These requirements can further include system requirements, such as honoring a system limit of a number of applications that can be run on a particular system. Respective priorities of applications can be used to determine whether a lower-priority application can be moved to free resources for running a higher-priority application. | 01-22-2009 |
20090024869 | Autonomous Takeover Destination Changing Method in a Failover - To realize optimum failover in NAS, this invention provides a computer system including: a first computer; a second computer; a third computer; and a storage device coupled to the plurality of computers via a network, in which: the first computer executes, upon reception of an access request to the storage device from a client computer coupled to the plurality of computers, the requested access; and transmits to the client computer a response to the access request; the second computer judges whether a failure has occurred in the first computer; obtains load information of the second computer; obtains load information of the third computer from the third computer; and transmits a change request to the third computer when the obtained load information satisfies a predetermined condition; and the third computer judges whether a failure has occurred in the first computer when the change request is received from the second computer. | 01-22-2009 |
20090031166 | WARM REBOOT ENABLED KERNEL DUMPER - In one embodiment, a method of a kernel dumper module includes generating a dump file associated with a kernel when the kernel crashes, storing the dump file to a functional memory upon applying an overwrite protection to a core dump of the dump file, restarting the kernel through a warm reboot of the kernel such that the core dump is not erased from the functional memory, and transferring the core dump to a system file using the kernel. | 01-29-2009 |
20090037763 | Systems and Methods for Providing IP Address Stickiness in an SSL VPN Session Failover Environment - The SSL VPN session failover solution of the appliance and/or client agent described herein provides an environment for handling IP address assignment and end point re-authorization upon failover. The appliances may be deployed to provide a session failover environment in which a second appliance is a backup to a first appliance when a failover condition is detected, such as failure in operation of the first appliance. The backup appliance takes over responsibility for SSL VPN sessions provided by the first appliance. In the failover environment, the first appliance propagates SSL VPN session information including user IP address assignment and end point authorization information to the backup appliance. The backup appliance maintains this information. Upon detection of failover of the first appliance, the backup appliance activates the transferred SSL VPN session and maintains the user assigned IP addresses. The backup appliance may also re-authorize the client for the transferred SSL VPN session. | 02-05-2009 |
20090049332 | Method and Apparatus for Expressing High Availability Cluster Demand Based on Probability of Breach - A method, apparatus, and computer instructions are provided for expressing high availability (H/A) cluster demand based on probability of breach. When a failover occurs in the H/A cluster, event messages are sent to a provisioning manager server. The mechanism of embodiments of the present invention filters the event messages and translates the events into probability of breach data. The mechanism then updates the data model of the provisioning manager server and makes a recommendation to the provisioning manager server as to whether reprovisioning of a new node should be performed. The provisioning manager server makes the decision and either reprovisions new nodes to the H/A cluster or notifies the administrator of the detected provisioning problem. | 02-19-2009 |
20090055679 | Recovery Of A Redundant Node Controller In A Computer System - Recovery of a redundant node controller in a computer system including determining a loss of a heartbeat for a predefined period of time between a system controller and the redundant node controller; in response to determining the loss of the heartbeat for the predefined period of time, checking network connectivity between the system controller and the redundant node controller; if there is network connectivity between the system controller and the redundant node controller, determining whether an application on the redundant node controller is running; and if an application on the redundant node controller is running, resetting the redundant node controller through a primary node controller. | 02-26-2009 |
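The recovery sequence in the abstract above (heartbeat lost for a predefined period, then a connectivity check, then an application check, then a reset through the primary node controller) can be sketched as a simple decision procedure. This is a minimal illustration, not the patented implementation; the function name, the string results, and the behavior when the application is not running are all assumptions:

```python
def recover_node(heartbeat_lost_secs, timeout, has_connectivity, app_running):
    """Decide the recovery action for a redundant node controller.

    Mirrors the sequence in the abstract: only after the heartbeat has
    been lost for the full predefined period is the network probed,
    then the application, and finally the node is reset through the
    primary node controller.
    """
    if heartbeat_lost_secs < timeout:
        return "wait"                 # heartbeat gap not yet conclusive
    if not has_connectivity:
        return "repair-network"       # network fault, not a hung node
    if not app_running:
        return "restart-application"  # assumed action; abstract is silent here
    return "reset-via-primary"        # app runs yet heartbeat is silent
```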
20090063892 | PROPOGATION BY A CONTROLLER OF RESERVATION MADE BY A HOST FOR REMOTE STORAGE - Provided are a method, system, and article of manufacture wherein a primary controller receives a request from a primary host to set reservations on a primary storage and a secondary storage, wherein the primary host, the primary controller and the primary storage are at a first site, and wherein a secondary host, a secondary controller, and the secondary storage are at a second site. The primary controller sets a first reservation on the secondary storage via a storage area network coupling the secondary storage to the primary controller, wherein the setting of the first reservation causes the secondary storage to be read only for a secondary host. The primary controller sets a second reservation on the primary storage, wherein the setting of the second reservation allows the primary host to perform read and write operations on the primary storage. | 03-05-2009 |
20090063893 | REDUNDANT APPLICATION NETWORK APPLIANCES USING A LOW LATENCY LOSSLESS INTERCONNECT LINK - Redundant application network appliances using a low latency lossless interconnect link are described herein. According to one embodiment, in response to receiving at a first network element a packet of a network transaction from a client over a first network for accessing a server of a datacenter, a layer 2 network process is performed on the packet and a data stream is generated. The data stream is then replicated to a second network element via a layer 2 interconnect link to enable the second network element to perform higher layer processes on the data stream to obtain connection states of the network transaction. In response to a failure of the first network element, the second network element is configured to take over processes of the network transaction from the first network element using the obtained connection states without user interaction of the client. Other methods and apparatuses are also described. | 03-05-2009 |
20090070623 | FAIL-OVER CLUSTER WITH LOAD-BALANCING CAPABILITY - A solution for distributing the workload across the servers of the cluster. | 03-12-2009 |
20090077412 | Administering A System Dump On A Redundant Node Controller In A Computer System - Administering a system dump on a redundant node controller including detecting a communications failure between a system controller and the redundant node controller; generating a unique identifier for the communications failure; instructing a primary node controller to provoke a system dump on the redundant node controller; provoking the system dump on the redundant node controller including suspending a processor of the redundant node controller and storing during the suspension of the processor the unique identifier for the communications failure and an instruction to execute the system dump on the redundant node controller; releasing the processor of the redundant node controller from suspension; in response to releasing the processor from suspension, identifying the unique identifier for the communications failure and the instruction to execute the system dump; and executing the system dump including associating the system dump with the unique identifier. | 03-19-2009 |
20090077413 | APPARATUS, SYSTEM, AND METHOD FOR SERVER FAILOVER TO STANDBY SERVER DURING BROADCAST STORM OR DENIAL-OF-SERVICE ATTACK - An apparatus, system, and method are disclosed to failover to a standby server when a primary server is under broadcast storm or denial-of-service (“DoS”) attack. A primary attack sensing module is included to monitor a rate of incoming data from a computer network to a primary server and to determine if the rate of incoming data is above a primary data rate threshold. A standby contact module is included to request a standby data rate status from a standby server in response to the primary attack module determining that the rate of incoming data to the primary server is above the primary data rate threshold. The standby server is connected to the primary server over a private network. The standby data rate status includes a determination by the standby server of whether a rate of data received by the standby server is above a standby data rate threshold. A standby receiver module is included to receive a standby data rate status from the standby server over the private network. A switchover module is included to deactivate the primary server and to send a command to activate the standby server as a primary server in response to the received standby data rate status indicating that the rate of data received by the standby server has not exceeded the standby data rate threshold. | 03-19-2009 |
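The switchover rule in the abstract above reduces to a two-sided threshold test: fail over only when the primary's incoming data rate exceeds its threshold while the standby, reached over the private network, reports a rate that has not exceeded its own threshold. A minimal sketch, with illustrative names not taken from the patent:

```python
def should_switch_over(primary_rate, primary_threshold,
                       standby_rate, standby_threshold):
    """Return True when the primary should fail over to the standby.

    Failover only makes sense when the primary is flooded (likely a
    broadcast storm or DoS attack) while the standby is NOT seeing
    the same flood; otherwise switching would not help.
    """
    under_attack = primary_rate > primary_threshold
    standby_clear = standby_rate <= standby_threshold
    return under_attack and standby_clear
```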
20090077414 | APPARATUS AND PROGRAM STORAGE DEVICE FOR PROVIDING TRIAD COPY OF STORAGE DATA - An apparatus and program storage device for maintaining data is provided that includes receiving primary data at a first node, receiving mirrored data from a second and third node at the first node, and mirroring data received at the first node to a second and third node. | 03-19-2009 |
20090089609 | CLUSTER SYSTEM WHEREIN FAILOVER RESET SIGNALS ARE SENT FROM NODES ACCORDING TO THEIR PRIORITY - A failover method for a cluster computer system in which a plurality of computers sharing a resource are connected by a heartbeat path for providing each computer with lines for monitoring operations of the other computers and a reset path. Resetting may be conducted based upon a registered priority for resetting the computers. | 04-02-2009 |
20090100289 | Method and System for Handling Failover in a Distributed Environment that Uses Session Affinity - In response to detecting a failed server, subscription message processing of a failover server is stopped. A subscription queue of the failed server is opened. A marker message is published to all subscribers of a particular messaging topic. The marker message includes an identification of the failover server managing the subscription queue of the failed server. Messages within the subscription queue of the failed server are processed. In response to determining that a message in the subscription queue of the failed server is the marker message, the subscription queue of the failed server is closed. Then, the failover server resumes processing of its original subscription queue looking for the marker message, while processing yet unseen messages from the queue. Once the marker message is found in the original subscription queue, normal operation is resumed. | 04-16-2009 |
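The marker-message technique above gives the failover server a clean stopping point: it publishes a marker to the topic, then drains the failed server's subscription queue until it encounters its own marker, at which point every message enqueued before the takeover has been handled. A hedged sketch of that drain loop, with hypothetical message and handler shapes:

```python
def drain_failed_queue(failed_queue, marker_id, handle):
    """Process a failed server's subscription queue up to the marker.

    `failed_queue` is assumed to be an iterable of dict messages and
    `handle` the per-message processing callback; seeing the marker
    means the queue can safely be closed.
    """
    for msg in failed_queue:
        if msg.get("marker") == marker_id:
            return "marker-found"      # safe to close this queue
        handle(msg)                    # ordinary subscription message
    return "queue-exhausted"           # marker not yet delivered
```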
20090106579 | STORAGE SWITCH SYSTEM, STORAGE SWITCH METHOD, MANAGEMENT SERVER, MANAGEMENT METHOD, AND MANAGEMENT PROGRAM - A switch control system including a storage unit, a switch which logically sets a network topology between the storage unit and a plurality of computers, and a management server which communicates with the switch and the storage unit, wherein the storage unit includes at least one disk; wherein the management server comprises a memory and a processor, wherein the memory holds the network topology which is set by the switch, wherein when a failure is detected in one of the computers currently being used, the processor of the management server refers to the memory to change the network topology for the computer where the failure is detected and another computer which substitutes for the computer where the failure is detected, and instructs the switch with the changed network topology so as to cause the switch to logically set the changed network topology, and wherein the management server controls the disk of the computer where the failure is detected to be accessible. | 04-23-2009 |
20090113233 | Testing Disaster Recovery Elements - Testing disaster recovery elements can be performed by configuring a disaster recovery site with network addresses to disaster recovery elements at an application layer. End-to-end operation of the disaster recovery site is verified using the network addresses at the application layer. The disaster recovery site is verified while an associated production site is operating. | 04-30-2009 |
20090113234 | VIRTUALIZATION SWITCH AND METHOD FOR CONTROLLING A VIRTUALIZATION SWITCH - A virtualization switch includes: first communication line connection terminals which can be connected to a host computer, one or more physical storage apparatuses, and another virtualization switch; a second communication line connection terminal which can be connected to one or more line concentrators connected to a manager computer by a second communication line; a storage virtualization unit; a first communication unit which can communicate with the other virtualization switch through a first communication line; a second communication unit which can communicate with the other virtualization switch through the second communication line; a second communication line monitor unit which tests communication between the virtualization switches; and an abnormal state coping unit which executes a closing process and causes the first communication unit to output a failover designation instruction. | 04-30-2009 |
20090119536 | INTELLIGENT DISASTER RECOVERY FOR DATABASE CONNECTION FAILURES - Embodiments of the invention provide techniques for disaster recovery in the event of a database connection failure. In one embodiment, a network address for a secondary server may be stored in multiple data objects of a client computer. In the event of a failed connection to a primary server, the network address of the secondary server may be retrieved from one of the data objects stored in the client computer. When an updated network address for the secondary server is received, it may be propagated to the data objects of the client computer. | 05-07-2009 |
20090138751 | DE-CENTRALIZED NODAL FAILOVER HANDLING - Embodiments of the present invention provide a method, system and computer program product for de-centralized nodal failover handling in a high availability computing architecture. The system can include multiple different nodes coupled to one another in a cluster over a computer communications network including an initial lead node and remaining auxiliary nodes. The system further can include a messaging service coupled to each of the nodes and nodal failover handling logic coupled to each of the nodes and to the messaging service. The logic can include program code enabled to periodically receive heartbeat messages from the messaging service for the initial lead node and to subsequently detect a lapse in the heartbeat messages, to post within a message to the messaging service a request to become a replacement lead node in response to detecting the lapse in the heartbeat messages, and to periodically post heartbeat messages to the messaging service as the replacement lead node for the initial lead node. | 05-28-2009 |
20090138752 | Systems and methods of high availability cluster environment failover protection - A transparent high-availability solution utilizing virtualization technology is presented. A cluster environment and management thereof is implemented through an automated installation and setup procedure resulting in a cluster acting as a single system. The cluster is setup in an isolated virtual machine on each of a number of physical nodes of the system. Customer applications are run within separate application virtual machines on one physical node at a time and are run independently and unaware of their configuration as part of a high-availability cluster. Upon detection of a failure, traffic is rerouted through a redundant node and the application virtual machines are migrated from the failing node to another node using live migration techniques. | 05-28-2009 |
20090138753 | Server switching method and server system equipped therewith - There is disclosed a high speed switching method for a disk image delivery system fail-over. A management server sends a disk image of an active server in advance to a standby server. When receiving a report that the active server has failed, the management server judges whether or not it is possible for the standby server to perform the service of the failed active server based on service provision management server information held by the management server and, if possible, instructs the standby server to perform the service of the active server. Even if the disk image delivered in advance is different from the disk image of the failed active server, switching of the service to the standby server can be performed more quickly through the management server resetting the setting values of unique information and installing the additional pieces of software on the standby server than through redelivering an appropriate disk image. | 05-28-2009 |
20090144580 | Data Transfer Controlling Method, Content Transfer Controlling Method, Content Processing Information Acquisition Method And Content Transfer System - A method of controlling data transfer, a method of controlling content transfer, a method of obtaining content processing information, and a system for transferring content are provided. The method of controlling data transfer in a data interoperable environment includes: receiving a request for transmitting data from a client; gathering information on entities which are to participate in transmitting data; forming a chain including at least two entities by using the gathered information on the entities; transmitting a plurality of data through the chain; and receiving an event message for representing a transmission status of the data transmitted from at least one of the entities included in the chain. Accordingly, it is possible to control a transmission of the data so that the plurality of data can be transmitted through a single session and to receive the transmission status of the data as an event message. | 06-04-2009 |
20090144581 | Data Transfer Controlling Method, Content Transfer Controlling Method, Content Processing Information Acquisition Method And Content Transfer System - A method of controlling data transfer, a method of controlling content transfer, a method of obtaining content processing information, and a system for transferring content are provided. The method of controlling data transfer in a data interoperable environment includes: receiving a request for transmitting data from a client; gathering information on entities which are to participate in transmitting data; forming a chain including at least two entities by using the gathered information on the entities; transmitting a plurality of data through the chain; and receiving an event message for representing a transmission status of the data transmitted from at least one of the entities included in the chain. Accordingly, it is possible to control a transmission of the data so that the plurality of data can be transmitted through a single session and to receive the transmission status of the data as an event message. | 06-04-2009 |
20090150716 | METHOD FOR MONITORING AND MANAGING A CLIENT DEVICE IN A DISTRIBUTED AUTONOMIC COMPUTING ENVIRONMENT - A state of a managed client device in a distributed autonomic computing environment is attached to an event occurring on the managed client device. The event is sent, with the attached state of the managed client device, to a server. The state of the managed client device is saved at the server. The event is analyzed for identifying a problem at the client device. An action for solving the problem is generated based on a state of the managed client device at the time the event is analyzed. An execution condition is dynamically generated based on the saved state of the managed client device. The execution condition is added to the action to be executed and sent to the managed client device. At the managed client device, a determination is made whether to execute the action based on the execution condition and a current state of the managed client device. | 06-11-2009 |
20090150717 | AVAILABILITY PREDICTION METHOD FOR HIGH AVAILABILITY CLUSTER - Provided is an availability prediction method for a high availability cluster. The method includes calculating a basic survival probability that the other node survives until a failure on one of the two nodes constituting a cluster is fixed, and determining an optimal number of nodes meeting a preset reference availability probability by calculating an availability probability for a predetermined range of the number of nodes on the basis of the basic survival probability. The method determines the number of nodes in the high availability cluster so as to match a reference availability probability, and is able to accomplish an optimal configuration of a cluster by calculating the availability probabilities for combinations between active and passive nodes and between head nodes and switches. | 06-11-2009 |
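The sizing step described above can be illustrated with a deliberately simplified model. Assume (this model is the author of this note's assumption, not the patent's formula) that a breach occurs only if every one of the n-1 backup nodes fails during the repair window, so availability(n) = 1 - (1 - basic_survival)^(n-1); the optimal cluster size is then the smallest n meeting the reference availability:

```python
def optimal_node_count(basic_survival, target, max_nodes=64):
    """Smallest cluster size whose modeled availability meets the target.

    basic_survival: probability one backup node survives until a
    failure on the active node is fixed.
    target: preset reference availability probability.
    """
    for n in range(2, max_nodes + 1):
        availability = 1 - (1 - basic_survival) ** (n - 1)
        if availability >= target:
            return n
    return None  # target unreachable within max_nodes
```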
20090150718 | LARGE-SCALE CLUSTER MONITORING SYSTEM, AND METHOD OF AUTOMATICALLY BUILDING/RESTORING THE SAME - Provided are a large-scale cluster monitoring system and a method for automatically building/restoring the same, which can automatically build a large-scale monitoring system and can automatically build a monitoring environment when a failure occurs in nodes. The large-scale cluster monitoring system includes a CM server, a DB server, GM nodes, NA nodes, and a DB agent. The CM server manages nodes in a large-scale cluster system. The DB server stores monitoring information that is state information of nodes in groups. The GM nodes respectively collect the monitoring information that is the state information of the nodes in the corresponding groups to store the collected monitoring information in the DB server. The NA nodes access the CM server to obtain GM node information and respectively collect the state information of the nodes in the corresponding groups to transfer the collected state information to the corresponding GM nodes. The DB agent monitors the monitoring information of the nodes in the groups, which is stored in the DB server, to detect a possible node failure. | 06-11-2009 |
20090150719 | AUTOMATICALLY FREEZING FUNCTIONALITY OF A COMPUTING ENTITY RESPONSIVE TO AN ERROR - Facilitating error handling of computing environments, including those environments having file systems. Responsive to an entity of the computing environment, such as a client of a file system, obtaining at least an indication of an error, a portion of functionality of the entity is automatically frozen. The obtaining is, for instance, responsive to an event of another entity of the computing environment, such as a server of the file system. Eventually, the frozen functionality is thawed allowing the functionality to proceed. | 06-11-2009 |
20090158082 | FAILOVER IN A HOST CONCURRENTLY SUPPORTING MULTIPLE VIRTUAL IP ADDRESSES ACROSS MULTIPLE ADAPTERS - A host enables any adapter of multiple adapters of the host to concurrently support any VIPA of the multiple VIPAs assigned to the host. Responsive to a failure of at least one particular adapter from among the multiple adapters, the host triggers the remaining, functioning adapters to broadcast a separate hardware address update for each VIPA over the network, such that for a failover in the host supporting the multiple VIPAs the host directs at least one other host accessible via the network to address any new packets for the multiple VIPAs to one of the separate hardware addresses of one of the remaining adapters. | 06-18-2009 |
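The failover behavior above amounts to re-announcing every VIPA from each surviving adapter (in practice this is typically done with gratuitous ARP, though the abstract only says "hardware address update"). A small sketch of the fan-out of announcements to broadcast; the adapter and address names are purely illustrative:

```python
def failover_updates(adapters, vipas, failed):
    """List the (adapter, vipa) hardware-address updates to broadcast.

    When some adapters fail, every VIPA is re-announced from each
    surviving adapter so that peer hosts address new packets for the
    VIPAs to one of the remaining adapters' hardware addresses.
    """
    alive = [a for a in adapters if a not in failed]
    return [(a, v) for v in vipas for a in alive]
```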
20090158083 | CLUSTER SYSTEM AND METHOD FOR OPERATING THE SAME - Provided are a cluster system, which makes general nodes appear as if they provide seamless services without failure when seen from the outside, and a method for operating the cluster system. The cluster system for operating individual nodes in a distributed management manner includes a board server having a task board registered with a task list, an agent server for managing the task board, and a plurality of general server nodes for performing a corresponding task on the basis of the task list, among which a failed general server node is replaced with another normal general server node. | 06-18-2009 |
20090164835 | Method and system for survival of data plane through a total control plane failure - A system and method for retaining routes in a control plane learned by an inter-domain routing protocol in the event of a connectivity failure between routers. Routers are classified as either route reflectors or originators. A determination is made whether the connectivity failure occurred between a route reflector and an originator, two originators, or two route reflectors. A determination is then made whether to propagate a withdrawal of learned routes based on whether the connectivity failure occurred between a route reflector and an originator, two originators, or two route reflectors. A withdrawal of learned routes is propagated to neighboring routers if the connectivity failure occurred between two originators, or between a route reflector and an originator that is inaccessible via an intra-domain routing protocol. No withdrawal of learned routes is propagated if the connectivity failure occurred between two route reflectors, or between a route reflector and an originator that is accessible via an intra-domain routing protocol. | 06-25-2009 |
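The withdrawal decision above is a small rule table over the classification of the two routers involved in the connectivity failure. A hedged sketch (function name and string labels are illustrative, not from the patent):

```python
def propagate_withdrawal(kind_a, kind_b, igp_reachable=True):
    """Decide whether learned routes must be withdrawn after a
    connectivity failure between two routers.

    Encodes the abstract's rules: withdraw for originator/originator
    failures; never withdraw for reflector/reflector failures; for a
    reflector/originator failure, withdraw only when the originator
    is unreachable via the intra-domain routing protocol (IGP).
    """
    pair = {kind_a, kind_b}
    if pair == {"originator"}:
        return True                  # two originators: withdraw
    if pair == {"reflector"}:
        return False                 # two reflectors: keep routes
    return not igp_reachable         # mixed pair: depends on IGP reach
```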
20090172463 | METHOD, SYSTEM AND MACHINE ACCESSIBLE MEDIUM OF A RECONNECT MECHANISM IN A DISTRIBUTED SYSTEM (CLUSTER-WIDE RECONNECT MECHANISM) - A method, system and machine accessible medium for validating a plurality of connections to a backend in a distributed system. A connection request requiring access to a backend is processed at a first node of a distributed system. The access to the backend enabled through a connection from a plurality of connections on the first node. The plurality of connections on the first node is validated in response to a connection request failure. A plurality of connections on a second node is validated in response to the connection request failure. | 07-02-2009 |
20090177913 | Systems and Methods for Automated Data Anomaly Correction in a Computer Network - Systems and methods for correcting an anomaly in a target computer that is part of a network of computers. An anomaly is detected in data stored on a target computer and it is determined what corrective data is needed to correct the anomaly. A donor computer with the corrective data is located and requested to provide the corrective data to the target computer. The corrective data is used to correct the anomaly on the target computer and the target computer may acknowledge receipt of the corrective data. In one embodiment, an arbitrator component receives the requests for the corrective data, passes the requests to potential donor computers, and receives the acknowledgements from the target computers. | 07-09-2009 |
20090177914 | Clustering Infrastructure System and Method - A system and method for configuring a cluster of computer nodes to save and restore state in the cluster in the event of node failures. The system and method are implemented through an application programming interface that includes a membership application, a locks application and a dataspace application. The membership application maintains a set of nodes in the cluster. The lock application provides a means for service applications running on the nodes to synchronize access to dataspaces. The dataspaces provide a cluster-wide shared regions in the memory of the cluster members. The API is configured to monitor the cluster members and to coordinate reallocation of a service application if a node running the service application fails. | 07-09-2009 |
20090177915 | Fault Tolerant Symmetric Multi-Computing System - A system enabled for fault-tolerant symmetric multi-computing using a group of nodes is described herein. A symmetrical group of nodes networked using a reliable, ordered, and atomic group-to-group TCP communication system is used in providing fault-tolerance and a single system image to client applications. The communication between the client and the group is standards based. The processing load is shared among a group of nodes with transparent distribution of tasks to application segments. The system is fault-tolerant in that if a node fails, the remaining replicas, if any, continue service without disruption of service or connection. Nodes may be added to or retired from the group in a manner transparent to the client as well as server applications. | 07-09-2009 |
20090183023 | METHOD AND APPARATUS FOR TIME-BASED EVENT CORRELATION - A method and apparatus for fault analysis and fault isolation in a system of networked processors by using a central event correlation function and logical fault signature to provide for fault isolation of failed processing elements is presented. This central event correlation method uses asynchronous events from multiple input sources of same and different technologies and time-based fault correlation and ageing to match unique fault signatures and determine levels of fault recovery escalation over time. This mechanism uses an event driven recovery table to recognize a unique fault signature, count and age faults, provide fault threshold based recovery and generate events as needed to drive recovery escalation. | 07-16-2009 |
20090183024 | System Management Infrastructure for Corrective Actions to Servers with Shared Resources - A corrective action method or subsystem for providing corrective actions in a computing domain shared among multiple customers, wherein different domain resources are shared by different customers, and each customer's corrective action preferences are accommodated differently according to a repository of customer preferences. A database may be queried when a fault event or out-of-limits condition is detected for a given shared resource to determine which customers share the resource, determine each affected customer's response preferences, and to perform corrective actions according to those response preferences. For example, three customers may share a particular hard drive in a shared computing system. One customer may prefer to receive an email notice when the drive is nearly full, another may prefer to receive additional allocation of disk space elsewhere, and the third may prefer to receive a written report of space utilization. | 07-16-2009 |
20090193288 | ROUTING TOKEN TRANSFER AND RECOVERY PROTOCOL IN RENDEZVOUS FEDERATION - Systems and methods that provide for assignment and recovery of tokens as part of a plurality of nodes and distributed application framework/network. The assignment component assigns numbers and tasks to candidates and facilitates multiple leader election. Moreover, a recovery component can recover a token for a node that leaves the network (e.g., crashes). Such recovery component ensures consistency, wherein only one server is assigned recovery of the token and associated tasks. | 07-30-2009 |
20090199040 | METHOD AND DEVICE FOR IMPLEMENTING LINK PASS THROUGH IN POINT-TO-MULTIPOINT NETWORK - Methods and devices for implementing Link Pass Through in a point-to-multipoint network in respect of the network reliability field are provided. Embodiments of the present invention are applicable to a network having an access gate, an access device, an aggregation device and a router. When a failure occurs in an active link between the access device and the aggregation device or between the aggregation device and the router, the access device breaks the connection between the access device and the access gate, and the access gate enables a standby link to conduct communication. Advantageously, when a failure occurs in the active link between the access device and the aggregation device or between the aggregation device and the router, the access device may break the connection between the access device and the access gate and the access gate may enable a standby link to conduct communication. Therefore, no matter what type of failure occurs, the embodiments of the present invention may enable a standby link to conduct communication, ensuring thereby communication reliability. | 08-06-2009 |
20090217079 | METHOD AND APPARATUS FOR REPAIRING MULTI-CONTROLLER SYSTEM - A method and apparatus for repairing a multi-controller system is provided. The method includes: starting network boot, by a controller whose system boot fails, to download repair files from a controller whose operation is normal; and repairing, by the controller whose system boot fails, its own system, based on the repair files. The apparatus includes at least two controllers, and a network boot unit coupled to the at least two controllers, configured to start network boot by a controller whose system boot fails. Each controller includes a detection unit, a local boot unit, a repair file downloading unit, and a repairing unit. According to embodiments of the invention, after system boot fails on any controller, system repair may be performed automatically by downloading system files from another controller through network boot. | 08-27-2009 |
20090217080 | DISTRIBUTED FAULT TOLERANT ARCHITECTURE FOR A HEALTHCARE COMMUNICATION SYSTEM - A healthcare communication system includes a first plurality of computer devices, such as patient stations, staff stations, and a master station, that are operable as a nurse call system. The first plurality of computer devices may have core nurse call functionality residing on an embedded computing platform. At least one of the first plurality of computer devices may have a graphical display screen. A second plurality of computer devices may be operable to provide the first plurality of computer devices with additional functionality via software plug-ins that are transmitted to the first plurality of computer devices. The first plurality of computer devices may be interconnected logically and/or physically in a tiered architecture arrangement to provide fault isolation among the tiers so that faults occurring in computer devices of one tier don't affect the operability of computer devices in other tiers and so that faults occurring in any of the second plurality of computer devices don't affect the core nurse call functionality of the first plurality of computer devices. | 08-27-2009 |
20090217081 | System for providing an alternative communication path in a SAS cluster - The present invention is a system and method for supporting an alternative peer-to-peer communication path over a network in a SAS cluster when a node cannot communicate with another node through a normal I/O bus (Serial SCSI bus). At startup, the driver may establish the alternative path for communication but may not use it as long as there is an I/O path available. In the present invention, two types of P2P calls, such as event notification calls and cluster operation calls, may be supported. | 08-27-2009 |
20090217082 | Method of Achieving High Reliability of Network Boot Computer System - In a network computer system, recovery may be impossible from a fault when the fault occurs in a network switch in a network or a device such as an external disk device. Provided is a computer system that includes a plurality of servers, a plurality of networks, a plurality of external disk devices, and a management computer, in which the management computer detects a fault which has occurred, retrieves an application stop server inaccessible to the used disk due to the fault, retrieves the disk for storing the same contents as contents stored in the disk used by the retrieved application stop server and the external disk device including the disk, retrieves an application resuming server capable of accessing the retrieved external disk device, and transmits an instruction to boot by using the retrieved disk to the retrieved application resuming server. | 08-27-2009 |
20090217083 | FAILOVER METHOD THROUGH DISK TAKEOVER AND COMPUTER SYSTEM HAVING FAILOVER FUNCTION - When a primary server executing a task fails in a computer system where a plurality of servers are connected to an external disk device via a network and the servers boot an operation system from the external disk device, task processing is taken over from the primary server to a server that is not executing a task in accordance with the following method. The method for taking over a task includes the steps of detecting that the primary server fails; searching the computer system for a server that has the same hardware configuration as that of the primary server and that is not running a task; enabling the server, searched for as a result of the search, to access the external disk device; and booting the server from the external disk device. | 08-27-2009 |
20090222687 | METHOD AND SYSTEM FOR TELECOMMUNICATION APPARATUS FAST FAULT NOTIFICATION - A method and system for containing a fault in a network node. A loss of all remaining communication links from a node is detected. A time duration from the loss of a first remaining communication link to the loss of a last remaining communication link is determined. It is established that the node has contained a fault when the time duration for the loss of the first remaining communication link to the loss of the last remaining communication link is not more than a predetermined amount of time. | 09-03-2009 |
20090235111 | DATA TRANSMISSION DEVICE, AND METHOD AND COMPUTER READABLE MEDIUM THEREFOR - A data transmission device, configured to be connected with a plurality of terminal devices via a network, includes a data transmission unit transmitting data sequentially to the terminal devices, a determining unit determining a transmission method for transmitting the data to each of the terminal devices, and a controller configured to, when the data transmission unit fails to transmit the data to a first one of the terminal devices in a first transmission method, control the determining unit to determine a second transmission method for retrying to transmit the data to the first terminal device, and control the data transmission unit to retry to transmit the data to the first terminal device in the second transmission method after the data transmission unit tries to transmit the data to those terminal devices, other than the first terminal device, to which the data has not been successfully transmitted. | 09-17-2009 |
20090240974 | Data replication method - Provided is a data replication method capable of reducing the number of communications required when a processing result of an active system is replicated to a standby system. The data replication method, in which a first computer receives a first message containing a first processing request, and a plurality of second computers replicates the first message, includes the steps of: sending, by a third computer, the first message to the first computer and the second computers; sending, by each of the second computers, a message receive notification of the first message to the first computer; sending, by the first computer, after reception of the message receive notification from the second computers, the message receive notification of the first message to the third computer; and sending, by the first computer, a notification indicating that the first processing request becomes executable by the first computer to the second computers. | 09-24-2009 |
20090249114 | COMPUTER SYSTEM - The computer system is capable of improving performance, reliability and redundancy. The computer system comprises: a plurality of server computers having different functions, the server computers being mutually connected by communication lines; a standby server computer being connected to each of the server computers by the communication lines, the standby server computer being capable of performing the function of each of the server computers; a detection unit for detecting an abnormal state of each of the server computers; and a take-over unit for controlling the standby server computer to take over the action of the abnormal server computer when the abnormal state of the abnormal server computer is detected by the detection unit. | 10-01-2009 |
20090249115 | METHOD AND SYSTEM FOR DYNAMIC LINK FAILOVER MANAGEMENT - The present invention is directed to a method and system for providing redundancy and resiliency features to network devices, such as switches and routers, that do not have built-in redundancy and resiliency features. Health-check messages are periodically transmitted over a first link that transmits network data. Upon detecting a failure of the first link, transmission of network data is switched to a second redundant link, while health check messages continue to be periodically transmitted over the first link. Upon detecting that the first link has been restored, transmission of network data is switched to the first link. | 10-01-2009 |
20090254775 | METHOD FOR ENABLING FASTER RECOVERY OF CLIENT APPLICATIONS IN THE EVENT OF SERVER FAILURE - A system and method are provided for improving recovery times in failover conditions in a multinode data processing system by sending notification of the failure of a server node, which is acting as server for a client application running on a client node, to the client application. In the present invention, this notification is provided by the failover node acting as backup for the server node. When a client application receives no response from a server for a long time, it assumes that the server has failed and initiates reconnection. The present invention speeds up the reconnect initiated by the client application by having system level software proactively notify the client application about the server failure. This results in faster recovery for client applications. | 10-08-2009 |
20090259881 | FAILSAFE RECOVERY FACILITY IN A COORDINATED TIMING NETWORK - A failsafe recovery capability for a Coordinated Timing Network. The recovery capability facilitates recovery when communication is lost between two servers of the coordinated timing network. The capability includes checking another system's status in order to determine what action is to be taken. The status includes the stratum level of the servers and a version number indicating the code level of the servers. | 10-15-2009 |
20090271654 | CONTROL METHOD FOR INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING SYSTEM, AND PROGRAM - An object of the present invention is to ensure, in an information processing system including a plurality of server apparatuses coupled to one another, reliability and availability thereof when failover is executed. In an information processing system, which includes a plurality of server apparatuses coupled to one another, and a management server coupled to the server apparatuses, and is configured to, when detecting occurrence of a failure in an active server apparatus of the server apparatuses, execute failover from the active server apparatus to a standby server apparatus of the server apparatuses after turning on a power supply of the standby server apparatus whose power supply has been turned off, the management server is enabled to: acquire information on the standby server apparatus after turning on the power supply of the standby server apparatus; turn off the power supply of the standby server apparatus after acquiring the information; and, based on the acquired information, judge whether or not failover to the standby server apparatus can be executed. Additionally, an acquisition method of the information can be selected in accordance with an allocated status of the standby server apparatus. | 10-29-2009 |
20090271655 | FAILOVER METHOD, PROGRAM, FAILOVER APPARATUS AND FAILOVER SYSTEM - The invention is to provide a failover method for improving overall reliability of a system including a large number of servers. Priorities set based on operating statuses of respective servers and shared devices capable of communicating with the respective servers in a failover system are stored in a storage portion while the priorities are associated with the respective servers, so that a failure processing portion of a management server retrieves a server based on the priorities and designates the retrieved server as a spare server when a failure occurs in one of the servers. | 10-29-2009 |
20090276657 | MANAGING CLUSTER SPLIT-BRAIN IN DATACENTER SERVICE SITE FAILOVER - A central controlling service for datacenter activation/deactivation control in a cluster deployment to assist in preventing a split-brain scenario. The central controlling service provides a central point of control in the datacenter for application servers to periodically query as to whether to go offline, online, or normal. Redundancy of the central service facilitates detection of datacenter failure by the redundant services interacting to resolve the state of control information. This control information is then used to answer the server queries. On startup from a datacenter failure, a single instance of the central service queries other redundant instance(s) to determine if the single instance is starting up from a datacenter-wide failure or from operations other than total datacenter failure. If the failure is datacenter-wide, a central service protocol assists in resolving to the single service keeping the associated datacenter servers offline; otherwise, the server queries are answered to go online. | 11-05-2009 |
20090276658 | JAVA VIRTUAL MACHINE HAVING INTEGRATED TRANSACTION MANAGEMENT SYSTEM - A computing system is configured to deploy a JAVA application for execution in a distributed manner. The computing system includes a plurality of computing nodes including a domain manager node, the plurality of computing nodes forming a computing domain configured as an administrative grouping of the nodes administered by the domain manager node. The domain manager node is configured to provide, to each of the computing nodes, a main portion of the JAVA application. The main portion defines, for each computing node, a portion of the behavior of the JAVA application to be accomplished by that computing node. Furthermore, each computing node is configured to receive at least one class file having classes appropriate for the portion of the behavior of the JAVA application defined, by the main portion, to be accomplished by that computing node. | 11-05-2009 |
20090282283 | MANAGEMENT SERVER IN INFORMATION PROCESSING SYSTEM AND CLUSTER MANAGEMENT METHOD - An information processing system includes I/O devices, I/O switches each of which is coupled to the I/O devices, multiple server apparatuses which are coupled to the I/O switch and with which a cluster can be constructed, and a management server. In the system, the management server: stores an identifier and a coupling port ID of the I/O switch to which any of the server apparatuses and any of the I/O devices are coupled; stores information as to whether or not each of the I/O devices can use a loopback function for the heartbeat signal; selects one of the I/O devices available for the loopback function in constructing the cluster between the server apparatuses; generates a heartbeat path using the selected I/O device as a loopback point; and performs settings on the I/O device. | 11-12-2009 |
20090282284 | RECOVERY SERVER FOR RECOVERING MANAGED SERVER - When the boot monitoring unit receives, from a management server, a first boot response for a first boot request from a NIC | 11-12-2009 |
20090287954 | SLOT INTERFACE ACCESS UNIT, METHOD THEREOF, AND PROGRAM THEREOF, AS WELL AS REDUNDANCY CONFIGURATION OF MAIN UNIT, AND REPLACING METHOD OF THE SAME - A single main unit manages information on hardware resources and the like of all main units connected to a network in an integrated fashion. A slot management module, a slot control module, and a physical slot/managed slot comparison table are provided between an input/output control module and a slot interface subordinate thereto. The input/output control module accesses the slot interface by using virtual slot identification information. The slot management module refers to the physical slot/managed slot comparison table, converts the virtual slot identification information into physical slot identification information, and accesses a slot control module corresponding to the physical slot identification information obtained by the conversion, thereby realizing a physical access of the input/output control module to the slot interface. | 11-19-2009 |
20090287955 | REDUNDANT FAILOVER SYSTEM, REDUNDANCY MANAGING APPARATUS AND APPLICATION PROCESSING APPARATUS - In a communication system using an IP tunnel for communication between application processing apparatuses (hereinafter, processing apparatuses), an application can be moved to an arbitrary processing apparatus, update of tunnel tables included in the respective processing apparatuses is quickly performed, and a buffer for waiting for packets during the table update is made small. A redundancy managing apparatus manages a correspondence between a virtual IP address (VIP) of an application in a communication system and an IP address (RIP) of a processing apparatus to execute the application. The processing apparatus notifies the VIP of the communication partner application of the application to the redundancy managing apparatus. The redundancy managing apparatus notifies the VIP of the communication partner application of the moved application and the RIP of the processing apparatus to execute the communication partner application to the processing apparatus of the movement destination (failover destination) of the application. | 11-19-2009 |
20090292942 | TECHNIQUES FOR DETERMINING OPTIMIZED LOCAL REPAIR PATHS - Techniques for finding an optimized local repair path that may be used to signal a local repair connection for a protected connection. The optimized local repair path starts at a node in the path associated with the protected connection and ends at a merge point node in the path associated with the protected connection that is downstream from the start node. Various techniques may be used for finding an optimized local repair path. | 11-26-2009 |
20090292943 | TECHNIQUES FOR DETERMINING LOCAL REPAIR CONNECTIONS - Techniques for configuring a local repair connection for a protected connection including determining a path for the local repair connection. The path traversed by a local repair connection starts at a node in the path associated with the protected connection and ends at a merge point node in the path associated with the protected connection that is downstream from the start node. In one embodiment, the merge point node may even be more than two hops downstream from the start node in the path associated with the protected connection. The local repair path may include zero or more nodes that are not included in the path associated with the protected connection. Techniques are also described for optimizing the path associated with a local repair connection. | 11-26-2009 |
20090300407 | SYSTEMS AND METHODS FOR LOAD BALANCING VIA A PLURALITY OF VIRTUAL SERVERS UPON FAILOVER USING METRICS FROM A BACKUP VIRTUAL SERVER - The present invention provides methods and systems for performing load balancing via a plurality of virtual servers upon a failover using metrics from a backup virtual server. The methods and systems described herein provide for an appliance detecting that a first virtual server of a plurality of virtual servers having one or more backup virtual servers load balanced by an appliance is not available, identifying that at least a first backup virtual server of the one or more backup virtual servers of the first virtual server is available, maintaining a status of the first virtual server as available in response to the identification, obtaining one or more metrics from the first backup virtual server of the one or more backup virtual servers, and determining the load across the plurality of virtual servers using the metrics obtained from the first backup virtual server associated with the first virtual server. | 12-03-2009 |
20090307522 | COMMUNICATIONS PATH STATUS DETECTION SYSTEM - A network failover apparatus and method for use in a client-server system. The method includes establishing at least a first and further path between a client and a server. The first path connects the server to the client through a first network and a first interface of the client and the further path connects the server to the client through a further network that is separate from the first network and a further interface of the client. The method also includes reaching the server through the first interface, detecting that the server is no longer reachable through the first interface, and identifying the first interface as failed. The method also includes reaching the server through the further interface after the first interface is identified as failed, testing the first interface to determine whether the server is reachable while the server is reachable through the further interface, and reestablishing a connection to the server through the first interface. | 12-10-2009 |
20090319824 | DATA RECOVERY IN HETEROGENEOUS NETWORKS USING PEER'S COOPERATIVE NETWORKING - A method and apparatus for recovering data, comprising establishing a secondary recovery network with a device, detecting data loss and recovering via the secondary recovery network the lost data from the device, the device having correctly received the data, are described. The lost data was sent in a primary wireless multicast network. A method and apparatus for recovering data, comprising receiving data, establishing a secondary recovery network with device and recovering the lost data via said secondary recovery network, are also described. The received data was sent in a primary wireless multicast network. | 12-24-2009 |
20090327798 | Cluster Shared Volumes - Described is a technology by which a storage volume is shared by cluster nodes of a server cluster. In one implementation, each node includes a redirector that provides shared access to the volume from that node. The redirector routes file system metadata requests from applications and the like through a first (e.g., SMB) communications path to the owning node, and routes file system read and write data to the storage device through a second, high-speed communications path such as direct block-level I/O. An owning node maintains ownership of the storage device through a persistent reservation mechanism that writes a key to a registration table associated with the storage device. Non-owning nodes write a shared key. The owning node validates the shared keys against cluster membership data, and preempts (e.g., removes) any key deemed not valid. Security mechanisms for controlling access are also described. | 12-31-2009 |
20090327799 | COMPUTER READABLE MEDIUM, SERVER MANAGEMENT METHOD AND DEVICE - A management server is equipped with a manager comprising software including a server management program. A managed server has a built-in system disk capable of being booted, on which an agent is installed that is software for communicating with the manager of the management server to monitor the state of the managed server. When a part other than the disk fails, the system disk is removed from the managed server and attached to a preliminary server. The manager updates the content in a server management table for managing the information of the managed server, thereby eliminating mismatches of network addresses and the like. | 12-31-2009 |
20100005336 | Method and Device for Exchanging Data on the Basis of the Opc Communications Protocol Between Redundant Process Automation Components - The invention relates to a method and a device for exchanging data on the basis of the OPC communications protocol, wherein at least one OPC-ready server is connected in parallel to a master OPC server. An OPC client is connected to the master OPC server over a first communications link and to at least one OPC-ready server via at least one second communications link. The OPC client exchanges data with automation devices over the first communications link and the master OPC server, and over at least the second communications link and at least one OPC-ready server, and evaluates the data transmitted by the master OPC server and at least one OPC-ready server. In the event of failure, the master OPC server is at least partially switched over from the OPC client to at least one OPC-ready server such that the program continues to run smoothly in the same place. | 01-07-2010 |
20100005337 | Data Transfer and Recovery Process - A backup image generator can create a primary image and periodic delta images of all or part of a primary server. The images can be sent to a network attached storage device and a remote storage server. In the event of a failure of the primary server, the failure can be diagnosed to develop a recovery strategy. Based on the diagnosis, at least one delta image may be applied to a copy of the primary image to generate an updated primary image at either the network attached storage or the remote storage server. The updated primary image may be converted to a virtual server in a physical to virtual conversion at either the network attached storage device or remote storage server and users may be redirected to the virtual server. The updated primary image may also be restored to the primary server in a virtual to physical conversion. As a result, the primary data storage may be timely backed-up, recovered and restored with the possibility of providing server and business continuity in the event of a failure. | 01-07-2010 |
20100017644 | BYZANTINE FAULT TOLERANT DYNAMIC QUORUM USING A TRUSTED PLATFORM MODULE - A method implemented in a computer infrastructure having computer executable code tangibly embodied on a computer readable medium. The computer executable code is operable to dynamically adjust quorum requirements for a voting set V of a server cluster, including a plurality of servers, to ensure that a response of the server cluster to a client request remains Byzantine fault tolerant when at least one of: a failed server of the server cluster is replaced with at least one new server, such that a total set S of servers that have ever been members of the server cluster is increased, and an existing server is removed from the voting set V. | 01-21-2010 |
20100017645 | COMMUNICATION APPARATUS AND CONTROL METHOD - A communication apparatus includes: a transmitting unit, a receiving unit and a control unit. The transmitting unit transmits video data to an external apparatus via a first transmission line. The receiving unit receives a command from the external apparatus via a second transmission line. The control unit resets the transmitting unit without resetting the receiving unit if a communication error relating to the first transmission line is detected, and resets the receiving unit without resetting the transmitting unit if a communication error relating to the second transmission line is detected. | 01-21-2010 |
20100017646 | CLUSTER SYSTEM AND NODE SWITCHING METHOD - When a first server node fails in a cluster system, a client node device transmits failure detection information to a second server node device. Upon receipt of the failure detection information, the second server node device transmits a survival confirmation request to the first server node device. When receiving no survival confirmation response from the first server node device, the second server node device determines that the first server node device has failed and starts the switching control of a server node device which performs a service process. Upon receipt of failure detection information, the second server node device starts switching control when further receiving failure detection information from another client node device. | 01-21-2010 |
20100017647 | MATCH SERVER FOR A FINANCIAL EXCHANGE HAVING FAULT TOLERANT OPERATION - Fault tolerant operation is disclosed for a primary match server of a financial exchange using an active copy-cat instance, a.k.a. backup match server, that mirrors operations in the primary match server, but only after those operations have successfully completed in the primary match server. Fault tolerant logic monitors inputs and outputs of the primary match server and gates those inputs to the backup match server once a given input has been processed. The outputs of the backup match server are then compared with the outputs of the primary match server to ensure correct operation. The disclosed embodiments further relate to fault tolerant failover mechanism allowing the backup match server to take over for the primary match server in a fault situation wherein the primary and backup match servers are loosely coupled, i.e. they need not be aware that they are operating in a fault tolerant environment. As such, the primary match server need not be specifically designed or programmed to interact with the fault tolerant mechanisms. Instead, the primary match server need only be designed to adhere to specific basic operating guidelines and shut itself down when it cannot do so. By externally controlling the ability of the primary match server to successfully adhere to its operating guidelines, the fault tolerant mechanisms of the disclosed embodiments can recognize error conditions and easily failover from the primary match server to the backup match server. | 01-21-2010 |
20100017648 | COMPLETE DUAL SYSTEM AND SYSTEM CONTROL METHOD - A DB server included in an old operation node corrects a recovery log stored in a recovery log storage unit by using a difference log stored in a difference log storage unit. A duplication control device and a DBMS compare a difference log file stored in the difference log storage unit and a recovery log file stored in the recovery log storage unit, and correct the content of the recovery log file accordingly. | 01-21-2010 |
20100031078 | System and Method For Redirecting A Website Upon The Occurrence Of A Disaster Or Emergency Event - A system and method to deflect DNS inquiries to a new IP address for a website prior to or after the occurrence of a natural disaster that damages the computer infrastructure of an organization, thereby interrupting the organization's ability to continue to offer its website information. An automated process is included that alters a zone file on a controlling DNS server in a manner that provides a minimum of disruption to resolver programs attempting to resolve names of deflected computers via Internet DNS. An intelligent monitor program continually, but periodically, surveys the organization's web server to confirm its operations status and intervenes in the DNS resolution structure for that web server in the event that a sustained disruption occurs. | 02-04-2010 |
20100031079 | Restoration of a remotely located server - Methods and apparatus restore data on servers in remote or branch offices utilizing virtual distribution components, such as virtual machines. A failed remotely located server is restored to its previous running state using any server with hardware compatible with the hardware of the failed server, rather than requiring a server with an exact copy of the hardware of the failed server. Virtual distribution components are configured without requiring a reimaging of the entire boot partition and physical distribution partition of a physical server. Application environment state information is restored without requiring a restoration of a full operating system state environment. Constantly supported interfaces of physical distribution components are utilized and a quick restoration of virtual distribution components results. Full system functionality is achieved more quickly than when a full physical system image restoration is required. | 02-04-2010 |
20100037087 | SELF-HEALING CAPABILITIES IN A DIRECTORY SERVER - A novel manner of handling an error or exception caused by the unavailability of a slot, crypto hardware, or the communication network between the server and the hardware. As per this scheme, in the event of the unavailability of a particular slot in the hardware, the server may disable the SSL request processing within the server by setting a global "SSL Unavailable" flag. All the existing SSL requests within the server can be en-queued. If the error is because of unavailability of the master slot, the server can establish a connection with a backup slot. If the error is because of unavailability of the crypto hardware or the communication network, then the server may start a healer thread that will poll for the state of the hardware. If the exception is because of a hardware reset, then the server may clean up earlier connection information, re-establish the connection with the hardware, and enable SSL services. | 02-11-2010 |
20100037088 | Intelligent Mobile Device Management Client - Embodiments of an intelligent agent for an OMA DM enabled mobile client device are described. The intelligent agent includes modules for storing management property values in one or more nodes of an OMA DM management tree of the mobile client device. At least some of the management values are analyzed and set in a server computer coupled to the mobile client device over a wireless network. The intelligent mobile client is configured to manage itself based on initial instructions and policies provided by a server that are transferred to the client by the OMA DM protocol. For example, a client might notice that the battery is nearly empty and so it automatically decreases its own backlight illumination level. The intelligent agent defines a set of management properties to include a status property representing a node severity value, and a property group consisting of a rule, an action property representing an action that is executed if the rule is satisfied, and a threshold value that represents a minimum value that is used as a rule parameter. | 02-11-2010 |
20100042869 | UPGRADING NETWORK TRAFFIC MANAGEMENT DEVICES WHILE MAINTAINING AVAILABILITY - A method, system, machine-readable storage medium, and apparatus are directed towards upgrading a cluster by bifurcating the cluster into two virtual clusters, an “old” virtual cluster (old active cluster) and a “new” virtual cluster (new standby cluster), and iteratively upgrading members of the old cluster while moving them into the new cluster. While members are added to the new cluster, existing connections and new connections are seamlessly processed by the old cluster. Optionally, state mirroring occurs between the old cluster and the new cluster once the number of members of the old and new clusters are approximately equal. Once a threshold number of members have been transferred to the new cluster, control and processing may be taken over by the new cluster. Transfer of control from the old cluster to the new cluster may be performed by failing over connectivity from the old cluster to the new cluster. | 02-18-2010 |
20100050011 | FAILURE RECOVERY METHOD - The reliability is improved at a low cost even in a virtualized server environment. The number of spare servers is reduced for improving the reliability and for saving a licensing fee for software on the spare servers. A server system comprises a plurality of physical servers on which a plurality of virtual servers run, a single standby server, a module for detecting an active virtual server, and a module for switching the correspondence of boot disks of virtualization modules for controlling virtual servers to the physical servers. When a physical server fails, the boot disk of the associated virtualization module is connected to a spare server to automatically activate on the spare server those virtual servers which have been active upon occurrence of the failure. | 02-25-2010 |
20100058108 | METHOD FOR ANALYZING FAULT CAUSED IN VIRTUALIZED ENVIRONMENT, AND MANAGEMENT SERVER - To enable easy and quick identification of the location of a fault in a virtualized environment, a physical server | 03-04-2010 |
20100064164 | Autonomic Component Service State Management for a Multiple Function Component - A mechanism is provided for autonomic component service state management for a multiple function component. The mechanism determines whether independent functions within a multiple function service boundary can be serviced. When a single function experiences a failure that requires service, repair, or replacement, the surviving functions notify the service management software of the state of the independent functions. The service management software then determines the state of the overall component and implements the appropriate service method. | 03-11-2010 |
20100064165 | FAILOVER METHOD AND COMPUTER SYSTEM - Provided is a failover method performed in a computer system having a first computer which performs an operation, a plurality of standby computers including a first standby computer and a second standby computer, a second computer which has a management module that manages the first computer and the standby computers, and a third computer which manages start and stop of the standby computers. The method includes the following processing steps. The third computer acquires configuration information of the first computer, the second computer, and the plurality of standby computers from the management module of the second computer. The third computer determines whether a failure occurred in the second computer. Upon detecting the failure in the second computer, the third computer sets up the management module on the second standby computer based on the acquired configuration information. | 03-11-2010 |
20100064166 | SCALABLE SECONDARY STORAGE SYSTEMS AND METHODS - Exemplary systems and methods in accordance with embodiments of the present invention may provide a plurality of data services by employing splittable, mergable and transferable redundant chains of data containers. The chains and containers may be automatically split and/or merged in response to changes in storage node network configurations and may be stored in erasure coded fragments distributed across different storage nodes. Data services provided in a distributed secondary storage system utilizing redundant chains of containers may include global deduplication, dynamic scalability, support for multiple redundancy classes, data location, fast reading and writing of data and rebuilding of data due to node or disk failures. | 03-11-2010 |
20100064167 | Method and Apparatus for Expressing High Availability Cluster Demand Based on Probability of Breach - A method, apparatus, and computer instructions are provided for expressing high availability (H/A) cluster demand based on probability of breach. When a failover occurs in the H/A cluster, event messages are sent to a provisioning manager server. The mechanism of embodiments of the present invention filters the event messages and translates the events into probability-of-breach data. The mechanism then updates the data model of the provisioning manager server and makes a recommendation to the provisioning manager server as to whether reprovisioning of a new node should be performed. The provisioning manager server makes the decision and either reprovisions new nodes to the H/A cluster or notifies the administrator of the detected provisioning problem. | 03-11-2010 |
20100077249 | RESOURCE ARBITRATION FOR SHARED-WRITE ACCESS VIA PERSISTENT RESERVATION - Described is a technology by which an owner node in a server cluster maintains ownership of a storage mechanism through a persistent reservation mechanism, while allowing non-owning nodes read and write access to the storage mechanism. An owner node writes a reservation key to a registration table associated with the storage mechanism. Non-owning nodes write a shared key that gives them read and write access. The owner node validates the shared keys against cluster membership data, and preempts (e.g., removes) any key deemed not valid. The owner node also defends ownership against challenges to ownership made by other nodes, so that another node can take over ownership if a (formerly) owning node is unable to defend, e.g., because of a failure. | 03-25-2010 |
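The validate-and-preempt step for shared keys might be sketched as follows; the `Disk` class and the `"node:shared"` key format are assumptions for illustration, not the SCSI persistent-reservation protocol itself:

```python
class Disk:
    """Toy registration table for a shared storage mechanism."""
    def __init__(self):
        self.keys = set()

    def register(self, key):
        self.keys.add(key)

    def preempt(self, key):
        # Remove a key deemed not valid, as in the abstract.
        self.keys.discard(key)

def defend(disk, owner_key, membership):
    """Owner node scrubs the registration table: any shared key not backed
    by a current cluster member is preempted (removed)."""
    for key in list(disk.keys):
        if key == owner_key:
            continue                    # the owner's own reservation key
        node = key.split(":")[0]        # assumed key format: "node:shared"
        if node not in membership:
            disk.preempt(key)
```

Non-owning nodes would gain read/write access by registering a shared key, and the owner's periodic defense pass is what lets another node take over only when the former owner can no longer run this check.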
20100077250 | VIRTUALIZATION BASED HIGH AVAILABILITY CLUSTER SYSTEM AND METHOD FOR MANAGING FAILURE IN VIRTUALIZATION BASED HIGH AVAILABILITY CLUSTER SYSTEM - Provided are a virtualization based high availability cluster system and a method for managing failures in a virtualization based high availability cluster system. The high availability cluster system includes a plurality of virtual nodes, and a plurality of physical nodes each including a message generator for generating a message denoting that the virtual nodes are in a normal state and transmitting the generated message to virtual nodes in the same physical node. One of the virtual nodes not included in a first physical node among the plurality of physical nodes takes over resources related to a service if a failure is generated in one of the virtual nodes included in the first physical node. | 03-25-2010 |
20100077251 | METHOD AND SYSTEM FOR RELIABLY AND EFFICIENTLY TRANSPORTING DATA OVER A NETWORK - A data transport system for transporting data between a server ( | 03-25-2010 |
20100083032 | CONNECTION BROKER ASSIGNMENT STATUS REPORTING - In one embodiment a computing system comprises one or more processors, a display device coupled to the computing system, and a memory module communicatively connected to the one or more processors. The memory module comprises logic to receive, in a connection server, a service request from a user via a remote connection client; in response to the service request, instantiate a remote computing protocol in a computing resource, monitor a connection state between the remote connection client and the computing resource; and in response to a change in the connection state between the remote connection client and the computing resource, generate a connection state message, and transfer the connection state message to the remote connection client. | 04-01-2010 |
20100083033 | DEVICE AND METHOD FOR AUTOMATICALLY DETERMINING A NETWORK ELEMENT FOR REPLACING A FAILING NETWORK ELEMENT - A device (D), intended for working for at least one network (N), comprises i) an ontology agent (OA) storing at least one ontology defining representations of network elements and relations between these network elements, and ii) a processing means (PM) arranged, when a status of a network element indicates that the latter is failing, for accessing the ontology agent (OA) to get the representation of the failing network element and relations between this failing network element and at least one other network element, then for determining, for each of these other network elements, a parameter value representative of a functional likeness with the failing network element from their respective ontology representations, and for determining amongst the other network elements the one offering the parameter value representative of the greatest functional likeness, in order to propose replacing the failing network element with this determined network element. | 04-01-2010 |
20100083034 | INFORMATION PROCESSING APPARATUS AND CONFIGURATION CONTROL METHOD - An information processing apparatus for providing a plurality of services by a plurality of software programs, includes: a plurality of hardware resources; a storage unit that stores priorities of the services; a processor that controls configuration of the hardware resources in accordance with a process including: partitioning the plurality of hardware resources into a plurality of groups, each of which executes one of the software programs; determining, upon detecting a failure in at least one of the hardware resources in at least one of the groups, another hardware resource which belongs to another group executing another software program, on the basis of the priorities of the services provided by the software programs in reference to the storage unit; and assigning that hardware resource to the group which includes the hardware resource having the failure, so as to renew the configuration of the hardware resources. | 04-01-2010 |
20100083035 | METHOD FOR WIRELESS COMMUNICATION IN WIRELESS SENSOR NETWORK ENVIRONMENT - Provided is a wireless communication method in a wireless sensor network environment. The method overhears a packet transmitted from a source sensor node to a destination sink node and determines whether the destination sink node receives the packet. A transmission node selected by using local information among a plurality of neighboring sensor nodes transmits the overheard packet to the destination sink node when the packet is not received. | 04-01-2010 |
20100100761 | MAINTAINING A PRIMARY TIME SERVER AS THE CURRENT TIME SERVER IN RESPONSE TO FAILURE OF TIME CODE RECEIVERS OF THE PRIMARY TIME SERVER - A primary time server of a Coordinated Timing Network remains as current time server, even if time code information of the primary time server is unavailable. The primary time server receives the necessary or desired timing information from a secondary time server and uses that information to maintain time synchronization within the Coordinated Timing Network. | 04-22-2010 |
20100100762 | BACKUP POWER SOURCE USED IN INDICATING THAT SERVER MAY LEAVE NETWORK - A server of a network of servers determines that its power source is failing. In response, the server communicates to one or more other servers of this network that it is leaving the network. This communication is powered by another power source. | 04-22-2010 |
20100106999 | TECHNIQUES FOR DETERMINING LOCAL REPAIR PATHS USING CSPF - Techniques for computing a path for a local repair connection to be used to protect a connection traversing an original path from an ingress node to an egress node. The computed path originates at a node (start node) in the original path and terminates at another node (end node) in the original path that is downstream from the start node. A Constraint Shortest Path First (CSPF) algorithm may be used to compute the path. The computed path is such that it satisfies one or more constraints and does not traverse a path from a first node in the original path to a second node in the original path, wherein the first and second nodes are upstream from the start node in the original path and the second node is downstream from the first node in the original path. A local repair connection may then be signaled using the computed path. | 04-29-2010 |
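A constrained shortest-path computation of this general kind can be sketched with Dijkstra's algorithm plus an excluded-link set. This is a simplification of the abstract's constraint (which excludes a specific upstream subpath of the original path); the graph representation and names are assumptions:

```python
import heapq

def cspf(graph, src, dst, excluded_links=frozenset()):
    """Shortest path from src to dst that never traverses an excluded
    directed link -- a stand-in for the CSPF constraint that the local
    repair path avoid the protected upstream span."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in graph.get(u, []):
            if (u, v) in excluded_links:
                continue                  # constraint: link is off-limits
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None                       # no repair path satisfies constraints
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]
```

The repair connection would then be signaled along the returned node sequence from the start node to the end node.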
20100107000 | Active Link Verification For Failover Operations In A Storage Network - Systems and methods for active link verification for failover operations in a storage network are disclosed. An exemplary method includes issuing a command from a first port of a storage device to a local network device in the storage network. The method also includes receiving a response to the command at the first port of the storage device from the local network device in the storage network. The storage device fails over to a second port of the storage device if no response is received at the first port of the storage device. | 04-29-2010 |
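The command/response/failover loop can be reduced to a few lines; `probe` here is a stand-in for issuing the verification command to the local network device and waiting for its response:

```python
def verify_and_failover(ports, probe):
    """Try each port in order: issue the verification command (a command to
    a local network device in the abstract) and fail over to the next port
    when no response is received."""
    for port in ports:
        if probe(port):      # response received: keep using this port
            return port
    raise ConnectionError("no port responded to the verification command")
```

The point of probing a *local* device is that a missing response implicates the link or port itself rather than a distant target, which is what makes failover to the second port the right reaction.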
20100107001 | Activating Correct Ad-Splicer Profile In Ad-Splicer Redundancy Framework - In particular embodiments, a method and system are provided for detecting a failure of a primary ad-splicer, conveying failure information for the failed primary ad-splicer to a redundant ad-splicer, dynamically forwarding one or more pre-spliced packets intended for the failed primary ad-splicer to the redundant ad-splicer, receiving one or more post-spliced packets from the redundant ad-splicer, and transmitting the post-spliced packets towards one or more target receivers. | 04-29-2010 |
20100107002 | FAILURE NOTIFICATION IN RENDEZVOUS FEDERATION - Systems and methods that supply global knowledge of which nodes are available in the system by employing routing tokens that are analyzed by a centralized management component to infer status for the nodes. When nodes fail, the routing tokens associated therewith are acquired by neighboring nodes, and the global knowledge is updated. Moreover, upon inferring a failed or down status for a node, a challenge can be sent to the node reporting such failure to verify the actual failure(s). | 04-29-2010 |
20100115326 | FAULT-TOLERANT SYSTEM FOR DATA TRANSMISSION IN A PASSENGER AIRCRAFT - The invention relates to a transmission system for the transmission of communications data from at least one data source ( | 05-06-2010 |
20100115327 | CONGESTION CONTROL METHOD FOR SESSION BASED NETWORK TRAFFIC - A method includes establishing an expected traffic load for a plurality of servers, wherein each server has a respective actual capacity. The method further includes limiting the actual capacity of each server to respective available capacities, wherein a combined available capacity that is based on the available capacities corresponds to the expected traffic load. The method also includes dynamically altering the respective available capacity of the servers based on the failure of at least one server. | 05-06-2010 |
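The capacity-limiting arithmetic described here is simple enough to sketch directly; proportional scaling is an assumption (the abstract does not say how the available capacities are chosen), and all names are illustrative:

```python
def available_capacities(actual, expected_load):
    """Limit each server's actual capacity so that the combined available
    capacity matches the expected traffic load (proportional scaling is
    assumed here)."""
    total = sum(actual)
    return [a * expected_load / total for a in actual]

def after_failure(actual, failed_index, expected_load):
    """Dynamically re-spread the expected load over the surviving servers
    when one server fails, as the abstract describes."""
    survivors = [a for i, a in enumerate(actual) if i != failed_index]
    return available_capacities(survivors, expected_load)
```

Because the combined available capacity always equals the expected load, survivors absorb a failed server's share automatically instead of running at full actual capacity from the start.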
20100115328 | Methods and Systems for Handling Device Failovers during Test Executions - A method of generating a test script to test a first device in a target system is disclosed. The method includes generating error handling programming instructions to identify failure of the first device, and generating programming instructions to route test commands in the test scripts to a second device in the target system. The second device is a failover device that takes over operations of the first device when the first device fails. | 05-06-2010 |
20100122112 | System and Method for Communication Error Processing in Outside Channel Combination Environment - Provided are a system and method for processing communication errors in an outside channel combination environment. The system includes: first and second outside-affairs servers connected with a plurality of user terminals and having respective outside-affairs processing applications to perform outside affairs associated with a plurality of outside authorities; first and second outside channel combination servers for processing outside affairs associated with the outside authorities, the first and second outside channel combination servers having respective message relaying and communication applications to interwork with the first and second outside-affairs servers; first and second active and standby network devices respectively connected in parallel with the first and second outside channel combination servers, the first and second active network devices performing normal outside affairs, and the first and second standby network devices being activated when a communication error is generated to perform the normal outside affairs; and first and second switching devices respectively provided between the first and second active and standby network devices and the outside authorities to selectively connect the first and second active and standby network devices when the communication error is generated. Thus, communication errors can be minimized and system resources can be efficiently managed by distributing system loads. | 05-13-2010 |
20100125748 | REMOTE DATA ACCESS DEVICE AND REMOTE DATA ACCESS METHOD - A remote data access device and a remote data access method are provided. The remote data access device comprises: a storing module, a first controller and a second controller. The first controller comprises a first odd and a first even access queue. The second controller comprises a second odd and a second even access queue. The first and the second controller have the same MAC and IP addresses. The first and the second odd access queues receive the same odd remote access requests and the first and the second even access queues receive the same even remote access requests; the first controller performs a data access operation on the storing module according to the odd remote access requests in the first odd access queue, and the second controller performs a data access operation on the storing module according to the even remote access requests in the second even access queue. | 05-20-2010 |
20100125749 | COMPUTER PROGRAM PRODUCT, FAILURE DIAGNOSIS METHOD, AND COMMUNICATION APPARATUS - A first confirmation unit performs a first confirmation of confirming presence of a connection to a correspondent node in a first communication layer, presence of a connection to a correspondent node in a second communication layer, and presence of a connection to a correspondent node in a third communication layer. A locating unit locates a failure point in communication based on a result of the first confirmation. An output unit outputs a result of the location. | 05-20-2010 |
20100138686 | FAILURE RECOVERY METHOD, FAILURE RECOVERY PROGRAM AND MANAGEMENT SERVER - In a computer system including server apparatuses such as an active server and a standby server connected to a storage apparatus, when the active server fails, a management server changes over the connection to the storage apparatus from the active server to the standby server, thereby handing over operation to the standby server. The management server refers to a fail-over strategy table, in which apparatus information of the server apparatuses is associated with fail-over methods, to select a fail-over strategy in consideration of the apparatus information of the active and standby servers. | 06-03-2010 |
20100138687 | RECORDING MEDIUM STORING FAILURE ISOLATION PROCESSING PROGRAM, FAILURE NODE ISOLATION METHOD, AND STORAGE SYSTEM - Slices obtained by dividing the real data storage area of the storage device by segments are assigned to each of a plurality of segments obtained by dividing a virtual logical volume, as a primary slice storing data of the segment as a destination of access made by an access node and/or a secondary slice that mirrors and stores data of the primary slice. Management information associates the segment with the primary slice and the secondary slice. A survival signal, transmitted at predetermined intervals while a computer is operating normally, is monitored. A computer from which the survival signal is not detected over a predetermined time period is detected as a failure node. The failure node is checked against the management information, and the managed slice is set as a single primary slice that is the access destination of the access node, for which the mirroring is stopped. The failure node is isolated. | 06-03-2010 |
20100138688 | MANAGING SERVICE LEVELS ON A SHARED NETWORK - Devices and methods for modeling and analysis of services provided over a common network include a processor configured to track services connected to the common network through nodes and links; run service models associated with the services under selected conditions, the selected conditions including failure and repair of one of the nodes or links; and propose corrective action and/or change of network resources of the common network to minimize impact of the failure. The processor may also run Network models. The models may be executed successively or simultaneously, and outputs of one model may be used as input to other models, including any necessary conversions for compatibility. | 06-03-2010 |
20100138689 | COMPUTER SYSTEM, MANAGEMENT METHOD AND STORAGE NETWORK SYSTEM - A computer system wherein, when a state of the primary host computer is in an active state, a data sent from the primary host computer to the first storage system is copied through a first copy route which includes a route from the first storage system to the second storage system and a route from the second storage system to the third storage system, wherein, if a failure occurs in the primary host computer and a state of the second host computer is to be in an active state, a data sent from the secondary host computer to the second storage system is copied through a second copy route which includes a route from the second storage system to the first storage system and a route from the first storage system to the third storage system. | 06-03-2010 |
20100146326 | SYSTEMS AND METHODS FOR MANAGING NETWORK COMMUNICATIONS - First and second management systems are placed in active and standby modes, respectively. The first and second management systems are configured to be in communication with the second and the first management systems, respectively, and with a plurality of network devices. A first number of the plurality of network devices in communication with the first management system is determined at the first management system and transmitted to the second management system. A second number of the plurality of network devices in communication with the second management system is determined at the second management system and transmitted to the first management system. A first determination is made at the first management system regarding whether the first number of network devices is less than the second number of network devices. A second determination is made at the second management system regarding whether the first number of network devices is less than the second number of network devices. The first and second management systems are placed in failure mode and active mode, respectively, based on the first and second determinations. | 06-10-2010 |
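The mutual count-and-compare decision in this entry reduces to a small function; the role names and tie-breaking choice (ties keep the current roles) are assumptions for illustration:

```python
def decide_roles(devices_seen_by_first, devices_seen_by_second):
    """Each management system counts the network devices it can reach and
    exchanges that count with its peer. If the active (first) system sees
    fewer devices than the standby (second), it is placed in failure mode
    and the standby becomes active; otherwise roles are unchanged."""
    if devices_seen_by_first < devices_seen_by_second:
        return ("failure", "active")
    return ("active", "standby")
```

Having both systems evaluate the same comparison independently is what lets them reach the same role assignment without a third arbiter.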
20100146327 | SERVER FAILOVER CONTROL METHOD AND APPARATUS AND COMPUTER SYSTEM GROUP - Computer systems forming the computer system group each have a plurality of servers, a plurality of I/O devices, and one or more I/O switches coupled to the plurality of I/O devices, and it is possible to change the combination of the servers and I/O devices in each of the computer systems. If a fault has occurred in one of the current servers, firstly, it is judged whether or not there exists a spare server capable of taking over processing in a computer system including the server which has generated the fault. If the judgment result is negative, then a server switching process is carried out by searching another computer system capable of constructing a particular combination corresponding to the combination of the faulty server and the I/O devices allocated thereto. | 06-10-2010 |
20100146328 | GRID STORAGE SYSTEM AND METHOD OF OPERATING THEREOF - There is provided a storage system comprising a plurality of disk units adapted to store data at respective ranges of logical block addresses (LBAs), said addresses constituting an entire address space, and a storage control grid operatively connected to the plurality of disk units and comprising a plurality of data servers, each server comprising operatively coupled cache memory and non-volatile memory. The method of operating the storage system comprises: a) configuring a first server among said plurality of data servers to have a primary responsibility for handling requests directed to a certain range of LBAs; b) continuously obtaining by the first server, information indicative of configuration and/or changes thereof related to said certain data range, thus giving rise to configuration metadata; c) saving said configuration metadata and/or derivatives thereof at one or more disk units among said plurality of disk units in accordance with a predefined criterion; d) continuously saving in cache memory of the first server said configuration metadata obtained between said savings at disk units, thus giving rise to recent configuration changes metadata; e) destaging the recent configuration changes metadata to non-volatile memory of the first server if the first server fails. | 06-10-2010 |
20100153770 | REAL-TIME IMAGE MONITORING AND RECORDING SYSTEM AND METHOD - A real-time image monitoring and recording system includes a plurality of IP cameras and a plurality of surveillance servers. The IP cameras and the surveillance servers can process anycast packets. The surveillance servers control the IP cameras and store data generated by the IP cameras. Peer-to-peer connection exists between the surveillance server and its neighboring surveillance servers. Any surveillance server stores configuration data of its neighboring surveillance servers rather than the configuration data of all the surveillance servers. | 06-17-2010 |
20100153771 | PEER-TO-PEER EXCHANGE OF DATA RESOURCES IN A CONTROL SYSTEM - System(s) and method(s) are provided for peer-to-peer exchange of data in a control system. Decentralized storage and multi-access paths provide complete sets of data without dependence on a specific or pre-defined data source or access paths. Data is characterized as data resources with disparate granularity. The control system includes a plurality of layers that act as logic units communicatively coupled through access network(s). Server(s) resides in a service layer, whereas client(s) associated with respective visualization terminal(s) are part of a visualization layer. Peer-to-peer distribution of data resource(s) can be based on available access network(s) resources and optimization of response time(s) in the control system. When client requests a data resource, all the locations of the data resource and the quickest source to retrieve it are automatically determined. The client stores copy of data resource. Peer-to-peer distribution of data resource(s) can be implemented within the service layer or the visualization layer. | 06-17-2010 |
20100162031 | STORAGE AVAILABILITY USING CRYPTOGRAPHIC SPLITTING - A secure storage appliance is disclosed, along with methods of storing and reading data in a secure storage network. In one aspect, a method includes assigning a volume to a primary secure storage appliance located in a secure data storage network, the secure data storage network including a plurality of secure data paths between the primary secure storage appliance and a client device and a plurality of secure data paths between the secure storage appliance and a plurality of storage systems, the volume corresponding to physical storage at each of the plurality of storage systems. The method also includes detecting a connectivity problem on at least one of the secure data paths. The method further includes assessing whether to reassign the volume to a different secure storage appliance based upon the connectivity problem. | 06-24-2010 |
20100162032 | STORAGE AVAILABILITY USING CRYPTOGRAPHIC SPLITTING - Methods and systems for maintaining data connectivity in a secure data storage network are disclosed. In one aspect, a method includes assigning a volume to a primary secure storage appliance located in a secure data storage network, the primary secure storage appliance selected from among a plurality of secure storage appliances located in the secure data storage network, the volume presented as a virtual disk to a client device and mapped to physical storage at each of a plurality of storage systems. The method further includes detecting at one of the plurality of secure storage appliances a failure of the primary secure storage appliance. The method also includes, upon detecting the failure of the primary secure storage appliance, reassigning the volume to a second secure storage appliance from among the plurality of secure storage appliances, thereby rendering the second secure storage appliance a new primary secure storage appliance. | 06-24-2010 |
20100162033 | ETHERNET APPARATUS CAPABLE OF LANE FAULT RECOVERY AND METHODS FOR TRANSMITTING AND RECEIVING DATA - An Ethernet apparatus for performing lane fault recovery is provided. An Ethernet apparatus capable of lane fault recovery includes a data transmitter that uses a backup lane in the transport link to transmit data intended for the faulty lane when at least one faulty lane is detected among the data transfer lanes, and a data receiver that recognizes the data received via the backup lane as data transferred via the faulty lane when the faulty lane is detected. In an Ethernet apparatus having a multi-lane structure, a lane fault and faulty lanes can be accurately recognized while maintaining compatibility with a standard Ethernet apparatus. | 06-24-2010 |
20100162034 | System and Method for Providing IP PBX Service - A telecommunications system has been developed that includes an on premise IP PBX and an offsite hosted IP PBX. The system includes software that allows migration of data between the PBXs as needed by system requirements. | 06-24-2010 |
20100162035 | Multipurpose Storage System Based Upon a Distributed Hashing Mechanism with Transactional Support and Failover Capability - A multipurpose storage system based upon a distributed hashing mechanism with transactional support and failover capability is disclosed. According to one embodiment, a system comprises a client system in communication with a network, a secondary storage system in communication with the network, and a supervisor system in communication with the network. The supervisor system assigns a unique identifier to a first node system and places the first node system in communication with the network in a location computed by using hashing. The client system stores a data object on the first node system. | 06-24-2010 |
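Placement "computed by using hashing" is commonly done by hashing the supervisor-assigned identifier onto a ring; the following sketch assumes SHA-1, a 16-bit ring, and a clockwise-successor rule, none of which is specified by the abstract:

```python
import hashlib

def place_node(identifier, ring_size=2**16):
    """Hash an identifier (for nodes, the supervisor-assigned unique ID)
    to a deterministic position on the ring."""
    digest = hashlib.sha1(identifier.encode()).hexdigest()
    return int(digest, 16) % ring_size

def responsible_node(key, node_positions):
    """Store a data object on the first node clockwise from the key's hash
    (consistent-hashing style successor rule, assumed here)."""
    h = place_node(key)
    candidates = sorted(node_positions.items(), key=lambda kv: kv[1])
    for node, pos in candidates:
        if pos >= h:
            return node
    return candidates[0][0]   # wrap around the ring
```

Under such a scheme a client can locate the node holding an object from the key alone, without asking the supervisor, which is the usual motivation for hashed placement.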
20100162036 | Self-Monitoring Cluster of Network Security Devices - A computing device may be joined to a cluster by discovering the device, determining whether the device is eligible to join the cluster, configuring the device, and assigning the device a cluster role. A device may be assigned to act as a cluster master, backup master, active device, standby device, or another role. The cluster master may be configured to assign tasks, such as network flow processing to the cluster devices. The cluster master and backup master may maintain global, run-time synchronization data pertaining to each of the network flows, shared resources, cluster configuration, and the like. The devices within the cluster may monitor one another. Monitoring may include transmitting status messages comprising indicators of device health to the other devices in the cluster. In the event a device satisfies failover conditions, a failover operation to replace the device with another standby device, may be performed. | 06-24-2010 |
20100180147 | APPARATUS, SYSTEM, AND METHOD FOR LINK MAINTENANCE - An apparatus, system, and method are disclosed for link maintenance. A plurality of state machines operate a plurality of first links between data management nodes with each first link in an online state. A transition module transitions the plurality of first links from the online state to a degraded state and from the online state to an offline pending state in response to an offline request. The transition module further transitions the plurality of first links from the degraded state to an online pending state when a degraded link time interval expires and from the offline pending state to an offline state if all pending tasks on the plurality of first links are completed. The transition module further transitions the plurality of first links from the online pending state to the online state if each first link is validated. | 07-15-2010 |
20100180148 | TAKE OVER METHOD FOR COMPUTER SYSTEM - A failover method is proposed for taking over a task performed on an active server to a backup server, even when the active server and the backup server have different hardware configurations. The method for making a backup server take over the task when a fault occurs on an active server comprises the steps of: acquiring configuration information on the hardware in the active server and the backup server; acquiring information relating the hardware in the backup server to the hardware in the active server; selecting a backup server to take over the task that is executed on the active server where the fault occurred; creating logical partitions on the selected backup server; and taking over, in the logical partitions created on the selected backup server, the task executed on the active server's logical partitions. | 07-15-2010 |
20100185894 | SOFTWARE APPLICATION CLUSTER LAYOUT PATTERN - In forming a cluster of processors and applications among a plurality of processors connected in a network, embodying a pair of cluster nodes, as applications, in each server/system and arranging for communication between them in a ring or tiered-ring configuration of servers/systems provides network status monitoring and failure recovery in a highly redundant and flexible manner at increased speed, without requiring separate communication links for monitoring and control or redundant hardware such as a so-called "hot standby" processor in each server/system. | 07-22-2010 |
20100199122 | Cache Validating SCIT DNS Server - A cache validating SCIT-DNS Server including a server cluster, a cache copy, a controller and a validation module. Each of the servers in the server cluster uses a DNS mapping cache which maps DNS name(s) to record entry(ies). The cache copy maintains an image of DNS mapping cache(s). The controller manages the state of the servers; states include a live spare state, an exposed state, a quiescent state, and a self-cleansing state. The validation module validates DNS entries using a retriever module and a comparison module. The retriever module retrieves an independent record entry associated with a selected DNS name from an external DNS resolver. The comparison module compares the independent record entry retrieved by the retriever module with the record entry associated with the selected DNS name residing in the cache copy. The validation module may cause server(s) to take an affirmative action in response to detected validation error(s). | 08-05-2010 |
20100199123 | Distributed Storage of Recoverable Data - A system, method, and computer program product replace a failed node storing data relating to a portion of a data file. An indication of a new storage node to replace the failed node is received at each of a plurality of available storage nodes. The available storage nodes each contain a plurality of shares generated from a data file. These shares may have been generated based on pieces of the data file using erasure coding techniques. A replacement share is generated at each of the plurality of available storage nodes. The replacement shares are generated by creating a linear combination of the shares at each node using random coefficients. The generated replacement shares are then sent from the plurality of storage nodes to the indicated new storage node. These replacement shares may later be used to reconstruct the data file. | 08-05-2010 |
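The replacement-share step this abstract describes — each available node combining its locally held shares with random coefficients and sending the result to the new node — can be sketched as a random linear combination over a finite field. The prime modulus and function name below are illustrative assumptions; the patent does not specify the field or coefficient distribution used.

```python
import random

P = 2**31 - 1  # a prime field modulus for coefficient arithmetic (assumed)

def replacement_share(local_shares, rng=random):
    """Combine a node's local shares with random coefficients.

    Each share is a list of field symbols; the returned vector is a
    random linear combination of all local shares, suitable for sending
    to the replacement node.
    """
    coeffs = [rng.randrange(1, P) for _ in local_shares]
    length = len(local_shares[0])
    combined = [0] * length
    for c, share in zip(coeffs, local_shares):
        for i, sym in enumerate(share):
            combined[i] = (combined[i] + c * sym) % P
    return combined
```

Because each replacement share is a linear combination of erasure-coded shares, it remains a valid codeword for later reconstruction of the data file, which is the property the abstract relies on.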
20100205480 | Restarting Networks - A method of restarting a network ( | 08-12-2010 |
20100205481 | Method and Apparatus for Detecting and Handling Peer Faults in Peer-to-Peer Network - A method and apparatus for detecting and handling peer faults in a Peer-to-Peer network are disclosed. A peer in the P2P network receives a diagnosis request message; then detects whether a peer is faulty according to a preset fault detection method; and sends a diagnosis response message to a source peer that constructs the diagnosis request message or a peer specified by the diagnosis request message, where the diagnosis response message carries a detection result, and the detection result carries information about the faulty peer if the peer is detected as faulty. Through the technical solution under the present invention, the faults of the peers on the forwarding path in the P2P network can be detected. | 08-12-2010 |
20100211817 | COMMON CHRONICS RESOLUTION MANAGEMENT - Systems and methods for managing problems that are determined to be chronic problems with network devices or circuits are disclosed. The systems and methods receive data indicating a problem with a network device or circuit and determine based on the data a first action to be performed on the network device or circuit. Upon determining that a recurring problem exists for the network device or circuit, a rule set is used to determine if the data indicates a chronic problem. Upon determining that a chronic problem exists for the network device or circuit, the rule set is used to determine a monitoring period for the network device or circuit. Further, within the monitoring period a performance indicator that indicates that the network equipment or circuit is performing acceptably or unacceptably is used to determine further actions for the network device or circuit. | 08-19-2010 |
20100211818 | Audiovisual distribution system for playing an audiovisual piece among a plurality of audiovisual devices connected to a central server through a network - An audiovisual distribution system includes a central server and a plurality of audiovisual units. Each unit includes structure for interactively communicating with the user for selecting a piece or a menu, a payment device, a computer network card, and a permanent semiconductor memory containing a multitask operating system comprising at least a hard disc access management task. The order for performing a selected piece is processed as a hard disc sequential access task. The hard disc is declared as a peripheral corresponding to the network card of the unit, enabling a request to be sent through the network to the server for processing. | 08-19-2010 |
20100218033 | ROUTER SYNCHRONIZATION - Example systems and methods associated with router synchronization are described. One example method includes reducing a likelihood that a first network device will be favored over a peer device as a router. This likelihood may be increased after the first network device has received a threshold amount of routing information from the peer device. This may allow the first network device to begin performing non-routing related tasks after it starts up without causing interruption of data streams for which the first network device does not have current routing information. | 08-26-2010 |
20100218034 | Method And System For Providing High Availability SCTP Applications - A method and system for providing high availability services to SCTP applications is disclosed. In one embodiment, a high availability (HA) server system includes an active server and a standby server with a primary redundancy module and a secondary redundancy module, respectively, which are operable for performing a method including forming a control channel between the active server and the standby server, forwarding IP addresses of the active server and the standby server to a client device when an association between the client device and the active server is established, synchronously mirroring a state of a SCTP stack and a state of an application of the active server to the standby server using the control channel, and servicing the client device using the standby server based on the state of the SCTP stack and the state of the application if a failure of the active server is detected. | 08-26-2010 |
20100218035 | Self-testing and -repairing fault-tolerance infrastructure for computer systems - ASICs or like fabrication-preprogrammed hardware provide controlled power and recovery signals to a computing system that is made up of commercial, off-the-shelf components—and that has its own conventional hardware and software fault-protection systems, but these are vulnerable to failure due to external and internal events, bugs, human malice and operator error. The computing system preferably includes processors and programming that are diverse in design and source. The hardware infrastructure uses triple modular redundancy to test itself as well as the computing system, and to remove failed elements—powering up and loading data into spares. The hardware is very simplified in design and programs, so that bugs can be thoroughly rooted out. Communications between the protected system and the hardware are protected by very simple circuits with duplex redundancy. | 08-26-2010 |
20100218036 | PUSHING DOCUMENTS TO WIRELESS DATA DEVICES - A system pushes documents to one or more wireless data devices. The system receives a push request from a user to push a specified document to one or more identified wireless data devices. The system then constructs a wireless gateway server request for each identified wireless data device, and the document is subsequently pushed to the devices. | 08-26-2010 |
20100223492 | NODE FAILURE DETECTION SYSTEM AND METHOD FOR SIP SESSIONS IN COMMUNICATION NETWORKS - The present invention relates to a failure detection method and system operating at the session control layer, preferably within an IMS/SIP architecture, which monitors the status of an adjacent node with the aid of a timer mechanism that sets a heartbeat rate associated with that adjacent node. Monitoring of a communication session takes place by monitoring the liveliness of the nodes handling the session. | 09-02-2010 |
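The per-node timer mechanism this abstract describes — a heartbeat rate associated with each adjacent node, used to judge its liveliness — might be sketched as follows. The two-beat grace period and class shape are illustrative assumptions, not the patented mechanism.

```python
import time

class NodeMonitor:
    """Track the liveliness of one adjacent node with its own timer.

    The timer is armed for two heartbeat intervals (an assumed grace
    period); each received heartbeat re-arms it.
    """
    def __init__(self, heartbeat_rate):
        self.interval = 1.0 / heartbeat_rate          # seconds per beat
        self.deadline = time.monotonic() + 2 * self.interval

    def heartbeat(self):
        """Called when a heartbeat arrives from the adjacent node."""
        self.deadline = time.monotonic() + 2 * self.interval

    def alive(self):
        """The node is considered live until its timer expires."""
        return time.monotonic() < self.deadline
```

A session-layer monitor would keep one such timer per node handling the session and declare the session at risk when any of them expires.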
20100223493 | Method for the Consistent Provision of Configuration Data in an Industrial Automation System Comprising a Plurality of Networked Control Units, and Industrial Automation System - For the consistent provision of configuration data in an industrial automation system comprising a plurality of networked control units, components of a service are combined by a local service configuration unit using a standard configuration interface to form a service. Services are configured by configuration data and activated, where the configuration data comprise information relating to the attribution of services to control units and dependencies between services. The configuration data are accepted from a control and monitoring unit in the industrial automation system by a system configuration service, checked and transmitted to destination control units. The transmitted configuration data are checked by local service configuration units associated with the destination control units for changes in comparison with previously used configuration data. The local service configuration units use detected changes in the configuration data to ascertain lists of operations for performing configuration changes, where the lists are optimized to minimize service downtimes. | 09-02-2010 |
20100223494 | SYSTEM AND METHOD FOR PROVIDING IP PBX SERVICE - A telecommunications system has been developed that includes an on-premise IP PBX and an offsite hosted IP PBX. The system includes logic that automatically and/or manually migrates system configuration data from the on-premise IP PBX to the offsite IP PBX during a failure of the on-premise IP PBX. | 09-02-2010 |
20100229026 | Method and Apparatus for Cluster Data Processing - A cluster data processing method and system based on unique-identity control, without requiring a continuous network connection between the servers in the cluster and an external computer. The cluster sends a first data containing a controlling identity record to the external computer. The controlling identity record includes a unique identity and control information. A load-balancing device of the cluster receives from the external computer a second data, which contains a controlling identity record corresponding to that of the first data. The cluster routes the second data according to the control information in the controlling identity record of the second data. The disclosed method and system may help avoid the overload of server resources and prevent the low performance caused by a continuous network connection that would otherwise have to be maintained between the cluster and the external computer. | 09-09-2010 |
20100229027 | IMS RECOVERY AFTER HSS FAILURE - The present invention aims to provide means and methods for recovery of an IP Multimedia Subsystem 'IMS' where a Home Subscriber Server 'HSS' holding subscriber data for subscribers of the IMS has suffered a restart. A first method of recovery is applied after detecting an HSS restart, upon receiving a registration from a given subscriber or an invitation from another subscriber to communicate with the given subscriber. A second method of recovery is applied after detecting an HSS restart, upon receiving a request from a given subscriber at an S-CSCF previously assigned to serve the given subscriber in the IMS. The first and second methods may be applied in the IMS separately or in cooperation, on a best-effort basis, ensuring that whichever event occurs first (a registration from a given subscriber, an invitation to communicate with a given subscriber, or a request from a given subscriber at a previously assigned S-CSCF), the actions triggered by the first or second method achieve the recovery of the IMS without requiring further actions from the other method. | 09-09-2010 |
20100229028 | INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, AND CONTROL METHOD THEREFOR - A computer unit includes computer components each having a processor and a computer component controller for controlling the operations of the computer components by communicating a control signal with the computer components through a radio transmission path. Another computer unit has a similar configuration. At an instruction from an external control terminal which has recognized the fault of another computer component controller provided for the another computer unit, the computer component controller concurrently controls the operations of computer components provided for the another computer unit by communicating a control signal through a radio transmission path. | 09-09-2010 |
20100235676 | METHOD AND SYSTEM FOR PROCESSING EMAIL DURING AN UNPLANNED OUTAGE - The method and system of the present invention provide an improved technique for processing email during an unplanned outage. Email messages are redirected from the primary server to a secondary server during an unplanned outage such as, for example, a natural disaster. A notification message is sent to users alerting them that their email messages are available on the secondary server by, for example, Internet access. After the unplanned outage ends, email messages received during the outage are synchronized into the users' standard email application. | 09-16-2010 |
20100241894 | DYNAMIC ADDITION OF REDUNDANT NETWORK IN DISTRIBUTED SYSTEM COMMUNICATIONS - Disclosed is a computer implemented method and apparatus for establishing a redundant channel from an application to a peer data processing system. The interrupt-driven hot standby program receives, through the operation of a data processing system, a communication channel status corresponding to an application. The application has a first channel using local access across a first physical conduit to a first switch. In addition the communication channel status is, in part, an interrupt. The interrupt-driven hot standby program determines whether the redundant channel is present. The redundant channel is configured to use a second physical conduit distinct from the first physical conduit for traffic of the application. Responding to a determination that the redundant channel is present, the interrupt-driven hot standby program determines whether the redundant channel is configured to use the second physical conduit as local access to a redundant switch, wherein the redundant switch is not the first switch. The interrupt-driven hot standby program responds to a determination that the redundant channel is configured to use the second physical conduit by updating a communication channel list to include at least one attribute of the redundant channel, wherein the communication channel list is resident in the data processing system. | 09-23-2010 |
20100241895 | METHOD AND APPARATUS FOR REALIZING APPLICATION HIGH AVAILABILITY - A method and apparatus for realizing application high availability. The application is installed on both a first node and a second node, the first node being used as an active node, and the second node being used as a passive node. The method includes: monitoring access operations to files by an application during its execution on the active node; replicating the monitored updates to the file by the application from the active node to a storage device accessible to the passive node if the application performs updates to a file during the access operations; sniffing the execution of the application on the active node; and switching the active node to the second node and initiating the application on the second node in response to sniffing a failure in the execution of the application on the active node. | 09-23-2010 |
20100241896 | Method and System for Coordinated Multiple Cluster Failover - Hyperclusters are a cluster of clusters. Each cluster has associated with it one or more resource groups, and independent node failures within the clusters are handled by platform specific clustering software. The management of coordinated failovers across dependent or independent resources running on heterogeneous platforms is contemplated. A hypercluster manager running on all of the nodes in a cluster communicates with platform specific clustering software regarding any failure conditions, and utilizing a rule-based decision making system, determines actions to take on the node. A plug-in extends exit points definable in non-hypercluster clustering technologies. The failure notification is passed to other affected resource groups in the hypercluster. | 09-23-2010 |
20100251007 | FILE TRANSPORT IN DYNAMIC ENVIRONMENTS - Systems and methods are disclosed that include transferring data using a client. The client experiences a disruption event and determines the nature of the disruption event. In addition, these systems and methods may include adjusting for the disruption event. The adjustment for the disruption event is performed automatically using at least one processor in the client, and the transmission of data resumes after the adjustment is made. | 09-30-2010 |
20100251008 | Decrypting Load Balancing Array System - A decrypting load balancing array system uses a Pentaflow approach to network traffic management that extends across an array of Decrypting Load Balancing Array (DLBA) servers sitting in front of back end Web servers. One of the DLBA servers acts as a scheduler for the array through which all incoming requests are routed. The scheduler routes and load balances the traffic to the other DLBA servers (including itself) in the array. Each DLBA server routes and load balances the incoming request packets to the appropriate back end Web servers. Responses to the requests from the back end Web servers are sent back to the DLBA server which forwards the response directly to the requesting client. | 09-30-2010 |
20100262860 | LOAD BALANCING AND HIGH AVAILABILITY OF COMPUTE RESOURCES - Compute resources of multiple resource cards are assigned to compute resource pools. Each compute resource pool is typically associated with a specific service (e.g., VoIP, video service, deep packet inspection, etc). Compute resource groups are created in each compute resource pool and are allocated one or more compute resources of that compute resource pool. Those compute resources in a given resource pool that are not allocated to a compute resource group are set as backup compute resources. Upon a failure of a compute resource in a compute resource pool that includes backup compute resources, a backup compute resource is selected and takes over the function of the failed compute resource. Upon a failure of a compute resource in a compute resource group of a compute resource pool that does not include a backup compute resource, the traffic is load balanced across the remaining compute resources of that compute resource group. | 10-14-2010 |
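The pool/group/backup structure this abstract describes — unallocated resources in a pool become backups, a backup takes over on failure, and traffic is load balanced across survivors when no backup exists — can be sketched as a small data structure. Class and method names are illustrative assumptions.

```python
class ComputePool:
    """A compute resource pool associated with one service (e.g. VoIP)."""
    def __init__(self, resources):
        self.groups = {}                 # group name -> allocated resources
        self.backups = list(resources)   # unallocated resources act as backups

    def allocate(self, group, n):
        """Create/grow a compute resource group from the pool's free resources."""
        taken, self.backups = self.backups[:n], self.backups[n:]
        self.groups.setdefault(group, []).extend(taken)

    def fail(self, group, resource):
        """Handle a resource failure within a group.

        If the pool holds a backup resource, it takes over the failed
        resource's function; otherwise the group simply shrinks and its
        traffic is load balanced across the remaining members.
        """
        members = self.groups[group]
        members.remove(resource)
        if self.backups:
            members.append(self.backups.pop(0))  # backup takes over
        return members
```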
20100281295 | Method for Fast Connectivity Fault Management [CFM] of a Service-Network - This invention is related to a method for Fast Connectivity Fault Management (CFM) of a service-network in the realm of Carrier Ethernet, comprises steps of: learning spanning tree topology of the service-network, exchanging Fast Connectivity Check Messages (Fast-CCM)s between the adjacent service-nodes of the tree, terminating the Fast-CCMs so received, to learn the fault, in the service-network, upon failure to receive a Fast-CCM through a service-port, and pro-actively notifying the fault by service nodes on either side of the faulty service-network. | 11-04-2010 |
20100281296 | FAULT TOLERANT ROUTING IN A NON-HOT-STANDBY CONFIGURATION OF A NETWORK ROUTING SYSTEM - Methods and systems for facilitating fault tolerance in a non-hot-standby configuration of a network routing system are provided. According to one embodiment, a failover method is provided. A fault manager executing on a control blade of multiple server blades of a network routing system actively monitors an active processing engine of multiple processing engines within the network routing system. Responsive to detecting a fault associated with the active processing engine, the active processing engine is dynamically replaced with a non-hot-standby processing engine of the multiple processing engines by (i) determining one or more software contexts that were associated with the active processing engine prior to detection of the fault, and (ii) creating one or more replacement software contexts within the non-hot-standby processing engine corresponding to the one or more software contexts. | 11-04-2010 |
20100287405 | Method and apparatus for internetworking networks - Methods and apparatuses are disclosed for seamlessly combining an access ring aggregation network, e.g., a G.8032 network, and a core network, e.g., a Multi-Protocol Label Switching (MPLS) network. A link status is monitored between an interworking node and at least one peer node in a first network at an interface between the first network and a second network. Connectivity is maintained between the interworking node and the other interworking node(s) via the second network. Communications between the first and second networks are supported via at least one of the interworking nodes. Ring communications are supported among the interworking node, the other interworking node(s), and the peer node(s). End-to-end integration of two disparate networks according to presently disclosed techniques provides network designers and customers with flexibility in designing, operating, and maintaining networks. | 11-11-2010 |
20100287406 | NODE APPARATUS, COMMUNICATION SYSTEM, AND METHOD OF PROVIDING SERVICE - The node apparatus of the present invention is a node apparatus ( | 11-11-2010 |
20100293408 | LINK AGGREGATION PROTECTION - A method includes detecting, by a first network device, a configuration problem at a second network device, where the first and second network devices are associated with a link aggregation group (LAG) coupling the first and second network devices. The method also includes de-activating, by the first network device, one or more links in the LAG in response to detecting the configuration problem. The method further comprises maintaining at least one of the links in the LAG as an active link and allowing traffic to be forwarded on the active link in the LAG. | 11-18-2010 |
20100293409 | REDUNDANT CONFIGURATION MANAGEMENT SYSTEM AND METHOD - Upon receipt of an availability requirement for a computer system under management, a redundant configuration management system determines the placement of processing programs on the physical servers of the computer system so as to satisfy the availability requirement, with reference to system configuration information indicating the configuration of the computer system and restriction information limiting the number of processing programs each physical server can run. | 11-18-2010 |
20100299551 | MESSAGE PROCESSING METHOD, APPARATUS AND IP COMMUNICATION SYSTEM BASED ON THE SIP PROTOCOL - The present invention provides a message processing method and apparatus based on the SIP protocol, and an IP communication system, wherein the method comprises: a step of processing messages from a core network, in which a SIP proxy server is used as a uniform access interface for SIP messages from the core network, judging the received SIP messages and distributing them to corresponding application servers for processing; and a step of processing messages from the application servers, in which the SIP proxy server is used as a uniform access interface for SIP messages from the application servers, distributing the received SIP messages, according to instructions from the application servers, to the corresponding core network device for processing. Thus, with the method and system of the present invention, when the processing capability of one application server is insufficient, capacity can be increased by adding application servers without any modification of the configuration by the core network device. | 11-25-2010 |
20100299552 | METHODS, APPARATUS AND COMPUTER READABLE MEDIUM FOR MANAGED ADAPTIVE BIT RATE FOR BANDWIDTH RECLAMATION - A method, apparatus and computer program product for managing content sessions within a network is presented. The systems disclosed herein are able to detect a requirement to modify bandwidth usage within the network either proactively or reactively. In response, example embodiments apply an adaptive bit rate adjustment technique to the content sessions to adjust a data rate associated with each content session according to the requirement to modify bandwidth usage within the network. Example embodiments also then apply a quality of service adjustment technique to the content sessions to adjust a bandwidth allocation assigned between a client and server based upon the adaptive bandwidth adjustment technique. Application of the adaptive bit rate and quality of service adjustment techniques may be policy based. Example embodiments also may monitor a plurality of servers supporting content sessions, detect a failure at a first server and move content sessions to a second server. | 11-25-2010 |
20100299553 | Cache data processing using cache cluster with configurable modes - Processing cache data includes sending a cache processing request to a master cache service node in a cache cluster that includes a plurality of cache service nodes, the cache cluster being configurable in an active cluster configuration mode wherein the plurality of cache service nodes are all in working state and a master cache service node is selected among the plurality of cache service nodes, or in a standby cluster configuration mode, wherein the master cache service node is the only node among the plurality of cache service nodes that is in working state. It further includes waiting for a response from the master cache service node, determining whether the master cache service node has failed; and in the event that the master cache service node has failed, selecting a backup cache service node. | 11-25-2010 |
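The master-failure handling this abstract describes — selecting a backup cache service node, with the cluster in either an active mode (all nodes in working state) or a standby mode (only the master in working state) — might look roughly like the sketch below. The dictionary layout and promotion policy are illustrative assumptions.

```python
def handle_master_failure(nodes, mode):
    """Select a backup cache service node after the master fails.

    In "active" mode every node is already in working state, so any
    surviving node can be promoted directly; in "standby" mode the
    chosen standby node must first be brought into working state.
    Returns the new master, or None if no node survives.
    """
    survivors = [n for n in nodes if n["state"] != "failed"]
    if not survivors:
        return None
    backup = survivors[0]                 # assumed selection policy
    if mode == "standby":
        backup["state"] = "working"       # bring the standby node up
    backup["role"] = "master"
    return backup
```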
20100306572 | Apparatus and method to facilitate high availability in secure network transport - Embodiments described herein automatically detect, repair, and recover IPsec tunnels after failures of transport gear (L2/L3 switches) as well as of IPsec gateway components. Load balancing is also an integral part of the approach. When a failure is repaired, the architecture in various embodiments re-establishes load balance and high availability automatically at L2 and L3, and preserves security during the switch-over and recovery process. | 12-02-2010 |
20100306573 | FENCING MANAGEMENT IN CLUSTERS - Apparatus, systems, and methods may operate to detect a failure in a failed one of a plurality of nodes included in a cluster, and to fence a portion of the plurality of nodes, including the failed one. Membership in the portion may be determined according to an aggregated value of weighted values assigned to resources and/or services associated with the cluster. Additional apparatus, systems, and methods are disclosed. | 12-02-2010 |
20100306574 | COMMUNICATION METHOD, COMMUNICATION SYSTEM, NODE, AND PROGRAM - The processing load imposed on a node by path control messages due to a link fault is reduced, normal routing operation is assured, and stable continuity of the network is realized. A node having received a path control message transmits it to the adjacent node that sent the message and to one or more adjacent nodes on an alternate path. | 12-02-2010 |
20100318835 | BISECTIONAL FAULT DETECTION SYSTEM - An apparatus, program product and method logically divide a group of nodes and causes node pairs comprising a node from each section to communicate. Results from the communications may be analyzed to determine performance characteristics, such as bandwidth and proper connectivity. | 12-16-2010 |
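The bisection test this abstract describes — logically dividing the node group into two sections, pairing one node from each section, and analyzing the communication results — can be sketched in a few lines. The pairing order and bandwidth threshold are illustrative assumptions.

```python
def bisection_pairs(nodes):
    """Logically divide the node group into two sections and pair each
    node in the first section with one node in the second section."""
    half = len(nodes) // 2
    return list(zip(nodes[:half], nodes[half:]))

def suspect_links(pairs, bandwidths, min_bandwidth):
    """Flag node pairs whose measured bandwidth fell below a threshold,
    indicating a performance or connectivity problem on that path."""
    return [p for p, bw in zip(pairs, bandwidths) if bw < min_bandwidth]
```

Repeating the test with different logical divisions narrows a flagged result down to a specific node or link.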
20100318836 | MONITORING AND HEALING A COMPUTING SYSTEM - A method, system, and media for monitoring and healing a computing system. Monitoring agents are deployed to a computing system to be monitored. The monitoring agents collect performance and non-performance counter data of computing devices in the computing system. The data include any recordable information about the monitored computing system. The data is stored in a performance database. A master controller monitors the data for an occurrence of an alert condition that indicates degradation in the health of the computing system. The health of the computing system includes the health of the system, individual components therein, and interactions with other systems. The master controller performs a resolution process to resolve issues causing degradation of the computing system. The collection of data and the monitoring thereof is customizable for an individual computing device, for a cluster of computing devices, or for the computing system as a whole. | 12-16-2010 |
20100318837 | Failure-Model-Driven Repair and Backup - A predictive failure model is used to generate a failure prediction associated with a node. A repair or backup action may also be determined to perform on the node based on the failure prediction. | 12-16-2010 |
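The abstract's mapping from a failure prediction to a repair or backup action might be sketched as a simple threshold policy. The thresholds and action names are illustrative assumptions; the patent's predictive model itself is not specified here.

```python
def choose_action(failure_probability, repair_threshold=0.5, backup_threshold=0.2):
    """Map a node's predicted failure probability to an action.

    Thresholds are assumed for illustration: a high probability triggers
    a proactive repair, a moderate one triggers a backup of the node's
    data, and a low one requires no action.
    """
    if failure_probability >= repair_threshold:
        return "repair"
    if failure_probability >= backup_threshold:
        return "backup"
    return "none"
```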
20100318838 | METHOD AND COMPUTER SYSTEM FOR FAILOVER - In a computer system in which plural servers are connected with an external disk device via a network, each server incorporates a logical partition module for configuring at least one logical partition in the server, and the operating system stored in each logical partition is booted from a boot disk of the external disk device. When a failure occurs in a working server and the task it was executing is taken over by another server, the failover operation is performed only for the logical partition affected by the failure. | 12-16-2010 |
20100325473 | REDUCING RECOVERY TIME FOR BUSINESS ORGANIZATIONS IN CASE OF DISASTERS - An aspect of the present invention reduces the recovery time for business organizations in case of disasters. In one embodiment, a disaster recovery system containing a primary site and a backup site (implemented as a cluster) is maintained. Application instances are executed in both the primary site and the backup site, with the number of instances executed on the backup site being fewer than that executed on the primary site. During normal operation, user requests received are processed using only the instances executing in the primary site, while the instances executing in the backup site are used in a standby state. On identifying that a disaster has occurred, the user requests received immediately after identification of the disaster are processed using only the instances executing in the backup site. The cluster at the backup site is then scaled out to add application instances until a desired level/percentage is achieved. | 12-23-2010 |
20100325474 | SYSTEMS AND METHODS FOR FAILOVER BETWEEN MULTI-CORE APPLIANCES - The present disclosure presents systems and methods for maintaining operation of a first multi-core appliance | 12-23-2010 |
20100325475 | Digital Network Quality Control System Utilizing Feedback Controlled Flexible Waveform Shape for the Carrier Signal - A digital network quality control system and method utilizing feedback controlled flexible waveform shape for the carrier signal is provided. The system and method provides self-analysis and feedback to a variable waveform to increase network reliability and speed by modifying the shape of the waveform itself based on a self analysis of the waveform. | 12-23-2010 |
20100325476 | SYSTEM AND METHOD FOR A DISTRIBUTED OBJECT STORE - An improved system and method for flexible object placement and soft-state indexing of objects in a distributed object store is provided. A distributed object store may be provided by a large number of system nodes operably coupled to a network. Each system node may include an access module for communicating with a client, an index module for building an index of a replicated data object, a data module for storing a data object on a computer readable medium, and a membership and routing module for detecting the configuration of operable nodes in the distributed system. Upon failure of an index node, the failure may be detected at other nodes, including those that store replicas of the object. These nodes may then send new index rebuilding requests to a different node, which may rebuild the index for servicing any access request to the object. | 12-23-2010 |
20110004781 | PROVISIONING AND REDUNDANCY FOR RFID MIDDLEWARE SERVERS - The present invention provides for the provisioning and redundancy of RFID middleware servers. Middleware servers can be automatically provisioned and RFID device/middleware server associations can be automatically updated. Some implementations of the invention provide for automatic detection of middleware server malfunctions. Some such implementations provide for automated provisioning and automated updating of RFID device/middleware server associations, whether a middleware server is automatically brought online or is manually replaced. Changes and reassignments of the RFID device populations may be accommodated. | 01-06-2011 |
20110004782 | FAULT-TOLERANT COMMUNICATIONS IN ROUTED NETWORKS - A method for providing fault-tolerant network communications between a plurality of nodes for an application, including providing a plurality of initial communications pathways over a plurality of networks coupled between the plurality of nodes, receiving a data packet on a sending node from the application, the sending node being one of the plurality of nodes, the data packet being addressed by the application to an address on one of the plurality of nodes, and selecting a first selected pathway for the data packet from among the plurality of initial communications pathways where the first selected pathway is a preferred pathway. | 01-06-2011 |
20110004783 | FAULT-TOLERANT COMMUNICATIONS IN ROUTED NETWORKS - A method for providing fault-tolerant network communications between a plurality of nodes for an application, including providing a plurality of initial communications pathways over a plurality of networks coupled between the plurality of nodes, receiving a data packet on a sending node from the application, the sending node being one of the plurality of nodes, the data packet being addressed by the application to an address on one of the plurality of nodes, and selecting a first selected pathway for the data packet from among the plurality of initial communications pathways where the first selected pathway is a preferred pathway. | 01-06-2011 |
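The preferred-pathway selection described in the two abstracts above reduces to choosing the best live route among redundant networks, with fallback when the preferred link is down. Interface names and costs here are hypothetical:

```python
def select_pathway(pathways, link_up):
    """Pick the preferred (lowest-cost) live pathway; fail over otherwise.

    pathways: list of (name, cost) tuples over redundant networks.
    link_up:  dict mapping pathway name -> bool liveness.
    """
    live = [p for p in sorted(pathways, key=lambda p: p[1]) if link_up[p[0]]]
    if not live:
        raise RuntimeError("no live pathway to destination")
    return live[0][0]

paths = [("eth0", 1), ("eth1", 5), ("wlan0", 10)]
print(select_pathway(paths, {"eth0": True, "eth1": True, "wlan0": True}))   # eth0
print(select_pathway(paths, {"eth0": False, "eth1": True, "wlan0": True}))  # eth1
```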
20110010578 | CONSISTENT AND FAULT TOLERANT DISTRIBUTED HASH TABLE (DHT) OVERLAY NETWORK - A peer-to-peer (P2P) system is described herein which has a distributed hash table (DHT) overlay network containing multiple DHT nodes, each of which has a complete distributed DHT hash table identifying a specific range of hashes for each of the DHT nodes, such that when any one of the DHT nodes receives a query asking for a specific key, the queried DHT node consults its respective DHT table to determine which one of the DHT nodes is storing the specific key and forwards the query in one network hop to the particular DHT node which is storing the specific key. The P2P system can also implement one or more data-related mechanisms including a bootstrap mechanism, a replication mechanism, an update mechanism and a recovery mechanism which enable fault-tolerant DHT nodes. | 01-13-2011 |
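The one-hop lookup works because every node holds the complete hash-range table and can resolve any key's owner locally. A minimal sketch, assuming an even split of a SHA-1 hash space across hypothetical node names:

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster

def key_hash(key):
    """Hash a string key into the 160-bit SHA-1 space."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

def owner(key):
    """Every node holds the full range table (here: an even split of the
    hash space), so any node resolves the owner in a single hop."""
    space = 2 ** 160
    return NODES[key_hash(key) * len(NODES) // space]

# Any queried node computes the same owner and forwards the query directly.
print(owner("user:42"))
```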
20110010579 | Method for detecting errors and supporting reconfiguration decisions in mobile radio networks comprising reconfigurable terminals, and corresponding network elements and components - A respective agent platform is provided in the network elements, and producer-specific agents are installed either directly on the platform or by way of agent proxies of agent providers. The agents receive raw information on arising operating errors via a defined interface of the agent platform and, together with producer-specific information on the respective terminals or terminal types that is known only to the respective producer, form compressed decision information for evaluating error cases and/or optimizing reconfiguration decisions. The agents then provide this information to the network element or the network operator and/or the agent provider or the terminal producer via the defined interface. This leads to higher reliability of interoperability between terminals and network elements in mobile radio networks comprising reconfigurable terminals. | 01-13-2011 |
20110016349 | REPLICATION IN A NETWORK ENVIRONMENT - A method for server replication in a network environment is provided. The primary server provides a first service to a client. If the first service involves interaction with a non-deterministic data source, the primary server performs the interaction and provides information about the interaction to a secondary server that is a replica of the primary server. The secondary server uses the information about the interaction to synchronize the secondary server with the primary server. | 01-20-2011 |
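The key idea of the abstract above is that only the primary performs the non-deterministic interaction; the secondary replays the recorded results rather than re-running it, so both replicas converge. A toy sketch (the `roll` request and the log format are invented for illustration):

```python
import random

class Primary:
    def __init__(self):
        self.log = []    # interaction log shipped to the secondary
        self.state = 0

    def handle(self, request):
        if request == "roll":             # non-deterministic data source:
            value = random.randint(1, 6)  # only the primary touches it
            self.log.append(("roll", value))
        else:
            value = int(request)
            self.log.append(("set", value))
        self.state = value
        return value

class Secondary:
    def __init__(self):
        self.state = 0

    def replay(self, log):
        """Apply the recorded results instead of re-running the
        interaction, so the replica matches the primary exactly."""
        for _, value in log:
            self.state = value

p, s = Primary(), Secondary()
p.handle("roll"); p.handle("7")
s.replay(p.log)
assert s.state == p.state
```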
20110022882 | Dynamic Updating of Failover Policies for Increased Application Availability - Mechanisms are provided for performing a failover operation of an application from a faulty node of a high availability cluster to a selected target node. The mechanisms receive a notification of an imminent failure of the faulty node. The mechanisms further receive health information from nodes of a local failover scope of a failover policy associated with the faulty node. Moreover, the mechanisms dynamically modify the failover policy based on the health information from the nodes of the local failover scope and select a node from the modified failover policy as a target node for failover of an application running on the faulty node to the target node. Additionally, the mechanisms perform failover of the application to the target node based on the selection of the node from the modified failover policy. | 01-27-2011 |
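Dynamically modifying a failover policy with fresh health reports, as the abstract above describes, can be sketched as filtering the static failover scope before picking a target. Node names, health scores, and the threshold are assumptions:

```python
def pick_failover_target(failover_scope, health, min_health=0.5):
    """Prune the static failover scope using current health reports,
    then choose the healthiest surviving node as the failover target."""
    viable = [(health.get(n, 0.0), n) for n in failover_scope
              if health.get(n, 0.0) >= min_health]
    if not viable:
        raise RuntimeError("no healthy failover target in scope")
    return max(viable)[1]

scope = ["node2", "node3", "node4"]                   # static policy order
reports = {"node2": 0.3, "node3": 0.9, "node4": 0.7}  # node2 is ailing too
print(pick_failover_target(scope, reports))  # node3
```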
20110022883 | Method for Voting with Secret Shares in a Distributed System - A replicated decentralized storage system comprises a plurality of servers that locally store disk images for locally running virtual machines as well as disk images, for failover purposes, for remotely running virtual machines. To ensure that disk images stored for failover purposes are properly replicated upon an update of the disk image on the server running the virtual machine, a hash of a unique value known only to the server running the virtual machine is used to verify the origin of update operations that have been transmitted by the server to the other servers storing replications of the disk image for failover purposes. If verified, the update operations are added to such failover disk images. To enable the replicated decentralized system to recover from a failure of the primary server, the master secret is subdivided into parts and distributed to other servers in the cluster. Upon a failure of the primary server, a secondary server receives a threshold number of the parts and is able to recreate the master secret and failover virtual machines that were running in the failed primary server. | 01-27-2011 |
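The threshold reconstruction of the master secret described above is classically done with Shamir secret sharing, sketched here over a prime field; the field size and share counts are illustrative, not the patent's parameters:

```python
import random

PRIME = 2 ** 127 - 1  # Mersenne prime field modulus

def split(secret, n=3, k=2):
    """Shamir (k-of-n) split: sample a degree-(k-1) polynomial with the
    secret as constant term, hand out one point per server."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x=0 recreates the master secret from
    any threshold-sized subset of shares."""
    total = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

secret = 123456789
shares = split(secret)
assert recover(shares[:2]) == secret  # any 2 of the 3 shares suffice
```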
20110035620 | Virtual Machine Infrastructure With Storage Domain Monitoring - A computing device monitors multiple hosts. A first host that does not have access to a data store is identified. A determination is made as to whether other hosts have access to the data store. When the other hosts do have access to the data store, it is determined that the first host is malfunctioning. A host error notification may then be sent to an administrator. | 02-10-2011 |
20110041002 | SYSTEM, METHOD, COMPUTER PROGRAM FOR MULTIDIRECTIONAL PATHWAY SELECTION - The present invention is directed at determining a network pathway between two end points within the network. The pathway to be used can be determined based on parameters monitored or measured at either or both of the two end points. The invention is also directed to managing the network pathway for supporting communication between the two end points at any given time. Because these communications may occur in both directions, the pathway and its management can likewise operate in both directions; the pathway selection thus operates on a bi-directional basis. The invention provides a solution for "bi-directional network pathway management". | 02-17-2011 |
20110047405 | System and Method for Implementing an Intelligent Backup Technique for Cluster Resources - Method and system for implementing a backup in a cluster comprising a plurality of interconnected nodes, at least one of the nodes comprising a cluster resource manager (CRM), and at least one of the nodes comprising a policy engine (PE), the PE maintaining at least one dependency associated with at least a first resource executing on at least one of the nodes. For example, the method comprises, receiving by the CRM a backup request for the first resource from an administrator; responsive to the request, updating by the CRM the cluster configuration; communicating by the CRM to the PE a cluster status and the updated configuration; providing by the PE to the CRM an instruction sequence for carrying out the backup, the instruction sequence based on the dependency associated with the first resource; and responsive to the instruction sequence, carrying out by the CRM the backup of the first resource. | 02-24-2011 |
20110047406 | SYSTEMS AND METHODS FOR SENDING, RECEIVING AND MANAGING ELECTRONIC MESSAGES - Systems and methods for electronic messaging may include receiving information from a sending device. The information may include a prioritization determination, initial receiving device destinations, alternate receiving device destinations, and a determination of a plurality of responses to message delivery failures from the sending device. Data may be attached to the message before encrypting and sending. Status information about the message may be received, provided to the sending device, and updated. If message delivery fails, the message may be sent to one or more alternate receiving device destinations. Status information about the message may again be received, provided to the sending device, and updated. Message and status information may be stored in a database. | 02-24-2011 |
20110055621 | DATA REPLICATION BASED ON CAPACITY OPTIMIZATION - A system and associated method for replicating data based on capacity optimization. A local node receives the data associated with a key. The local node within a local domain communicates with nodes of remote domains in a system through a communication network. Each domain has its own distributed hash table that partitions key space and assigns a certain key range to an owner node within the domain. For new data, the local node queries owner nodes of domains in the system progressively from the local domain to remote domains for a duplicate of the new data. Depending on a result returned by owner nodes and factors for replication strategies, the local node determines a replication strategy and records the new data in the local node pursuant to the replication strategy. | 03-03-2011 |
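The progressive local-then-remote duplicate probe described above can be sketched with one content-addressed store per domain: a new object is replicated locally only when no domain already holds it. The domain layout and names are hypothetical:

```python
import hashlib

# Hypothetical domains, ordered local-first; each maps content key -> data.
DOMAINS = [{"name": "local", "store": {}}, {"name": "remote", "store": {}}]

def content_key(data):
    """Content-addressed key for duplicate detection."""
    return hashlib.sha256(data).hexdigest()

def put(data):
    """Query owner nodes progressively (local domain first); only store
    a new replica when no duplicate exists anywhere in the system."""
    key = content_key(data)
    for domain in DOMAINS:              # local-first duplicate probe
        if key in domain["store"]:
            return domain["name"]       # dedup hit: reuse existing copy
    DOMAINS[0]["store"][key] = data     # miss: replicate into local domain
    return "stored-local"

DOMAINS[1]["store"][content_key(b"backup.img")] = b"backup.img"
print(put(b"new.img"), put(b"backup.img"))  # stored-local remote
```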
20110055622 | NETWORK SYSTEM AND NETWORK RELAY APPARATUS - A network system is provided. The network system includes: a first processing apparatus configured to provide a specific service; a second processing apparatus configured to provide the specific service, the first processing apparatus and the second processing apparatus having one identical address; a client apparatus configured to utilize the specific service; and a network relay apparatus connected directly or indirectly via interfaces to the first processing apparatus, the second processing apparatus, and the client apparatus, and configured to relay packet transmission between the client apparatus and the first or second processing apparatus. The network relay apparatus forwards a packet that is received via the interface connecting with the client apparatus and is destined for the shared address to whichever of the first and second processing apparatuses is currently enabled to provide the specific service. | 03-03-2011 |