Performance monitoring for fault avoidance

Subclass of:

714 - Error detection/correction and fault detection/recovery

714100000 - DATA PROCESSING SYSTEM ERROR OR FAULT HANDLING

714001000 - Reliability and availability

Entries

Document	Title - Abstract	Date
20080215927	Method of Monitoring the Correct Operation of a Computer - The present invention relates to computers that execute, in time-share mode and under the control of their operating systems, a number of separate and independent application programs. The present invention relates in particular to onboard computer networks of IMA type executing application programs written independently of the hardware specifications of the computers and not permanently resident in the computers. The method of the present invention consists in associating with the digital core of each computer of the network an independently operating monitoring state machine, and in having the monitoring state machine monitor the correct observance by the associated computer of the time sequencing of the tasks and memory partition allocations. Furthermore, the monitoring state machines can be configured to execute monitoring service applications of time-out or watchdog type to which the application programs executed by the computers of the network can subscribe.	09-04-2008
20080222457	ELECTRONIC DATA PROCESSING SYSTEM AND METHOD FOR MONITORING THE FUNCTIONALITY THEREOF - A method for monitoring the functionality of an EDP system, portions of which are monitored by respectively associated agents designed to evaluate errors and to send error messages, is intended to increase the operating security of the EDP system. Each agent is monitored by a simulated error being sent to the agent and the reaction of the agent being evaluated.	09-11-2008
20080235538	Techniques for generating a trace stream for a data processing apparatus - A data processing apparatus and method are provided for generating a trace stream. The data processing apparatus comprises logic for producing data elements, and trace logic for producing a stream of trace elements representative of at least some of the data elements. The trace logic has trace generation logic operable to generate trace elements for inclusion in the stream, and is further arranged to generate trace timing indicators for inclusion in the stream. Each trace timing indicator indicates the elapse of one or more processing timing intervals, the processing timing interval being a predetermined plurality of clock cycles.	09-25-2008
20080244330	APPARATUS, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR SEAMLESSLY INTEGRATING THERMAL EVENT INFORMATION DATA WITH PERFORMANCE MONITOR DATA - An apparatus, system and method of integrating performance monitor data with thermal event information are provided. A thermal event, in this case, is when the temperature of a chip within which is embedded a processor exceeds a user-configurable value while the processor is processing instructions and/or using storage devices that are being monitored. In any event, when the thermal event occurs, the temperature of the chip along with the performance monitor data is stored for future uses, which include performance and diagnostic analyses.	10-02-2008
20080250276	Memory growth detection - A process for monitoring memory growth of a software application includes measuring a baseline memory usage for processes operating on a computer system. A background memory logging script is then executed to log process memory usage. At some later time, a trending script is executed which outputs a list showing how memory usage has changed for each process operating on the system.	10-09-2008
20080256397	System and Method for Network Performance Monitoring and Predictive Failure Analysis - A method and system for detecting performance degradation of a plurality of monitored components in a networked storage system. Performance data is collected from the plurality of monitored components. Component statistics are generated from the collected performance data. Heuristics are applied to the generated component statistics to determine the likelihood of failure or degradation of each of the plurality of monitored components.	10-16-2008
20080256398	Using EMI signals to facilitate proactive fault monitoring in computer systems - A system that monitors electromagnetic interference (EMI) signals to facilitate proactive fault monitoring in a computer system is presented. During operation, the system receives EMI signals from one or more antennas located in close proximity to the computer system. The system then analyzes the received signals to proactively detect anomalies during operation of the computer system.	10-16-2008
20080256399	SOFTWARE EVENT RECORDING AND ANALYSIS SYSTEM AND METHOD OF USE THEREOF - A software service running in the background of an operating system and used by a user to record metadata and screen shots of the user interface screens in an operating system whenever errors occur in the operating system or in any application running on the operating system. The software service also manages the recorded data to ensure resources are used efficiently to minimize the use of storage space in the recording location or buffer. The software running in the background monitors, filters and logs programs and user actions, and if a problem occurs within the monitored software, a problem report can be created for a support team to analyze and include corresponding recorded data. The suggested selection of recorded data can be displayed and edited by the user.	10-16-2008
20080263410	METHOD AND APPARATUS FOR TESTING OF ENTERPRISE SYSTEMS - In a method of virtual user compensation by a test run system of an enterprise system, the termination of a virtual user is identified and a new virtual user is created to compensate for the terminated virtual user. The new virtual user is then assigned to the enterprise system. Rules associated with the conditions of virtual user termination indicate how to compensate for the terminated virtual users.	10-23-2008
20080263411	Test Instrument and System Responsive to Execution Time Data - Test instruments constituting an automatic test system are characterized in terms of execution time data. The execution time data is composed of a set of execution times. Each of the execution times is the time required for the test instrument to perform a respective testing operation. The test instruments additionally have the ability to communicate their respective execution time data to such recipients as others of the test instruments, the system controller and recipients outside the automatic test system. Additionally, such test instruments have the ability to communicate test results to at least one other of the test instruments and the ability to process test results received from at least one other of the test instruments. Such characterization, communication and processing allows a system integrator to devise execution time-dependent test programs as part of a test suite that allows test throughput to be maximized.	10-23-2008
20080270851	METHOD AND SYSTEM FOR MANAGING APPARATUS PERFORMANCE - The method comprises and executes constitutional information collection processing of collecting constitutional information of the apparatus, constitutional information of a logical unit which is a logical existence obtained by abstracting the apparatus, constitutional information of the application and constitutional information of the dependency relation of the performance established among the apparatus, the logical unit and the application; performance information collection processing of collecting each performance information of the apparatus, the logical unit and the application; and saturation indication detection processing of analyzing a correlation between a change value with time of the performance information of the apparatus and a change value with time of the performance information of the logical unit having the dependency relation of the performance with respect to the apparatus for a predetermined period, and detecting that the apparatus has the saturation indication, when a correlation coefficient obtained by the correlation analysis is a predetermined threshold value or more.	10-30-2008
20080276130	DISTRIBUTED LOGGING APPARATUS SYSTEM AND METHOD - An apparatus, system, and method are disclosed for distributed logging. Operating entities and associations between operating entities are registered in a registry by a logging entity registrar. An event notification monitor recognizes operating errors in operating entities. An aggregation module aggregates operating logs from sets of associated entities, which are then stored by a log set recorder.	11-06-2008
20080276131	SYSTEMS AND METHODS FOR EVENT DETECTION - A system accesses a log of events on more than one computing system and scans these logs in an effort to determine the likely cause of various items of interest, events, or problems. These items of interest often include improper or frustrating behavior of a computer system, but may also include delightful or beneficial behaviors for which a user, group of users, company, service, or help desk seeks a cause. Once the likely source of the item of interest is found, a test may be performed to confirm the source of the problem and warning or corrective action taken.	11-06-2008
20080282114	METHOD TO ENHANCE MICRO-C4 RELIABILITY BY REDUCING THE IMPACT OF HOT SPOT PULSING - A system for reducing an impact of hot spot pulsing of a semiconductor device including: first generating means for generating a plurality of local op-codes; a sequencer for augmenting customer op-codes with the plurality of local op-codes; selecting means for selecting one or more of the randomly arriving customer op-codes awaiting execution; monitoring means for tracking which of the one or more randomly arriving customer op-codes have been selected; separating means for separating the plurality of local op-codes from the one or more customer op-codes; storing means for storing one or more data related to the processing of the plurality of local op-codes and the customer op-codes; and second generating means for generating an output for a customer corresponding to that customer op-code while gainfully employing an output generated by local op-codes for system health monitoring purpose.	11-13-2008
20080282115	CLIENT-SERVER TEXT MESSAGING MONITORING FOR REMOTE COMPUTER MANAGEMENT - Implementation of a client-server text messaging (CSTM) monitor installed on a computer system that is configured to monitor a client-server text messaging (CSTM) server for commands posted thereto, and a management program installed on the computer system which is responsive to the commands. The CSTM monitor is lightweight and allows multiple computer systems to monitor a CSTM server and execute posted commands. Managed computer systems are more efficient because the management program does not run continuously. The commands are text-based and, therefore, require very little network bandwidth between a management system and the managed computer system. The invention allows a centralized computer management system to monitor managed computer systems and implement corrective measures without overburdening the systems or network bandwidth.	11-13-2008
20080288827	METHOD FOR VISUALIZING RESULTS OF ROOT CAUSE ANALYSIS ON TRANSACTION PERFORMANCE DATA - Mechanisms for graph manipulation of transactional performance data are provided in order to identify and emphasize root causes of electronic business system transaction processing performance problems. A system transaction monitoring system, such as IBM Tivoli Monitoring for Transaction Performance™ (ITMTP) system, is utilized to obtain transaction performance data for a system. This transaction performance data is stored in a database and is utilized to present a graph of a given transaction or transactions. Having generated a graph of the transaction, and having identified problem conditions in the processing of the transaction(s), the present invention provides mechanisms for performing graph manipulation operations to best depict the root cause of the problems.	11-20-2008
20080288828	STRUCTURES FOR INTERRUPT MANAGEMENT IN A PROCESSING ENVIRONMENT - A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for managing interrupts in a processing system is disclosed. The design structure can determine an indication of an interrupt request from a peripheral entity, identify the peripheral entity associated with the indication, count occurrences of the indications, and flag the peripheral entity in response to the counted occurrences. When the counted occurrences reach a predetermined number in a predetermined time interval, interrupts from the peripheral entity can be ignored or the entity can be identified as having possible operational problems.	11-20-2008
20080294944	PROCESSOR BUS FOR PERFORMANCE MONITORING WITH DIGESTS - A method for monitoring event occurrences from a plurality of processor units at a centralized location via a dedicated bus coupled between the plurality of processor units and the centralized location. In particular, the method comprises receiving, at the centralized location, data indicative of cumulative events occurring at one of the processor units, and storing the data in a first temporary memory. The data is then stored in a register based on a tag identifier affixed to the data in an instance where the tag identifier provides indicia of one of the plurality of processor units.	11-27-2008
20080301505	COMPUTER PERFORMANCE MONITORING METHOD AND SYSTEM - A monitoring method and system. The method includes receiving by a software application within a computing system, data comprising a first data point associated with an operating parameter for a characteristic associated with the computing system. The software application converts the data point into a mathematical value and associates the mathematical value with a scaled value. The software application associates the scaled value with a first timbre and a harmonic interval and generates a first musical note value from the scaled value. The first musical note value is transmitted to an amplifier device within the computing system. The amplifier device generates a first audible musical note from the first musical note value and presents the first audible musical note to a user of the computing system.	12-04-2008
20080307269	Resolution of Computer Operations Problems Using Fault Trend Analysis - A set of fault records representing faults previously detected in an enterprise computer system is received and analyzed. The analysis comprises a variety of analytical operations and results in a report provided to a user, the report particularly including a set of fault sources identified as highly important to address, with respect both to the system as a whole and to particular categories of faults.	12-11-2008
20080307270	EMERGING BAD BLOCK DETECTION - Apparatus and methods, such as those that read data from non-volatile integrated circuit memory devices, such as NAND flash. For example, disclosed techniques can be embodied in a device driver of an operating system. Errors are tracked during read operations. If sufficient errors are observed during read operations, the block is then retired when it is requested to be erased or a page of the block is to be written. One embodiment is a technique to recover data from uncorrectable errors. For example, a read mode can be changed to a more reliable read mode to attempt to recover data. One embodiment further returns data from the memory device regardless of whether the data was correctable by decoding of error correction code data or not.	12-11-2008
20080307271	COMPUTER SYSTEM OR PERFORMANCE MANAGEMENT METHOD OF COMPUTER SYSTEM - This invention provides a system including a computer and a storage-subsystem comprising at least either a first storage area for storing data sent from the computer or a second storage area to be associated with the first storage area, for storing replicated data of data stored in the first storage area. This system includes a replication processing status referral unit for referring to a replication processing status of data of the first storage area and the second storage area to be associated, and an output unit for outputting first performance information concerning data I/O stored in the first storage area, and outputting second performance information concerning data I/O stored in the second storage area together with the first performance information when the replicated data is being subject to replication processing from the first storage area to the second storage area as a result of referring to the replication processing status.	12-11-2008
20080313505	FLASH MEMORY WEAR-LEVELING - A memory system and corresponding method of wear-leveling are provided, the system including a controller, a random access memory in signal communication with the controller, and another memory in signal communication with the controller, the other memory comprising a plurality of groups, each group comprising a plurality of first erase units or blocks and a plurality of second blocks, wherein the controller exchanges a first block from a group with a second block in response to at least one block erase count within the group; and the method including receiving a command having a logical address, converting the logical address into a logical block number, determining a group number for a group that includes the converted logical block number, and checking whether group information comprising block erase counts for the group is loaded into random access memory, and if not, loading the group information into random access memory.	12-18-2008
20080320339	Method of remotely monitoring an internet web site - A method of performing a service which remotely monitors a Web site includes the steps of monitoring the site for an error and notifying a site representative in the event an error is detected on the site. Advance permission is not obtained prior to sending the notification and a fee is not charged for the service. The appropriate e-mail address to which the notification is sent is identified based on one or more categories and a priority assigned to all e-mail addresses identified on the monitored site. The notification may be sent, alternatively, to the representative of a site linked to the site monitored or to some other interested third party. Subscribers to the monitoring service may be enrolled automatically upon submission of their site to a search engine service or to a domain name registry. The list of service recipients generated by the monitoring service is usable for other commercial purposes.	12-25-2008
20090006901	Control Systems and Method Using a Shared Component Actuator - In one embodiment, a control system supports an unlimited number of feedback control loops all sharing control of a component. A component performance rate or “speed” is used as a common metric for negotiating control of the component. Each control loop continuously monitors a system parameter it is tasked with regulating, compares it to a setpoint for that system parameter, and “requests” a speed in relation to the deviation of the associated system parameter from the corresponding setpoint. A controller receives the requested speeds as dynamic inputs and selects one of the requested speeds according to predefined selection logic. The controller communicates the selected speed to an actuator, which causes the component to operate at the selected speed. In this manner, the control system in effect negotiates control of the component in a way that ensures that all of the system parameters are being managed within safe limits.	01-01-2009
20090013217	MULTICORE ABNORMALITY MONITORING DEVICE - A monitoring side core has an input protection part including an access checking part and an address information storage part. Address information of a count RAM area and an access prohibiting mode to the address are stored in the address information storage part in advance by CPU. The access checking part determines whether an address to be accessed through a first communication path by a monitored side core and an access mode are coincident with the stored address and the stored access prohibiting mode. When the coincidence is determined, the access of the monitored side core to the count RAM area of the monitoring side core is prohibited.	01-08-2009
20090019316	METHOD AND SYSTEM FOR CALCULATING AND DISPLAYING RISK - A system for calculating and rendering a risk level. In response to receiving an input to perform an action within a data processing system, a level of risk to the data processing system to perform the action is calculated based on a set of rules. It is determined whether the calculated level of risk presents an elevated risk. In response to determining that the calculated level of risk does present the elevated risk, a user interface is rendered with an appropriate elevated visual warning based on the calculated level of risk.	01-15-2009
20090019317	MECHANISM FOR IDENTIFYING THE SOURCE OF PERFORMANCE LOSS IN A MICROPROCESSOR - A system and method of accounting for lost clock cycles in a microprocessor. A method includes detecting a first reason which prevents exit of an entry from an instruction retirement queue, and incrementing a first count corresponding to the first reason, wherein the first count is incremented while the first reason prevents exit of the entry from the queue. A first point in time is determined when said first reason no longer prevents exit of the entry from the queue. A second reason which prevents exit of the entry from the queue is detected, wherein the second reason came into existence prior to said first point in time. A second count corresponding to the second reason is incremented, wherein incrementing the second count begins at the first point in time.	01-15-2009
20090019318	Approach for monitoring activity in production systems - An approach is provided for monitoring of the activity in production computer systems. During a first period of time, substantially all of a first plurality of dispatches sent to a CPU are recorded. Each dispatch of the first plurality of dispatches indicates an initial instruction of a stream of instructions that is executed without interruption by the CPU. Based on the first plurality of dispatches, a baseline profile that indicates a normal execution flow in the system is generated. During a second period of time, substantially all of a second plurality of dispatches sent to the CPU are monitored. Based on the baseline profile and on at least one of the second plurality of dispatches, a determination is made whether an abnormal execution flow exists in the system during the second period of time. One or more actions are performed in response to determining that the abnormal execution flow exists in the system during the second period of time.	01-15-2009
20090019319	REMOTE MONITORING DIAGNOSTIC SYSTEM - Disclosed is a remote monitoring diagnostic system in which a center and monitoring diagnostic units of a number of objects to be monitored are connected by a network. The center includes an algorithm forming unit for forming algorithms for monitoring, diagnosing, and operating each object to be monitored, a program group formation unit for forming monitoring, diagnostic, and operational programs from these algorithms, a transmitter for transmitting the programs in response to a request from the monitoring diagnostic unit, and a unit for forming information concerning prevention/maintenance from a diagnostic result and monitoring data from the monitoring diagnostic unit of each object to be monitored. The monitoring diagnostic unit of each object to be monitored includes a mobile program execution processor for executing the corresponding object to be monitored, and a transmitter for transmitting monitoring data to the center.	01-15-2009
20090031174	SERVER OUTAGE DATA MANAGEMENT - Server outage data is automatically created and managed. Outage data is automatically retrieved from one or more servers at which an outage is detected by an agent installed on the server. The agent may search for outage event data and transmit the data to a monitoring application. The monitoring application receives the event data and creates an outage record from the data. Server contents, such as the number of users having account data on the server, can be determined either before or after the outage has occurred. Once the outage data and server contents are known, the cost and impact of the outage for each particular server can be determined. The cost of a server outage may be determined based on the outage record and the server data identifying resources of the server, such as user account data.	01-29-2009
20090031175	SYSTEM AND METHOD FOR ANALYZING STREAMS AND COUNTING STREAM ITEMS ON MULTI-CORE PROCESSORS - Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.	01-29-2009
20090031176	Anomaly detection - A system such as a Web-based system in which a plurality of computers interact with each other is monitored to detect an anomaly online. Transactions of a service provided by each of a plurality of computers to another computer are collected, a matrix of correlations between nodes in the system is calculated from the transactions, and a feature vector representing a node activity balance is obtained from the matrix. The feature vector is monitored using a probability model to detect a transition to an anomalous state.	01-29-2009
20090037777	USE OF OPERATIONAL CONFIGURATION PARAMETERS TO PREDICT SYSTEM FAILURES - The use of operational configuration parameters to predict digital system failures is described herein. At least some illustrative embodiments include a method that includes initializing a digital system (the initializing comprising determining an operational configuration of at least part of the digital system), saving the operational configuration to a database stored on the digital system, reading the operational configuration from the database and comparing the operational configuration to a reference configuration, and identifying the digital system as being at risk of a future failure if at least one parameter of the operational configuration differs from the at least one same parameter of the reference configuration by more than a tolerance value.	02-05-2009
20090044060	Method for Supervising Task-Based Data Processing - The invention relates to a method for supervising a task-based data processing, wherein for a plurality of tasks the following steps are performed for each task: scheduling the task for processing, and logging the scheduling of the task by storing a task identifier in a log memory, said task identifier identifying the scheduled task and being assigned to the scheduled task. The task identifiers stored in the log memory form a task history pattern of scheduled tasks. By means of the task history a pattern may be detected for determining whether a failure appears in the task-based data processing. At least one safety measure is taken when a failure is detected.	02-12-2009
20090055689	SYSTEMS, METHODS, AND COMPUTER PRODUCTS FOR COORDINATED DISASTER RECOVERY - Systems, methods and computer products for coordinated disaster recovery of at least one computing cluster site are disclosed. According to exemplary embodiments, a disaster recovery system may include a computer processor and a disaster recovery process residing on the computer processor. The disaster recovery process may have instructions to monitor at least one computing cluster site, communicate monitoring events regarding the at least one computing cluster site with a second computing cluster site, generate alerts responsive to the monitoring events on the second computing cluster site regarding potential disasters, and coordinate recovery of the at least one computing cluster site onto the second computing cluster site in the event of a disaster.	02-26-2009
20090063906	Method, Apparatus and Program Storage Device for Extending Dispersion Frame Technique Behavior Using Dynamic Rule Sets - A method, apparatus and program storage device for providing control of statistical processing of error data over a multitude of sources using a dynamically modifiable DFT rule set is disclosed. The dispersion frame technique is extended in the present invention to provide dispersion frame rules with user-defined parameters thereby creating a dynamically modifiable rule set.	03-05-2009
20090070635	METHOD OF IMPROVING THE INTEGRITY AND SAFETY OF AN AVIONICS SYSTEM - The present invention relates to a method of improving the integrity and safety of a system, this method making it possible, on the one hand, to detect and to locate an anomaly of a system, and on the other hand to estimate the impact of such an anomaly on the degradation of performance, with a view to attaining the safety level required and to making the data provided by this system safe, and this method is characterized in that it consists, in a system comprising sub-assemblies, in monitoring the proper operation of sub-assemblies by checking their respective transfer functions in the operational mode with the aid of stimuli dispatched to these sub-assemblies.	03-12-2009
20090083586	FAILURE MANAGEMENT DEVICE AND METHOD - A method and device for monitoring failures are disclosed herein. The monitoring device comprises an environmental event generator and a status-monitor. The environmental event generator generates an environmental trigger based on changes in an environmental factor, and the status-monitor monitors the failure-status of a plurality of elements in a system. The status-monitor is configured to change its operational profile based on the environmental trigger. The device further comprises a decision mechanism configured to link the changes in an environmental factor and associated failure modes to different service industries.	03-26-2009
20090083587	Apparatus and method for selectively enabling and disabling a squelch circuit across AHCI and SATA power states - An apparatus and a method are provided for selectively enabling and disabling a squelch circuit in a Serial Advanced Technology Attachment (SATA) host or SATA device while maintaining proper operation of the host and device. An apparatus and method are provided which allow the squelch circuit to be selectively enabled and disabled across SATA power states (PHY Ready, Partial, and Slumber) and in Advanced Host Controller Interface (AHCI) Listen mode.	03-26-2009
20090083588	DEVICE REMOTE MONITOR/RECOVERY SYSTEM - A system includes a support side host (PCh	03-26-2009
20090094488	System and Method for Monitoring Application Availability - A system and method for monitoring the availability of an application in a distributed data processing environment are provided. The performance aspects of application availability are defined in terms of easily observed and computed characteristics of the application as it behaves in a deployed environment with the deployed configuration. The system and method observe the application processes, the structural resources they require, and the consumable resources they require from the running system itself. These observations are then used to derive minimum requirements for the resource requirement aspects of availability as well as derive criteria for normal behavioral conditions. These minimum requirements and normal behavioral conditions are then used to establish monitoring rules or conditions for monitoring the operation of the application to determine if availability of the application is degrading such that a notification needs to be sent to an administrator.	04-09-2009
20090106605	HEALTH MONITOR - Techniques for proactively and reactively running diagnostic functions. These diagnostic functions help to improve diagnostics of conditions detected in a monitored system and to limit/quarantine the damages caused by the detected conditions. In one embodiment, a health monitor infrastructure is provided that is configured to perform one or more health checks in a monitored system for diagnosing and/or gathering information related to the system. The one or more health checks may be invoked pro-actively on a scheduled basis, reactively in response to a condition detected in the system, or may even be invoked manually by a user such as a system administrator.	04-23-2009
20090119549	METHOD FOR COUNTING INSTRUCTIONS FOR LOGGING AND REPLAY OF A DETERMINISTIC SEQUENCE OF EVENTS - This invention relates to a transparent and non-intrusive method for monitoring and managing the running of tasks executed in one or more computer processors, in particular in multi-processor systems with a parallel architecture. It proposes a system and method for managing a computer task, termed target, during a given execution period, termed activity period (SchJ, SchR), within a computer system, in a computer processor provided with means of monitoring or estimating performance and including a counter (PMC) with a given possible error in plus or minus, termed relative error, this process comprising	05-07-2009
20090119550	IMAGE FORMING APPARATUS AND ANALYSIS METHOD - A sufficient number of packets necessary for analysis of a fault in a network communication apparatus are obtained. A multi function peripheral (MFP) temporarily stores received packets as a file for every print job, and stores communication failure information as a log (communication failure log). The MFP deletes data in which no error has occurred in an application among the stored files. Then, in a case where an error has occurred during processing of a certain print job, the MFP stores received packets in a storage device, and compares a communication failure in the job packet in which the error has occurred with a communication failure in packets associated with all the received print jobs. As a result of the comparison, the MFP extracts a communication failure in only the job packet in which the error has occurred, and creates a log so that the extracted result can be identified.	05-07-2009
20090125757	Context-Related Troubleshooting - A system and method provide system monitoring and detailed troubleshooting workflow guidance. Individual system components may be monitored and faulty system components identified. Subsequently, component-related, context-related, and/or user-specific troubleshooting workflow guidance associated with the faulty system component identified may be retrieved and/or determined by a processor and presented. The component-related guidance may be based upon which system component is faulty. The context-related guidance may be based upon the type of error with the faulty system component. The user-specific guidance may be based upon information specific to the user, such as user specifications, preferences, and configurations. The troubleshooting guidance may provide virtual guidance related to the workflow steps to be performed to identify the error with the faulty component and then correct the error identified. Accordingly, consistent, reproducible, and efficient troubleshooting may be performed using the troubleshooting workflow guidance presented. In one embodiment, the system monitored is a medical imaging system.	05-14-2009
20090132864	CLUSTERING PROCESS FOR SOFTWARE SERVER FAILURE PREDICTION - Embodiments of the present invention allow the prevention and/or mitigation of damage caused by server failure by predicting future failures based on historic failures. Statistical data for server parameters may be collected for a period of time immediately preceding a historic server failure. The data may be clustered to identify cluster profiles indicating strong pre-fault clustering patterns. Real time statistics collected during normal operation of the server may be applied to the cluster profiles to determine whether real time statistics show pre-fault clustering. If such a pattern is detected, measures to prevent or mitigate server failure may be initiated.	05-21-2009
20090150726	METHOD AND SYSTEM FOR EXTENDING THE USEFUL LIFE OF ANOTHER SYSTEM - Disclosed are embodiments of a method and an associated first system for extending product life of a second system in the presence of phenomena that cause the exhibition of both performance degradation and recovery properties within system devices. The first system includes duplicate devices incorporated into the second system (e.g., on a shared bus). These duplicate devices are adapted to independently perform the same function within that second system. Reference signal generators, a reference signal comparator, a power controller and a state machine, working in combination, can be adapted to seamlessly switch performance of that same function within the second system between the duplicate devices based on a measurement of performance degradation to allow for device recovery. A predetermined policy accessible by the state machine dictates when and whether or not to initiate a switch.	06-11-2009
20090172477	REMOTE MONITORING SYSTEM, TERMINAL MANAGEMENT SERVER AND TERMINAL MANAGEMENT SERVER CONTROL PROGRAM - An insulation monitoring system, functioning as a remote monitoring system, comprises a plurality of insulation monitoring terminals, functioning as remote monitoring terminals, for monitoring facilities and a terminal management server controlling the insulation monitoring terminals. The insulation monitoring terminals and terminal management server are connected to each other to bidirectionally transmit and receive information therebetween. The terminal management server includes a collective input unit for accepting collectively input configuration information pieces to be set in the insulation monitoring terminals, a storage unit for storing the plurality of configuration information pieces input by the collective input unit, and a distribution unit for distributing the plurality of configuration information pieces stored in the storage unit to the insulation monitoring terminals respectively associated with the configuration information pieces.	07-02-2009
20090177929	METHOD AND APPARATUS FOR ADAPTIVE DECLARATIVE MONITORING - A method of and apparatus for monitoring a computer system includes defining a monitoring policy for the computer system. At least one computer is employed to determine a status of a state of the computer system relative to the monitoring policy. At least one computer is employed to determine a condition of at least one monitored element to be monitored in the computer system based on the status of the state of the computer system. Furthermore, at least one computer is employed to monitor the condition of the at least one monitored element in the computer system, based on the monitoring policy. At least one computer is employed to perform an action in response to the condition assuming a predetermined status.	07-09-2009
20090199047	EXECUTING SOFTWARE PERFORMANCE TEST JOBS IN A CLUSTERED SYSTEM - Using a testing framework, developers may create a test module to centralize resources and results for a software test plan amongst a plurality of systems. With assistance from the testing framework, the test module may facilitate the creation of test cases, the execution of a test job for each test case, the collection of performance statistics during each test job, and the aggregation of collected statistics into organized reports for easier analysis. The test module may track test results for easy comparison of performance metrics in response to various conditions and environments over the history of the development process. The testing framework may also schedule a test job for execution when the various systems and resources required by the test job are free. The testing framework may be operating system independent, so that a single test job may test software concurrently on a variety of systems.	08-06-2009
20090204853	INTERFACE FOR ENABLING A HOST COMPUTER TO RETRIEVE DEVICE MONITOR DATA FROM A SOLID STATE STORAGE SUBSYSTEM - A non-volatile storage subsystem maintains, and makes available to a host system, monitor data reflective of a likelihood of a data error occurring. The monitor data may, for example, include usage statistics and/or sensor data. The storage subsystem transfers the monitor data to the host system over a signal interface that is separate from the signal interface used for standard storage operations. This interface may be implemented using otherwise unused pins/signal lines of a standard connector, such as a CompactFlash or SATA connector. Special hardware may be provided in the storage subsystem and host system for transferring the monitor data over these signal lines, so that the transfers occur with little or no need for host-software intervention. The disclosed design reduces or eliminates the need for host software that uses non-standard or “vendor-specific” commands to retrieve the monitor data.	08-13-2009
20090204854	Method for monitoring data processing system availability - A method, system, and product for monitoring the availability of a data processing system are proposed. The system runs a management application involving the periodic transmission of blocks of data from multiple local computers to a central computer. Whenever a block of data must be transmitted by a generic local computer, an expected transmission delay of a next block of data (with respect to the current one) is estimated and attached to the block of data. The central computer receiving the updated block of data can calculate an expected receiving time of the next block of data accordingly. If the next block of data is not received in due time, the central computer determines a failure of the local computer. The central computer also scans a subset of ports of the local computer, to ascertain whether the problem is due to a temporary unavailability of the application.	08-13-2009
20090210752	METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR SAMPLING COMPUTER SYSTEM PERFORMANCE DATA - A system, method and computer program product for sampling computer system performance data are provided. The system includes a sample buffer to store instrumentation data while capturing trace data in a trace array, where the instrumentation data enables measurement of computer system performance. The system further includes a sample interrupt generator to assert a sample interrupt indicating that the instrumentation data is available to read. The sample interrupt is asserted in response to storing the instrumentation data in the sample buffer.	08-20-2009
20090217106	SYSTEM AND METHODS FOR RECORDING, DISPLAYING, AND RECONSTRUCTING COMPUTER-BASED SYSTEM AND USER EVENTS - A computer-implemented method for tracking computer system events and user actions is provided. The method includes detecting one or more system events of a computing system and one or more user actions performed on the computing system. The method also includes recording at least one system event and at least one user action. Additionally, the method includes synchronizing the recordation of the at least one system event and the recordation of the at least one user action. The method further includes presenting to a user the recordation of at least one system event and the recordation of at least one user action.	08-27-2009
20090240991	Automation device with diagnosis functionality - An automation device having a diagnosis program for detecting operating and/or error statuses in hardware and/or software components of the automation device is disclosed. The diagnosis program is included in a Basic Input Output System of the automation device.	09-24-2009
20090249128	PREDICTIVE DIAGNOSTICS SYSTEM, APPARATUS, AND METHOD FOR IMPROVED RELIABILITY - A system for managing a processing system and/or a processing system component is described. The system may include a wear-out module configured to provide a wear-out signal, the wear-out signal indicating a remaining amount of useful life of the component; a health module configured to provide a health signal, the health signal indicating an extent to which operational and environmental factors affect a failure rate of the component during a useful life of the component; and a mission module configured to provide a mission signal, the mission signal indicative of whether an operating condition is approaching a threshold that would adversely affect the system's ability to meet at least one performance objective.	10-01-2009
20090249129	Systems and Methods for Managing Multi-Component Systems in an Infrastructure - The present invention discloses systems and methods to maintain a multi-component system. The methods include defining a performance factor to be maintained in a given system, and collecting by agents associated with a given container in the system data associated with the performance factor. The collected data is then used to generate a statistical model that describes the normal operating condition of a given system corresponding to the desired performance factor to be monitored. The method also includes collecting real time data corresponding to the desired performance factor, and finding deviations between the real time data and parameters in the statistical model in a given time range. If a deviation is found, an alert is sent to the user to notify the user of such a deviation. The method may further include a rules engine that launches a series of workflow steps after the user alert is triggered to provide mitigating steps for the users to perform to reduce any problem in the system before such deviation causes failure of the system.	10-01-2009
20090259892	Method and Apparatus for Producing a Metastable Flip Flop - The method includes predetermining an output enable time period by measuring the maximum settling time when a signal is read during a transition from 0 to 1 or vice versa, and multiplying the maximum settling time by a safety factor of 2.5, to set an output enable time period; reading and latching an input value; and transmitting the latched value onward after the predetermined output enable time period. An embodiment of the apparatus	10-15-2009
20090276666	SYSTEM, METHOD, AND ADAPTER FOR CREATING FAULT-TOLERANT COMMUNICATION BUSSES FROM STANDARD COMPONENTS - A system, method, and adapter for creating fault-tolerant communication busses from standard components, are described. Fault-tolerant interface logic is provided for transmitting and receiving system health and system management signals to and from a module that is designed to be connected to a single RS-485 bus. The fault-tolerant interface logic enables the module to selectively communicate via at least two redundant half-duplex, multipoint, differential RS-485 busses. The fault-tolerant interface logic includes a first RS-485 transceiver connected to a first RS-485 bus, a second RS-485 transceiver connected to a second RS-485 bus, selector logic responsive to a control signal for selecting one of the first and the second busses to receive signals from and for transmitting the received signals to the module, and software logic executable on a baseboard management controller (BMC) chip. The software logic includes control logic for monitoring the health of the selected bus and for providing the control signal to the selector logic.	11-05-2009
20090292954	RANKING THE IMPORTANCE OF ALERTS FOR PROBLEM DETERMINATION IN LARGE SYSTEMS - A system and method for prioritizing alerts includes extracting invariants to determine a stable set of models for determining relationships among monitored system data. Equivalent thresholds for a plurality of rules are computed using an invariant network developed by extracting the invariants. For a given time window, a set of alerts are received from a system being monitored. A measurement value of the alerts is compared with a vector of equivalent thresholds, and the set of alerts is ranked.	11-26-2009
20090300428	METHOD OF COLLECTING INFORMATION IN SYSTEM NETWORK - To quickly establish an inferring result when a problem is detected in an operation management system equipped with a rule-based inference processing function, there is provided a method of collecting information for managing a computer system equipped with a plurality of devices. The computer system holds a rule for associating a plurality of events with a conclusion output when all of the plurality of events have been detected. The method includes: executing, at a first interval, polling to obtain information indicating whether each of the plurality of events has been detected; judging whether the plurality of events have been detected; and executing, upon judgment that at least one of the plurality of events has been detected and none of the remaining events have been detected, before execution of next polling at the first interval, polling to obtain information indicating whether at least one of the undetected remaining events has been detected.	12-03-2009
20090307534	Storage device and performance measurement method for the same - A storage system including a maintenance terminal, at least one disk drive, and a plurality of volumes that are provided by the at least one disk drive, and each store therein data written by the plurality of host devices. In this storage system, the maintenance terminal sets information for use to measure the performance of the storage device, and the storage device acquires the set information, measures the performance of the storage device with respect to the data stored in the plurality of volumes based on the information, and transmits, to the maintenance terminal, performance information about the performance being a measurement result. The storage system as such can collect information about the performance of a storage device that is not measurable from the side of the host devices, and a method for collecting such performance information can be provided.	12-10-2009
20090313508	MONITORING DATA CATEGORIZATION AND MODULE-BASED HEALTH CORRELATIONS - Architecture for aggregating health alerts from a number of related components into a single aggregated health state that can be analyzed to isolate the component responsible for the fault condition. In a hierarchy of related components within various component groups in a computer system, a number of health indicators can indicate alerts occurring in one or more of the related components whereas the fault condition occurs in only one component upon which the other components depend. The health indicators of related components are aggregated into an aggregated health state for each component group. These aggregated health states are analyzed to identify the related component associated with a root cause of the alert condition for an affected component group.	12-17-2009
20100011254	RISK INDICES FOR ENHANCED THROUGHPUT IN COMPUTING SYSTEMS - Embodiments of a system that adjusts a checkpointing frequency in a distributed computing system that executes multiple jobs are described. During operation, the system receives signals associated with the operation of the computing nodes. Then, the system determines risk metrics for the computing nodes using a pattern-recognition technique to identify anomalous signals in the received signals. Next, the system adjusts a checkpointing frequency of a given checkpoint for a given computing node based on a comparison of a risk metric associated with the given computing node and a threshold, thereby implementing holistic fault tolerance, in which prediction and prevention of potential faults occurs across the distributed computing system.	01-14-2010
20100011255	METHODS AND SYSTEMS FOR CONTINOUSLY ESTIMATING PERSISTENT AND INTERMITTENT FAILURE PROBABILITIES FOR PRODUCTION RESOURCES - Production control systems and methods are presented for estimation of production resource failure probabilities in which a set of four count values are maintained and updated for each resource including a first count value m	01-14-2010
20100023811	DYNAMIC ADDRESS-TYPE SELECTION CONTROL IN A DATA PROCESSING SYSTEM - A translated address and an untranslated address associated with a same processor operation are received. An address-type indicator is provided whose value is indicative of whether the translated or untranslated address is to be used for creating a debug message. The value of the address-type indicator is selectively modified in response to occurrence of one or more selected debug events. Based at least in part on the value of the address-type indicator, the translated or untranslated address is selected. The address-type indicator may be selectively overridden to select the translated or untranslated address as the selected address based on whether a process identifier is at least one of a set of process identifiers or whether at least one of the translated or untranslated address falls within one or more predetermined address ranges. A debug message is created using at least a portion of the selected address.	01-28-2010
20100023812	DEBUG TRACE MESSAGING WITH ONE OR MORE CHARACTERISTIC INDICATORS - In a data processing system, an address associated with a processing operation is received. A modified address is generated which includes a characteristic indicator within the address at a first predetermined bit position when the characteristic indicator is of a first type or at a second predetermined bit position when the characteristic indicator is of a second type. A first value of the characteristic indicator indicates a characteristic of the address. A modified address may also be generated which includes a characteristic indicator at a first predetermined bit position when a position indicator has a first value or at a second predetermined bit position when the position indicator has a second value. Address information can then be generated from the modified address, and a debug message can be created which includes the address information.	01-28-2010
20100050024	System and Method for Adaptive Load Fault Detection - In one embodiment, a method for sensing an output fault condition is disclosed. The method includes monitoring an error signal that indicates an output fault condition, and monitoring an input signal having a duration. An error flag is set if a fast switching mode is detected and if the error signal is asserted within a specified time interval during the input signal duration.	02-25-2010
20100050025	Virtual sensor network (VSN) based control system and method - A method for providing a virtual sensor network based system. The method may include obtaining project data descriptive of a virtual sensor network to be used in a control system of a machine; establishing a virtual sensor network including a plurality of virtual sensors based on the project data. Each virtual sensor may have a model type, at least one input parameter, and at least one output parameter. The method may also include recording model information, measurement data, and performance information of the virtual sensor network including the plurality of virtual sensors; creating one or more calibration certificates of the virtual sensor network including a plurality of virtual sensors based on the model information, the measurement data, and the performance information; and generating a documentation package associated with the virtual sensor network. The documentation package may include at least an identification, the project data, and at least one calibration certificate of the virtual sensor network including a plurality of virtual sensors.	02-25-2010
20100058115	READ INTERCHANGE OPTIMIZATION - A method in one embodiment includes detecting an identifier of a drive that has written data to a data storage medium; performing a data transfer operation to read the data from the data storage medium; monitoring the data transfer operation for detecting temporary errors; determining whether an error burst has occurred based on the monitoring; and if an error burst has occurred, altering a condition of the data transfer operation, the alteration being selected based on the identifier of the drive that wrote the data to the data storage medium. Additional methods and systems are also disclosed.	03-04-2010
20100058116	SYSTEM FOR PROCESSING GRAPHIC OBJECTS INCLUDING A SECURED GRAPHIC MANAGER - The general field of the invention is that of viewing systems that have to display information or images having different criticality levels. The viewing system according to the invention comprises at least one secure graphic manager with a criticality level at least equal to the highest criticality level of the graphic applications. The manager has the following detection means: violation of the segregation of the applications in their respective display window; overrunning of the processing times of each application; and violation of the specific storage spaces of the graphic applications.	03-04-2010
20100070806	TECHNOLOGIES FOR DETECTING ERRONEOUS RESUMPTIONS IN A CONTINUATION BASED RUNTIME - Technologies for enabling a continuation based runtime to accept or reject external stimulus and, in addition, to determine if an external stimulus may be valid for processing at a later point in execution.	03-18-2010
20100083054	System and Method For Dynamic Problem Determination Using Aggregate Anomaly Analysis - A system and method are provided for determining problem conditions in an IT infrastructure using aggregate anomaly analysis. The anomalies in the metrics occurring in the monitored IT infrastructure are aggregated from all resources reporting metrics as a function of time. The aggregated metric anomalies are then normalized to account for the state of the monitored IT infrastructure to provide a normalized aggregate anomaly count. A threshold noise level is then determined utilizing a variably selectable desired level of confidence such that a problem event is only determined to likely be occurring in the IT infrastructure when the normalized aggregate anomaly count exceeds the threshold noise level. The normalized aggregate anomaly count is monitored against the threshold noise level as a function of time, such that a problem event in the IT infrastructure is identified when the normalized aggregate anomaly count exceeds the threshold noise level at a given time.	04-01-2010
20100083055	Segment Based Technique And System For Detecting Performance Anomalies And Changes For A Computer Based Service - A technique includes sampling at least one performance metric of a computer-based service to form time samples of the metric(s) and detecting an occurrence of an anomaly or a performance mode change in the service. The detection includes arranging the time samples in segments based on a statistical analysis of the time samples.	04-01-2010
20100083056	PROGNOSTIC DIAGNOSTIC CAPABILITY TRACKING SYSTEM - A universal on-board system is provided for automatic fault detection and on-the-spot repair instructions that includes a module adapted to be coupled to a wide variety of platforms and Line Replaceable Units.	04-01-2010
20100088551	Method and Apparatus for Risk Analysis of Published Logs - A method and apparatus for analyzing risk associated with published logs are described. In one embodiment, the method comprises accessing a first log published to one or more logs. In one embodiment, the method may also comprise estimating a probability that an entry within the first log will not be verifiable from a second entry selected from one of the one or more logs.	04-08-2010
20100088552	Method for Obstruction and Capacity Information Unification Monitoring in Unification Management System Environment and System for Thereof - Provided are a method and system for integrated monitoring of fault and performance information in an integrated management system environment including an integrated management server that interworks with a managed server having a built-in agent for the sake of integrated management of a variety of management information. The method includes the steps of: collecting, at the agent, in real time, fault information data of the managed server using queues; periodically collecting, at the agent, performance information data of the managed server using a function-specific remote function module (REM); converting, at the agent, the fault and performance information data collected from the managed server into a format that the integrated management server can recognize and transferring it; receiving, at the integrated management server, the fault information data from the agent, and generating and transferring an event message to a corresponding administrator terminal; and receiving, at the integrated management server, the performance information data from the agent and storing it in a previously prepared database (DB). Therefore, even when a user does not directly access a managed server, fault and performance information data is transferred in real time to the corresponding administrator so that loss due to faults can be minimized.	04-08-2010
20100095163	MONITORING ERROR NOTIFICATION FUNCTION SYSTEM - A system for monitoring an error notification function comprising: an information processing apparatus including: a first processor including an error notification function for generating error information indicative of an error that occurred in at least one component in the information processing apparatus; a first communication unit for sending the error information; and a management server including: a second communication unit for receiving the error information from the information processing apparatus; a second processor for monitoring the error notification function in accordance with a process including: instructing the information processing apparatus to generate a pseudo error command for urging the information processing apparatus to generate pseudo error information; wherein the second processor in the management server determines whether the error notification function in the system is operating properly or not by checking receipt of pseudo error information from the information processing apparatus.	04-15-2010
20100100775	Filtering Redundant Events Based On A Statistical Correlation Between Events - Methods, systems, and computer-readable media for filtering redundant fault events from an event stream generated by devices on a network based on a statistical correlation between fault events are provided. Event history data is collected from the fault events generated by devices on the network over a period of time. Statistical correlations are computed between each distinct pair of fault events in the event history data. Based on the statistical correlations, a list of redundant fault events and associated significant events is identified. The list of redundant events and associated significant events is utilized to filter the redundant fault events from the event stream generated by the devices on the network.	04-22-2010
20100122119	METHOD TO MANAGE PERFORMANCE MONITORING AND PROBLEM DETERMINATION IN CONTEXT OF SERVICE - A method to manage performance monitoring and problem determination in a context of a service application is provided. The method includes distributing performance monitoring reusable templates to the computing system that describe a set of required monitoring products, a set of scenarios with key performance indicators (KPI) relevant to the service application, and a set of best practices solutions describing how a potential performance incident is to be handled, during instantiation of the service application, deriving from the reusable templates actual performance monitoring characteristics related to various selected components of the computing system, and customizing the reusable templates to the service application in accordance with the actual performance monitoring characteristics by determining whether a number and a type of monitoring agents and/or scenarios with associated KPIs are to be changed, determining whether different KPIs exist and by determining whether solutions exist for an incident.	05-13-2010
20100122120	System And Method For Detecting Behavior Anomaly In Information Access - The present invention provides a system and method for identifying anomalies in information requests. The information requests are modeled into a plurality of basic elements, and associations among the basic elements are tracked. The association of one information request is compared with a plurality of bitmap tables and counters representing baseline information derived from historical behavior information. If the association of this information request differs from the baseline information, an alert is issued.	05-13-2010
20100125760	Providing protection for a memory device - A method for monitoring the status of a memory device is disclosed. The method includes, during operation of the memory device, exercising a first portion of the memory device more than at least one other portion of the memory device in order to induce an accelerated rate of aging of the first portion. The first portion is monitored to detect at least a potential for a failure in the first portion. According to the method, in response to monitoring the first portion, at least one corrective action is performed. Apparatus and computer readable media are also disclosed.	05-20-2010
20100125761	SYSTEM AND METHOD FOR WAFER-LEVEL ADJUSTMENT OF INTEGRATED CIRCUIT CHIPS - A system and method for wafer level adjusting of IC chips are disclosed. The system and method for a wafer level adjustment of IC chips connect the analog circuits of the IC chip with the adjustment controller outside the semiconductor wafer via the probing region and the signal transmission region outside the IC chip, measure and adjust the performance of the IC chip by the adjustment controller, and then, only store final adjustment data in the adjustment memory of the IC chip. Accordingly, it is possible to reduce the area of adjustment circuits added to an integrated circuit chip such as an RFID tag chip and adjust the performance of chips at a wafer level.	05-20-2010
20100138699	SCHEDULING OF CHECKS IN COMPUTING SYSTEMS - In an example embodiment, a method is provided for scheduling a check to detect anomalies in a computing system. An average time between the anomalies that are detectable by the check is identified and additionally, a runtime of the check is identified. A frequency of the check is then calculated based on the average time between the anomalies and the runtime of the check, and execution of the check may be scheduled based on the calculated frequency.	06-03-2010
20100138700	Method of remotely monitoring an internet web site - A method of performing a service which remotely monitors a Web site includes the steps of monitoring the site for an error and notifying a site representative in the event an error is detected on the site. Advance permission is not obtained prior to sending the notification and a fee is not charged for the service. The appropriate e-mail address to which the notification is sent is identified based on one or more categories and a priority assigned to all e-mail addresses identified on the monitored site. The notification may be sent, alternatively, to the representative of a site linked to the site monitored or to some other interested third party. Subscribers to the monitoring service may be enrolled automatically upon submission of their site to a search engine service or to a domain name registry. The list of service recipients generated by the monitoring service is usable for other commercial purposes.	06-03-2010
20100146342	METHOD AND SYSTEM FOR PLATFORM INDEPENDENT FAULT MANAGEMENT - A method for fault management. The method includes generating, in firmware of a computer system, a physical resource inventory (PRI) of a plurality of hardware components of the computer system, wherein the PRI defines a hierarchy of the hardware components. The method further includes traversing, by an enumerator executing in a fault manager, the PRI to generate a topology of the plurality of hardware components. The topology is used for fault management of the computer system.	06-10-2010
20100153790	PERFORMANCE TROUBLE ISOLATION SUPPORT APPARATUS - Operation information about a component of an information system is acquired by a CMDB (configuration management database). An investigation information DB stores assumption narrowing information including a plurality of inquiry items to be issued to a user to narrow a cause of a fault of an information system to a specific assumption, and assumption verification information including information necessary for verification of an assumption for each of a plurality of assumptions included in the assumption narrowing information. An inquiry item optimizing function unit refers to the assumption verification information stored in the investigation information DB and operation information stored in the CMDB, generates priority assignment information necessary in assigning a priority to the inquiry item, and assigns the priority to the inquiry item included in the assumption narrowing information on the basis of the priority assignment information.	06-17-2010
20100162051	INTEGRATION AGENT DEVICE FOR NODE INCLUDING MULTI-LAYERED EQUIPMENT AND FAULT MANAGEMENT METHOD THEREOF - An integration agent device and its fault management method for a node including multi-layered devices are disclosed to effectively control a node including two or more communication devices of different layers and integrally process relevant fault information. The integration agent device includes: one or more control and management modules controlling and managing one or more communications network devices by layer; and an inter-layer interworking processing module integrating and processing information of the communications network devices by using inter-layer interworking information, and notifying a management system accordingly, wherein the information of the communications network devices is transmitted through the one or more control and management modules.	06-24-2010
20100180161	FORCED MANAGEMENT MODULE FAILOVER BY BMC IMPEACHMENT CONCENSUS - A computer-implemented method, system and computer program product for managing failover of Management Modules (MMs) in a blade chassis are presented. Each server blade in the blade chassis evaluates a performance of a primary MM. If a threshold number of server blades determine that the primary MM is not meeting pre-determined minimum performance standards, then a secondary MM impeaches the primary MM and takes over the management of the server blades.	07-15-2010
20100180162	Freeing A Serial Bus Hang Condition by Utilizing Distributed Hang Timers - A method for automatically detecting and correcting one or more hang conditions within one or more of a master device and target device of a serial bus interface when one or more signals are held in an invalid state. A hang timer monitors one or more operations of the serial bus when the serial bus is participating in a serial bus transfer. If the transfer does not end before the bus timeout value has been exceeded, the hang timer will issue a reset to the state machine forcing the state machine back to an idle state. The hang timer will also disable the serial bus drivers of the state machine, whereby the hang condition is corrected.	07-15-2010
20100192021	Method and Device for Monitoring Functions of a Computer System - The invention relates to a method and device for monitoring operations of a computer system comprising at least two execution units, wherein switching means are provided that make it possible to switch between at least two operating modes, and comparison means are provided. The first operating mode corresponds to the comparison mode and the second operating mode corresponds to the performance mode, and the first operation is monitored by the second operation. In the comparison mode, said second operation is run on at least two execution units, and each second operation which is run on at least two execution units monitors the first operation.	07-29-2010
20100192022	System monitoring method and system monitoring device - A model-match-rate evaluating unit of a transaction monitoring device, which monitors a transaction system, evaluates a ratio of the number of transactions that match any models and respective processing times of all layers in the transaction are each within a corresponding normal range to the number of transactions observed per unit time as a model match rate. When the model-match-rate evaluating unit detects an abnormality of the system based on the model match rate, a suspicious-point-in-suspicious-model extracting unit of a transaction detail analyzing device extracts a point where a processing time deviates from the normal range as a suspicious point, a problematical-point evaluating unit evaluates a problem of each suspicious point as a problematical point, and a detail-analysis-result display unit displays an evaluation result of the problematical point and the suspicious point.	07-29-2010
20100223507	INFORMATION PROCESSING APPARATUS, METHOD AND COMPUTER READABLE MEDIUM - The invention provides an information processing apparatus including: a plurality of abnormality detection sections provided in each of a plurality of detection target portions, that detect an abnormality caused by high temperature at a predetermined first frequency; an indication detecting section that detects an indication that the abnormality will occur; and a controller that controls to set the detection frequency of the plurality of abnormality detection sections to a second frequency which is higher than the first frequency, when the number of times that the indication is detected within a predetermined period is more than a predetermined number of times.	09-02-2010
20100223508	INFORMATION PROCESSING APPARATUS - According to an aspect of the invention, an information processing apparatus includes a main body having a top face, a display connected to the main body by a hinge and pivotally moves between a first state where the top face is covered with the display and a second state where the top face is exposed, a counter which stores a number of times the state has changed between the first state and the second state, a monitor which detects a malfunction in the hinge when the number of times reaches a given number, and a data transmitter which sends data corresponding with the detected malfunction.	09-02-2010
20100251032	SYSTEM FOR PROVIDING PERFORMANCE TESTING INFORMATION TO USERS - A computer system includes server computers, application program storage modules, a communication network, client computers, a database storage module, and a test data presentation module. The database storage module stores event data. The event data logs events that occur in the server computers during performance test runs. The test runs are performed in connection with a plurality of projects. The projects are defined for performance testing the application programs. The test data presentation module generates a test data screen display for rendering by one of the client computers. The test data screen display includes a data table. The data table presents event data relating to events that occurred during test runs for two or more of the projects.	09-30-2010
20100251033	Method for Operating an Electronic Device - A method for operating an electronic device that is supplied with electric power by a continuous energy accumulator. A predetermined ending of the first program is monitored in a program step by a second program. If the first program is not switched off as predetermined, the second program generates an error message which is displayed immediately when the device is switched on again.	09-30-2010
20100262870	CONTEXT SWITCH SAMPLING - A method for performance monitoring in a computing system is described. In some embodiments, an addressable memory stores data and instructions for performing context switch sampling. A processor includes hardware event counters, and is coupled with the addressable memory to access said instructions and in response to said instructions, the processor counts occurrences of a first hardware event in a first hardware event counter and counts occurrences of a second hardware event in a second hardware event counter. After a specified number of occurrences of the first hardware event have been counted, the second hardware event counter is sampled and hardware event counters are reset. In some embodiments the processor counts occurrences of segment register load events in the first hardware event counter and then records the sampled second hardware event counter value with a process identifier value and/or a thread identifier value.	10-14-2010
20100268996	Systems and Methods for Predicting Failure of a Storage Medium - Various embodiments of the present invention provide systems and methods for determining storage medium health. For example, a storage device is disclosed that includes a storage medium and a data processing circuit. The data processing circuit receives a data set derived from the storage medium. The data processing circuit includes a data detector circuit, a data decoder circuit, and a health detection circuit. The data detector circuit receives the data set and provides a detected output. The data decoder circuit receives a derivative of the detected output and provides a decoded output. The health detection circuit receives an indication of a number of times that the data set is processed through the combination of the data detector circuit and the data decoder circuit. Further, the health detection circuit generates an indirect health status of the storage medium based at least in part on the number of times that the data set is processed through the combination of the data detector circuit and the data decoder circuit.	10-21-2010
20100268997	METHOD AND DEVICE FOR MONITORING AND CONTROLLING THE OPERATIONAL PERFORMANCE OF A COMPUTER PROCESSOR SYSTEM - In order to monitor and control the operational performance of a computer system or processor system (	10-21-2010
20100287419	FAULT SITUATION PROCESSING ARRANGEMENT OF A LOAD DISTRIBUTION SYSTEM OF A LOCAL ELECTRIC POWER TRANSMISSION NETWORK - A processing arrangement for the fault situation of a local electric power transmission grid comprises a generator-specific element arranged to monitor the fault situation of a data communication bus and the statuses of the switches of the electric power transmission network and to compare the status data concerning the same switch. In case of a fault situation the generators change into droop control only if there are no other possibilities to continue with the normal adjustment of the generator.	11-11-2010
20100293417	Device for centrally monitoring the operation of automated banking machines - The invention relates to a device for the central monitoring of the operation of automated banking machines (ATMs). The operating signals from actuators (	11-18-2010
20100306597	AUTOMATED IDENTIFICATION OF PERFORMANCE CRISIS - Methods for automatically identifying and classifying a crisis state occurring in a system having a plurality of computer resources. Signals are received from a device that collects the signals from each computer resource in the system. For each epoch, an epoch fingerprint is generated. Upon detecting a performance crisis within the system, a crisis fingerprint is generated consisting of at least one epoch fingerprint. The technology is able to identify that a performance crisis has previously occurred within the datacenter if a generated crisis fingerprint favorably matches any of the model crisis fingerprints stored in a database. The technology may also predict that a crisis is about to occur.	12-02-2010
20100306598	Operating Computer Memory - Operating computer memory in a computer including dynamically monitoring, by a predictive failure analysis (‘PFA’) module, correctable memory errors and memory temperature and managing cooling resources in the computer in dependence upon the correctable memory errors and memory temperature.	12-02-2010
20100318856	RECORDING MEDIUM STORING MONITORING PROGRAM, MONITORING DEVICE, AND MONITORING METHOD - A monitoring device accesses a database storing, for each of a plurality of failure cases that occurred in a monitored device, a group of past monitoring data items each representing respective measured values of monitoring items of the monitored device measured until a time of occurrence of a failure case. The device receives, from the monitored device, a current monitoring data item representing current measured values of the plurality of monitoring items. The device calculates, for each of past monitoring data items stored in the database, a similarity degree between a past monitoring data item and a current monitoring data item on the basis of the respective measured values of the plurality of monitoring items. The device determines, among the plurality of failure cases, a failure case predicted to occur in the monitored device, on the basis of the calculation result. The device outputs the determination result.	12-16-2010
20100318857	Automatic maintenance of a computing system in a steady state using correlation - An autonomic computing system is automatically maintained in a steady state. The system has a number of parameters, each of which has one or more thresholds. The system may further have a number of influencers, adjustment of which affects values of the parameters. One or more of the parameters are determined as each reaching one of its thresholds, and are referred to as to-be-affected parameters. Each to-be-affected parameter is identified and its thresholds identified. A correlation value may be determined between each influencer and each to-be-affected parameter, and/or between each to-be-affected parameter and each other to-be-affected parameter. The to-be-affected parameters are adjusted, based on the correlation values determined, so that the to-be-affected parameters return to more-normal values.	12-16-2010
20110022899	PRODUCING OR EXECUTING A SCRIPT FOR AN OPERATION TEST OF A TERMINAL SERVER - A method of or system for producing or executing a script for a load test of a terminal server. During execution of a high-level application by the terminal server controlled by a user of a terminal client in which the terminal client and terminal server communicate according to remote-desktop protocol, a terminal-services agent on the terminal server may monitor a change in at least one window of the high-level application within a terminal-client desktop of the terminal client. Window-related information corresponding to the monitored change from the terminal-services agent may be sent to an operation-test tool resident on the terminal client. The operation-test tool may log the received window-related information.	01-27-2011
20110041014	PROGRAM STATUS DETECTING APPARATUS AND METHOD - A method for a computer including a processor that is capable of counting invalidation of translation lookaside buffers and generating an interrupt at the occurrence of the invalidation, the invalidation being performed by an operating system upon switching between application programs, includes acquiring identification information of application programs from the operating system and storing the identification information as a first list; detecting an interrupt generated from the processor at the occurrence of switching from a first application program to a second application program; and when the interrupt is detected, acquiring the identification information of the first and second application programs from the operating system or the mechanism and comparing the acquired identification information with the first list to determine whether either of the first and second application programs is a program that has been created or disappeared.	02-17-2011
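
Illustration for entry 20100223507 (INFORMATION PROCESSING APPARATUS, METHOD AND COMPUTER READABLE MEDIUM): the abstract describes raising the abnormality-detection frequency from a first to a higher second frequency once an indication has been detected more than a predetermined number of times within a predetermined period. The following is a minimal sketch of that rule only, not the patented implementation. The class name `DetectionScheduler`, its parameters, the use of Hz and seconds as units, and the sliding-window interpretation of "within a predetermined period" are all assumptions made for illustration.

```python
from collections import deque


class DetectionScheduler:
    """Hypothetical helper: picks a detection frequency from recent indications."""

    def __init__(self, first_hz, second_hz, period_s, max_indications):
        # The abstract requires the second frequency to be higher than the first.
        if second_hz <= first_hz:
            raise ValueError("second_hz must be higher than first_hz")
        self.first_hz = first_hz
        self.second_hz = second_hz
        self.period_s = period_s                # assumed: sliding window length
        self.max_indications = max_indications  # the "predetermined number of times"
        self._indications = deque()             # timestamps of recent indications

    def record_indication(self, timestamp_s):
        """Store the time at which an indication of a coming abnormality was seen."""
        self._indications.append(timestamp_s)

    def frequency(self, now_s):
        """Return the detection frequency that applies at time now_s."""
        cutoff = now_s - self.period_s
        # Drop indications that have fallen out of the window.
        while self._indications and self._indications[0] <= cutoff:
            self._indications.popleft()
        # "More than a predetermined number of times" is a strict comparison.
        if len(self._indications) > self.max_indications:
            return self.second_hz
        return self.first_hz


scheduler = DetectionScheduler(first_hz=1.0, second_hz=10.0,
                               period_s=60.0, max_indications=2)
for t in (0.0, 10.0, 20.0):
    scheduler.record_indication(t)
raised = scheduler.frequency(now_s=25.0)    # three indications inside the window
relaxed = scheduler.frequency(now_s=65.0)   # the oldest indication has expired
```

In this sketch the scheduler falls back to the first frequency as soon as the count in the window is no longer above the limit; the abstract does not say whether or when the frequency is lowered again, so that behaviour is also an assumption.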
