Patent application number | Description | Published |
20080252441 | Method and apparatus for performing a real-time root-cause analysis by analyzing degrading telemetry signals - One embodiment of the present invention provides a system that performs a real-time root-cause-analysis for a degradation event associated with a component under test. During operation, the system monitors a telemetry signal collected from the component, and while doing so, attempts to detect an anomaly in the telemetry signal. If an anomaly is detected in the telemetry signal, the system performs a failure analysis on the telemetry signal in real-time while the telemetry signal is degrading. Next, the system identifies a failure mechanism for the component based on the failure analysis. | 10-16-2008 |
20090234484 | METHOD AND APPARATUS FOR DETECTING MULTIPLE ANOMALIES IN A CLUSTER OF COMPONENTS - A system that detects multiple anomalies in a cluster of components is presented. During operation, the system monitors derivatives obtained from one or more inferential variables which are received from sensors in the cluster of components. The system then determines whether one or more components within the cluster have experienced an anomalous event based on the monitored derivatives. If so, the system performs one or more remedial actions. | 09-17-2009 |
20090272176 | ESTIMATING RELATIVE HUMIDITY INSIDE A COMPUTER SYSTEM - One embodiment of the present invention provides a system that estimates the relative humidity inside a computer system. During operation, a set of performance parameters of the computer system and an external relative humidity outside of the computer system are monitored. Then, the relative humidity inside the computer system is estimated based on the set of performance parameters, the external relative humidity, and a relative humidity model, wherein training of the relative humidity model includes measuring an external training relative humidity outside of the computer system and a training relative humidity inside the computer system while monitoring the set of performance parameters of the computer system. | 11-05-2009 |
20090326864 | DETERMINING THE RELIABILITY OF AN INTERCONNECT - Some embodiments of the present invention provide a system that determines the reliability of an interconnect. During operation, connectors in the interconnect are categorized into a set of predetermined groups. Next, the reliability for selected groups in the set of predetermined groups is determined. Then, a reliability model for the interconnect is generated based on the selected groups and the reliability of the selected groups to determine the overall reliability of the interconnect. | 12-31-2009 |
20100121593 | IN-SITU CHARACTERIZATION OF A SOLID-STATE LIGHT SOURCE - Some embodiments of the present invention provide a system for in-situ characterization of a solid-state light source. First, a voltage and a current of the solid-state light source are monitored. Then, the health of the solid-state light source is characterized based on an analysis of the monitored voltage and current. | 05-13-2010 |
20100250158 | ENHANCED CHARACTERIZATION OF ELECTRICAL CONNECTION DEGRADATION - One embodiment provides a system that analyzes an electrical connection in a computer system. During operation, the system monitors a reflection coefficient associated with the electrical connection and applies a sequential-analysis technique to the reflection coefficient to determine a statistical deviation of the reflection coefficient. Next, the system assesses the integrity of the electrical connection based on the statistical deviation of the reflection coefficient. Finally, the system uses the assessed integrity to maintain the electrical connection. | 09-30-2010 |
20100324882 | ESTIMATING BALL-GRID-ARRAY LONGEVITY IN A COMPUTER SYSTEM - A method for generating a service action for a computer system is described. During the method, a longevity index value for a packaging technology (such as solder joints in a BGA) in the computer system is calculated using thermal and vibration telemetry data (which is collected in the computer system) and a longevity model. This longevity model may be based on accelerated failure testing of the packaging technology, field failures of the packaging technology in a group of computer systems (which includes the computer system) and/or thermal and vibration telemetry data for the group of computer systems. Furthermore, using the longevity index value, the service action for the computer system is determined. Based on the longevity index value, remedial action (such as repairs to the computer system) may be scheduled and performed. | 12-23-2010 |
20130138419 | METHOD AND SYSTEM FOR THE ASSESSMENT OF COMPUTER SYSTEM RELIABILITY USING QUANTITATIVE CUMULATIVE STRESS METRICS - The disclosed embodiments provide a system that analyzes telemetry data from a computer system. During operation, the system obtains the telemetry data as a set of telemetric signals using a set of sensors in the computer system. Next, for each component or component location from a set of components in the computer system, the system applies an inferential model to the telemetry data to determine an operating environment of the component or component location, and uses the operating environment to assess a reliability of the component. Finally, the system manages use of the component in the computer system based on the assessed reliability. | 05-30-2013 |
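
Several of the abstracts above (for example 20100250158) describe applying a sequential-analysis technique to a monitored signal to detect a statistical deviation. The listing does not say which technique is used, so the sketch below substitutes Wald's sequential probability ratio test (SPRT) for a Gaussian mean shift purely as an illustration. The function name `sprt_mean_shift`, its parameters, and the sample values are hypothetical and are not taken from any of the applications.

```python
import math
from typing import Iterable, Optional


def sprt_mean_shift(
    samples: Iterable[float],
    mu0: float,      # assumed mean of the healthy signal
    mu1: float,      # assumed mean of the degraded signal
    sigma: float,    # assumed (known) noise standard deviation
    alpha: float = 0.01,  # tolerated false-alarm probability
    beta: float = 0.01,   # tolerated missed-alarm probability
) -> Optional[int]:
    """Return the index of the first sample at which the SPRT decides
    the mean has shifted from mu0 to mu1, or None if it never does."""
    upper = math.log((1 - beta) / alpha)   # cross this: decide "degraded"
    lower = math.log(beta / (1 - alpha))   # cross this: decide "healthy"
    llr = 0.0  # cumulative log-likelihood ratio
    for i, x in enumerate(samples):
        # Log-likelihood ratio increment for two Gaussians with equal variance.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma ** 2
        if llr >= upper:
            return i
        if llr <= lower:
            llr = 0.0  # "healthy" decision reached; restart the test
    return None


# Hypothetical reflection-coefficient readings: healthy, then degraded.
readings = [0.10] * 20 + [0.20] * 20
alarm_index = sprt_mean_shift(readings, mu0=0.10, mu1=0.20, sigma=0.05)
print("alarm raised at sample", alarm_index)
```

Restarting the test after each "healthy" decision lets it run continuously over a telemetry stream. A real monitor would also need to estimate `mu0` and `sigma` from training data rather than assume them.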