| Patent application number | Description | Published |
| 20080201620 | METHOD AND SYSTEM FOR UNCORRECTABLE ERROR DETECTION - A system, method and program product for utilizing error correction code (ECC) logic to detect multi-bit errors. In one embodiment, a first test pattern and a second test pattern are applied to a set of hardware bit positions. The first and second patterns are multiple logic level patterns and the second test pattern is the logical complement of the first test pattern. The first and second test patterns are utilized by the ECC logic to detect correctable errors having n or fewer bits. One or more bit positions of a first correctable error occurring responsive to applying the first test pattern are determined and one or more bit positions of a second correctable error occurring responsive to applying the second test pattern are determined. The determined bit positions of the first and second correctable errors are processed to identify a multiple-bit error within the set of hardware bit positions. | 08-21-2008 |
| 20090300290 | Memory Metadata Used to Handle Memory Errors Without Process Termination - Embodiments of the invention provide an interrupt handler configured to distinguish between critical and non-critical unrecoverable memory errors, yielding different actions for each. Doing so may allow a system to recover from certain memory errors without having to terminate a running process. In addition, when an operating system critical task experiences an unrecoverable error, such a task may be acting on behalf of a non-critical process (e.g., when swapping out a virtual memory page). When this occurs, an interrupt handler may respond to a memory error with the same response that would result had the process itself performed the memory operation. Further, firmware may be configured to perform diagnostics to identify potential memory errors and alert the operating system before a memory region state change occurs, such that the memory error would become critical. | 12-03-2009 |
| 20090300425 | Resilience to Memory Errors with Firmware Assistance - Embodiments of the invention provide an interrupt handler configured to distinguish between critical and non-critical unrecoverable memory errors, yielding different actions for each. Doing so may allow a system to recover from certain memory errors without having to terminate a running process. In addition, when an operating system critical task experiences an unrecoverable error, such a task may be acting on behalf of a non-critical process (e.g., when swapping out a virtual memory page). When this occurs, an interrupt handler may respond to a memory error with the same response that would result had the process itself performed the memory operation. Further, firmware may be configured to perform diagnostics to identify potential memory errors and alert the operating system before a memory region state change occurs, such that the memory error would become critical. | 12-03-2009 |
| 20090300434 | Clearing Interrupts Raised While Performing Operating System Critical Tasks - Embodiments of the invention provide an interrupt handler configured to distinguish between critical and non-critical unrecoverable memory errors, yielding different actions for each. Doing so may allow a system to recover from certain memory errors without having to terminate a running process. In addition, when an operating system critical task experiences an unrecoverable error, such a task may be acting on behalf of a non-critical process (e.g., when swapping out a virtual memory page). When this occurs, an interrupt handler may respond to a memory error with the same response that would result had the process itself performed the memory operation. Further, firmware may be configured to perform diagnostics to identify potential memory errors and alert the operating system before a memory region state change occurs, such that the memory error would become critical. | 12-03-2009 |
| 20100293437 | System to Improve Memory Failure Management and Associated Methods - A system to improve memory failure management may include memory, and an error control decoder to determine failures in the memory. The system may also include an agent that may monitor failures in the memory. The system may further include a table where the error control decoder may record the failures, and where the agent can read and write to. | 11-18-2010 |
| 20110029807 | IMPLEMENTING ENHANCED MEMORY RELIABILITY USING MEMORY SCRUB OPERATIONS - A method and circuit for implementing enhanced memory reliability using memory scrub operations to determine a frequency of intermittent correctable errors, and a design structure on which the subject circuit resides are provided. A memory scrub for intermittent performs at least two reads before moving to a next memory scrub address. A number of intermittent errors is tracked where an intermittent error is identified, responsive to identifying one failing read and one passing read of the at least two reads. | 02-03-2011 |