Patent application number | Description | Published |
20090249026 | Vector instructions to enable efficient synchronization and parallel reduction operations - In one embodiment, a processor may include a vector unit to perform operations on multiple data elements responsive to a single instruction, and a control unit coupled to the vector unit to provide the data elements to the vector unit, where the control unit is to enable an atomic vector operation to be performed on at least some of the data elements responsive to a first vector instruction to be executed under a first mask and a second vector instruction to be executed under a second mask. Other embodiments are described and claimed. | 10-01-2009 |
20100005241 | DETECTION OF STREAMING DATA IN CACHE - An apparatus to detect streaming data in memory is presented. In one embodiment, the apparatus uses reuse bits and S-bit status for cache lines, wherein an S-bit status indicates that the data in the cache line are potentially streaming data. To enhance the efficiency of a cache, different measures can be applied to make the streaming data become the next victim during a replacement. | 01-07-2010 |
20100153649 | SHARED CACHE MEMORIES FOR MULTI-CORE PROCESSORS - Embodiments of shared cache memories for multi-core processors are presented. In one embodiment, a cache memory comprises a group of sampling cache sets and a controller to determine a number of misses that occur in the group of sampling cache sets. The controller is operable to determine a victim cache line for a cache set based at least in part on the number of misses. | 06-17-2010 |
20110078340 | VIRTUAL ROW BUFFERS FOR USE WITH RANDOM ACCESS MEMORY - Methods, apparatuses and systems to decrease the energy consumption of a memory chip while increasing its effective bandwidth during the execution of any workload. Methods, apparatuses and systems may allow a memory chip to utilize a plurality of virtual row buffers to respond to requests for data included in a memory array block. Methods, apparatuses and systems may further eliminate or reduce the cost associated with transferring unnecessary data from a memory array block to row buffers by altering the data transfer size between a memory array block and a row buffer. | 03-31-2011 |
20110138122 | GATHER AND SCATTER OPERATIONS IN MULTI-LEVEL MEMORY HIERARCHY - Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed. | 06-09-2011 |
20110138128 | Technique for tracking shared data in a multi-core processor or multi-processor system - A technique to track shared information in a multi-core processor or multi-processor system. In one embodiment, core identification information (“core IDs”) is used to track shared information among multiple cores in a multi-core processor or multiple processors in a multi-processor system. | 06-09-2011 |
20110145184 | METHODS AND SYSTEMS TO TRAVERSE GRAPH-BASED NETWORKS - Methods and systems to translate input labels of arcs of a network, corresponding to a sequence of states of the network, to a list of output grammar elements of the arcs, corresponding to a sequence of grammar elements. The network may include a plurality of speech recognition models combined with a weighted finite state machine transducer (WFST). Traversal may include active arc traversal, and may include active arc propagation. Arcs may be processed in parallel, including arcs originating from multiple source states and directed to a common destination state. Self-loops associated with states may be modeled within outgoing arcs of the states, which may reduce synchronization operations. Tasks may be ordered with respect to cache-data locality to associate tasks with processing threads based at least in part on whether another task associated with a corresponding data object was previously assigned to the thread. | 06-16-2011 |
20110153983 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding, by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 06-23-2011 |
20120290799 | GATHER AND SCATTER OPERATIONS IN MULTI-LEVEL MEMORY HIERARCHY - Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed. | 11-15-2012 |
20140156274 | METHODS AND SYSTEMS TO TRAVERSE GRAPH-BASED NETWORKS - Methods and systems to translate input labels of arcs of a network, corresponding to a sequence of states of the network, to a list of output grammar elements of the arcs, corresponding to a sequence of grammar elements. The network may include a plurality of speech recognition models combined with a weighted finite state machine transducer (WFST). Traversal may include active arc traversal, and may include active arc propagation. Arcs may be processed in parallel, including arcs originating from multiple source states and directed to a common destination state. Self-loops associated with states may be modeled within outgoing arcs of the states, which may reduce synchronization operations. Tasks may be ordered with respect to cache-data locality to associate tasks with processing threads based at least in part on whether another task associated with a corresponding data object was previously assigned to the thread. | 06-05-2014 |
20140344553 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding, by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 11-20-2014 |
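
Several entries above concern gather and scatter operations, in which a single instruction moves multiple data elements between indexed storage locations. As a rough illustration of the general idea only, and not of the mechanism claimed in any listed application, the semantics might be sketched in Python as follows. The function names, the per-element `mask` parameter, and the `fill` value are assumptions introduced for this sketch.

```python
from typing import List, Sequence


def gather(memory: Sequence[int], indices: Sequence[int],
           mask: Sequence[bool], fill: int = 0) -> List[int]:
    """Read memory[i] for each index whose mask bit is set.

    Masked-off lanes receive `fill` (an assumption for this sketch).
    """
    return [memory[i] if m else fill for i, m in zip(indices, mask)]


def scatter(memory: List[int], indices: Sequence[int],
            values: Sequence[int], mask: Sequence[bool]) -> None:
    """Write values[k] to memory[indices[k]] for each lane whose mask bit is set."""
    for i, v, m in zip(indices, values, mask):
        if m:
            memory[i] = v


if __name__ == "__main__":
    mem = [10, 20, 30, 40, 50]
    print(gather(mem, [4, 0, 2], [True, False, True]))
    scatter(mem, [1, 3], [7, 9], [True, True])
    print(mem)
```

A hardware implementation would perform these element transfers in response to one instruction rather than a software loop; the loop here only models the resulting data movement.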