Patent application number | Description | Published |
20110072218 | PREFETCH PROMOTION MECHANISM TO REDUCE CACHE POLLUTION - A processor is disclosed. The processor includes an execution core, a cache memory, and a prefetcher coupled to the cache memory. The prefetcher is configured to fetch a first cache line from a lower level memory and to load the cache line into the cache. The cache is further configured to designate the cache line as a most recently used (MRU) cache line responsive to the execution core asserting N demand requests for the cache line, wherein N is an integer greater than 1. The cache is configured to inhibit the cache line from being promoted to the MRU position if the cache line receives fewer than N demand requests. | 03-24-2011 |
20130013867 | DATA PREFETCHER MECHANISM WITH INTELLIGENT DISABLING AND ENABLING OF A PREFETCHING FUNCTION - A data prefetcher includes a controller to control operation of the data prefetcher. The controller receives data associated with cache misses and data associated with events that do not rely on a prefetching function of the data prefetcher. The data prefetcher also includes a counter to maintain a count associated with the data prefetcher. The count is adjusted in a first direction in response to detection of a cache miss, and in a second direction in response to detection of an event that does not rely on the prefetching function. The controller disables the prefetching function when the count reaches a threshold value. | 01-10-2013 |
20130173885 | Processor and Methods of Adjusting a Branch Misprediction Recovery Mode - A processor core includes a fetch control unit for fetching instructions and placing the instructions into an instruction queue and includes a branch predictor for controlling the fetch control unit to speculatively fetch at least one instruction subsequent to an unresolved branch instruction. The processor further includes a controller configured to dispatch instructions from the instruction queue and, in response to a branch misprediction of an unresolved control instruction, to apply a selected one of a checkpointing-based recovery mode and a commit-time-based recovery mode. | 07-04-2013 |
20130238861 | MULTIMODE PREFETCHER - One or more lines of a cache are prefetched according to a first prefetch routine while training a prefetcher to prefetch one or more lines of the cache according to a second prefetch routine. In response to determining that the prefetcher has been trained, one or more lines of the cache may be prefetched according to the second prefetch routine. | 09-12-2013 |
20130262780 | Apparatus and Method for Fast Cache Shutdown - An apparatus and method to enable a fast cache shutdown is disclosed. In one embodiment, a cache subsystem includes a cache memory and a cache controller coupled to the cache memory. The cache controller is configured to, upon restoring power to the cache subsystem, inhibit writing of modified data exclusively into the cache memory. | 10-03-2013 |
20140089699 | POWER MANAGEMENT SYSTEM AND METHOD FOR A PROCESSOR - The present disclosure relates to a method and apparatus for dynamically controlling power consumption by at least one processor. A power management method includes monitoring, by power control logic of the at least one processor, performance data associated with each of a plurality of executions of a repetitive workload by the at least one processor. The method includes adjusting, by the power control logic following an execution of the repetitive workload, an operating frequency of at least one of a compute unit and a memory controller upon a determination that the at least one processor is at least one of compute-bound and memory-bound based on monitored performance data associated with the execution of the repetitive workload. | 03-27-2014 |
20140136870 | TRACKING MEMORY BANK UTILITY AND COST FOR INTELLIGENT SHUTDOWN DECISIONS - A device receives an indication that a memory bank is to be powered down, and determines, based on receiving the indication, shutdown scores corresponding to powered up memory banks. Each shutdown score is based on a shutdown metric associated with powering down a powered up memory bank. The device may power down a selected memory bank based on the shutdown scores. | 05-15-2014 |
20140136873 | TRACKING MEMORY BANK UTILITY AND COST FOR INTELLIGENT POWER UP DECISIONS - A device receives an indication that a memory bank is to be powered up, and determines, based on receiving the indication, power scores corresponding to powered down memory banks. Each power score corresponds to a power metric associated with powering up a powered down memory bank. The device powers up a selected memory bank based on the power scores. | 05-15-2014 |
20140143492 | Using Predictions for Store-to-Load Forwarding - The described embodiments include a core that uses predictions for store-to-load forwarding. In the described embodiments, the core comprises a load-store unit, a store buffer, and a prediction mechanism. During operation, the prediction mechanism generates a prediction that a load will be satisfied using data forwarded from the store buffer because the load loads data from a memory location in a stack. Based on the prediction, the load-store unit first sends a request for the data to the store buffer in an attempt to satisfy the load using data forwarded from the store buffer. If data is returned from the store buffer, the load is satisfied using the data. However, if the attempt to satisfy the load using data forwarded from the store buffer is unsuccessful, the load-store unit then separately sends a request for the data to a cache to satisfy the load. | 05-22-2014 |
20140143495 | METHODS AND APPARATUS FOR SOFT-PARTITIONING OF A DATA CACHE FOR STACK DATA - A method of partitioning a data cache comprising a plurality of sets, the plurality of sets comprising a plurality of ways, is provided. Responsive to a stack data request, the method stores a cache line associated with the stack data in one of a plurality of designated ways of the data cache, wherein the plurality of designated ways is configured to store all requested stack data. | 05-22-2014 |
20140143498 | METHODS AND APPARATUS FOR FILTERING STACK DATA WITHIN A CACHE MEMORY HIERARCHY - A method of storing stack data in a cache hierarchy is provided. The cache hierarchy comprises a data cache and a stack filter cache. Responsive to a request to access a stack data block, the method stores the stack data block in the stack filter cache, wherein the stack filter cache is configured to store any requested stack data block. | 05-22-2014 |
20140143499 | METHODS AND APPARATUS FOR DATA CACHE WAY PREDICTION BASED ON CLASSIFICATION AS STACK DATA - A method of way prediction for a data cache having a plurality of ways is provided. Responsive to an instruction to access a stack data block, the method accesses identifying information associated with a plurality of most recently accessed ways of a data cache to determine whether the stack data block resides in one of the plurality of most recently accessed ways of the data cache, wherein the identifying information is accessed from a subset of an array of identifying information corresponding to the plurality of most recently accessed ways; and when the stack data block resides in one of the plurality of most recently accessed ways of the data cache, the method accesses the stack data block from the data cache. | 05-22-2014 |
20140143565 | Setting Power-State Limits based on Performance Coupling and Thermal Coupling between Entities in a Computing Device - The described embodiments include a computing device with a first entity and a second entity. In the computing device, a management controller dynamically sets a power-state limit for the first entity based on a performance coupling and a thermal coupling between the first entity and the second entity. | 05-22-2014 |
20140149772 | Using a Linear Prediction to Configure an Idle State of an Entity in a Computing Device - The described embodiments include a computing device with one or more entities (processor cores, processors, etc.). In some embodiments, during operation, a thermal power management unit in the computing device uses a linear prediction to compute a predicted duration of a next idle period for an entity based on the duration of one or more previous idle periods for the entity. Based on the predicted duration of the next idle period, the thermal power management unit configures the entity to operate in a corresponding idle state. | 05-29-2014 |
20140164708 | SPILL DATA MANAGEMENT - A processor discards spill data from a memory hierarchy after the final access to the spill data has been performed by a compiled program executing at the processor. In some embodiments, the final access is determined based on a special-purpose load instruction configured for this purpose. In some embodiments, the determination is made based on the location of a stack pointer indicating that a method of the executing program has returned, so that data of the returned method that remains in the stack frame is no longer to be accessed. Because the spill data is discarded after the final access, it is not transferred through the memory hierarchy. | 06-12-2014 |
20140181410 | MANAGEMENT OF CACHE SIZE - In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached. | 06-26-2014 |
20140181413 | METHOD AND SYSTEM FOR SHUTTING DOWN ACTIVE CORE BASED CACHES - A system and method are presented. Some embodiments include a processing unit, at least one memory coupled to the processing unit, and at least one cache coupled to the processing unit and divided into a series of blocks, wherein at least one of the series of cache blocks includes data identified as being in a modified state. The modified state data is flushed by writing the data to the at least one memory based on a write-back policy. The aggressiveness of the policy is based on at least one factor, including the number of idle cores, the time since the last cache flush, the activity of the thread associated with the data, which cores are idle, and whether an idle core is associated with the data. | 06-26-2014 |
20140181414 | MECHANISMS TO BOUND THE PRESENCE OF CACHE BLOCKS WITH SPECIFIC PROPERTIES IN CACHES - A system and method for efficiently limiting storage space for data with particular properties in a cache memory. A computing system includes a cache array and a corresponding cache controller. The cache array includes multiple banks, wherein a first bank is powered down. In response to a write request to a second bank for data indicated to be stored in the powered down first bank, the cache controller determines a respective bypass condition for the data. If the bypass condition exceeds a threshold, the cache controller invalidates any copy of the data stored in the second bank. If the bypass condition does not exceed the threshold, the cache controller stores the data with a clean state in the second bank. In both cases, the cache controller writes the data to a lower-level memory. | 06-26-2014 |
20140181537 | GUARDBAND REDUCTION FOR MULTI-CORE DATA PROCESSOR - A multi-core data processor includes multiple data processor cores and a power controller. Each data processor core has a first input for receiving a clock signal, a second input for receiving a power supply voltage, and an output for providing an idle signal. The power controller is coupled to each of the data processor cores for providing the clock signal and the power supply voltage to each of the data processor cores. The power controller provides at least one of the clock signal and the power supply voltage to an active one of the data processor cores in dependence on a number of idle signals received from the data processor cores. | 06-26-2014 |
20140181553 | Idle Phase Prediction For Integrated Circuits - A method and apparatus for idle phase prediction in integrated circuits is disclosed. In one embodiment, an integrated circuit (IC) includes a functional unit configured to cycle between intervals of an active state and an idle state. The IC further includes a prediction unit configured to record a history of idle state durations for a plurality of intervals of the idle state. Based on the history of idle state durations, the prediction unit is configured to generate a prediction of the duration of the next interval of the idle state. The prediction may be used by a power management unit to, among other uses, determine whether to place the functional unit in a low power (e.g., sleep) state. | 06-26-2014 |
20140181554 | POWER CONTROL FOR MULTI-CORE DATA PROCESSOR - A multi-core data processor includes multiple data processor cores and a circuit. The multiple data processor cores each include a power state controller having a first input for receiving an idle signal, a second input for receiving a release signal, a third input for receiving a control signal, and an output for providing a current power state. In response to the idle signal, the power state controller causes a corresponding data processor core to enter an idle state. In response to the release signal, the power state controller changes the current power state from the idle state to an active state in dependence on the control signal. The circuit is coupled to each of the multiple data processor cores for providing the control signal in response to current power states in the multiple data processor cores. | 06-26-2014 |
20140181556 | Idle Phase Exit Prediction - A method and apparatus for exiting a low power state based on a prior prediction is disclosed. An integrated circuit (IC) includes a functional unit configured to, during operation, cycle between intervals of an active state and intervals of an idle state. The IC also includes a power management unit configured to place the functional unit in a low power state responsive to the functional unit entering the idle state. The power management unit is further configured to preemptively cause the functional unit to exit the low power state at a predetermined time after entering the low power state. The predetermined time is based on a prediction of idle state duration made prior to entering the low power state. The prediction may be generated by a prediction unit, based on a history of durations of intervals in which the functional unit was in the idle state. | 06-26-2014 |
20140297961 | SELECTIVE CACHE FILLS IN RESPONSE TO WRITE MISSES - A cache memory receives a request to perform a write operation. The request specifies an address. A first determination is made that the cache memory does not include a cache line corresponding to the address. A second determination is made that the address is between a previous value of a stack pointer and a current value of the stack pointer. A third determination is made that a write history indicator is set to a specified value. The write operation is performed in the cache memory without waiting for a cache fill corresponding to the address to be performed, in response to the first, second, and third determinations. | 10-02-2014 |
20150067264 | METHOD AND APPARATUS FOR MEMORY MANAGEMENT - In some embodiments, a method of managing cache memory includes identifying a group of cache lines in a cache memory, based on a correlation between the cache lines. The method also includes tracking evictions of cache lines in the group from the cache memory and, in response to a determination that a criterion regarding eviction of cache lines in the group from the cache memory is satisfied, selecting one or more (e.g., all) remaining cache lines in the group for eviction. | 03-05-2015 |
20150067266 | EARLY WRITE-BACK OF MODIFIED DATA IN A CACHE MEMORY - A level of cache memory receives modified data from a higher level of cache memory. A set of cache lines with an index associated with the modified data is identified. The modified data is stored in the set in a cache line whose eviction priority is at least as high as the highest eviction priority among unmodified cache lines in the set before the modified data is stored. | 03-05-2015 |
20150067305 | SPECIALIZED MEMORY DISAMBIGUATION MECHANISMS FOR DIFFERENT MEMORY READ ACCESS TYPES - A system and method for efficient prediction and processing of memory access dependencies. A computing system includes control logic that marks a detected load instruction as a first type responsive to predicting the load instruction has high locality and is a candidate for store-to-load (STL) data forwarding. The control logic marks the detected load instruction as a second type responsive to predicting the load instruction has low locality and is not a candidate for STL data forwarding. The control logic processes a load instruction marked as the first type as if the load instruction is dependent on an older store operation. The control logic processes a load instruction marked as the second type as if the load instruction is independent of any older store operation. | 03-05-2015 |
20150067356 | POWER MANAGER FOR MULTI-THREADED DATA PROCESSOR - A data processing system includes a plurality of processor resources, a manager, and a power distributor. Each of the plurality of processor resources is operable at a selected one of a plurality of performance states. The manager assigns each of a plurality of program elements to one of the plurality of processor resources and synchronizes the program elements using barriers. The power distributor is coupled to the manager and to the plurality of processor resources, assigns a performance state to each of the plurality of processor resources within an overall power budget, and, in response to detecting that a program element assigned to a first processor resource is at a barrier, increases the performance state of a second processor resource that is not at the barrier within the overall power budget. | 03-05-2015 |
20150185801 | POWER GATING BASED ON CACHE DIRTINESS - Power gating decisions can be made based on measures of cache dirtiness. Analyzer logic can selectively power gate a component of a processor system based on the cache dirtiness of one or more caches associated with the component. The analyzer logic may power gate the component when the cache dirtiness exceeds a threshold and may maintain the component in an idle state when the cache dirtiness does not exceed the threshold. Idle time prediction logic may be used to predict the duration of an idle time of the component. The analyzer logic may then selectively power gate the component based on the cache dirtiness and the predicted idle time. | 07-02-2015 |
20150186160 | CONFIGURING PROCESSOR POLICIES BASED ON PREDICTED DURATIONS OF ACTIVE PERFORMANCE STATES - Durations of active performance states of components of a processing system can be predicted based on one or more previous durations of an active state of the components. One or more entities in the processing system such as processor cores or caches can be configured based on the predicted durations of the active state of the components. Some embodiments configure a first component in a processing system based on a predicted duration of an active state of a second component of the processing system. The predicted duration is predicted based on one or more previous durations of an active state of the second component. | 07-02-2015 |
20150188649 | METHODS AND SYSTEMS OF SYNCHRONIZER SELECTION - A circuit includes a plurality of synchronizers to adapt a signal from a first clock domain to a second clock domain. Each synchronizer of the plurality of synchronizers includes a synchronizer input to receive the signal from the first clock domain and a synchronizer output to provide the signal as adapted to the second clock domain. The circuit also includes a multiplexer (mux) that includes a plurality of mux inputs and a mux output. Each mux input is coupled to the synchronizer output of a respective synchronizer of the plurality of synchronizers. The mux output provides the signal, as adapted to the second clock domain, from the synchronizer output of a selected synchronizer of the plurality of synchronizers. | 07-02-2015 |
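The sketches below model several of the mechanisms listed in the table above. Each is a minimal C illustration under stated assumptions (the names, data structures, and threshold values are all invented for exposition), not code from the filings themselves.

First, the N-demand-request promotion policy of application 20110072218: a prefetched line is held back from the MRU position until it has absorbed N demand requests. The `on_demand_hit` hook and the per-line counter are assumptions about how such a policy could be modeled.

```c
#include <stdbool.h>
#include <stdio.h>

#define PROMOTION_THRESHOLD 2   /* N: demand requests required for MRU promotion */

struct cache_line {
    bool prefetched;     /* brought in by the prefetcher, not by demand */
    int  demand_count;   /* demand requests observed for this line */
    bool is_mru;         /* line currently holds the MRU position */
};

/* Called on each demand request that hits the line. A prefetched line is
   promoted to MRU only after N demand requests, so a line touched once
   cannot displace genuinely hot lines. */
void on_demand_hit(struct cache_line *line)
{
    line->demand_count++;
    if (line->prefetched && line->demand_count >= PROMOTION_THRESHOLD)
        line->is_mru = true;   /* stand-in for a real LRU-stack update */
}

int main(void)
{
    struct cache_line line = { .prefetched = true };
    on_demand_hit(&line);
    printf("after 1 demand hit: mru=%d\n", line.is_mru);  /* 0: promotion inhibited */
    on_demand_hit(&line);
    printf("after 2 demand hits: mru=%d\n", line.is_mru); /* 1: promoted */
    return 0;
}
```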
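The counter-based enable/disable logic of application 20130013867 can be pictured as a single up/down counter: cache misses (which prefetching might have hidden) move it one way, events that do not rely on prefetching move it the other, and reaching a threshold disables the prefetch function. The threshold value and the re-enable rule are assumptions.

```c
#include <stdbool.h>

#define DISABLE_THRESHOLD 0   /* assumed: count at or below this disables */

struct prefetcher {
    int  count;      /* net evidence that prefetching is worthwhile */
    bool enabled;
};

/* A cache miss is evidence that prefetching could have helped: adjust the
   count in the first direction and (re-)enable past the threshold. */
void on_cache_miss(struct prefetcher *p)
{
    p->count++;
    if (p->count > DISABLE_THRESHOLD)
        p->enabled = true;
}

/* An event that does not rely on the prefetching function adjusts the
   count in the second direction; reaching the threshold disables it. */
void on_nonreliant_event(struct prefetcher *p)
{
    p->count--;
    if (p->count <= DISABLE_THRESHOLD)
        p->enabled = false;
}
```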
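The multimode prefetcher of application 20130238861 issues from a first routine while a second routine trains, then switches once the second is trained. In the sketch below, the first routine is assumed to be next-line and the second a stride predictor gated by a confidence counter; the abstract does not say what the two routines are.

```c
#include <stdint.h>

#define LINE_SIZE  64
#define TRAINED_AT 4   /* assumed confidence needed to trust the second routine */

struct multimode_prefetcher {
    int     confidence;   /* raised as the second routine's predictions verify */
    int64_t stride;       /* stride learned by the second routine */
};

/* First routine: usable immediately, no training needed. */
uint64_t first_routine(uint64_t miss_addr)
{
    return miss_addr + LINE_SIZE;
}

/* Second routine: a stride predictor that must be trained first. */
uint64_t second_routine(const struct multimode_prefetcher *p, uint64_t miss_addr)
{
    return miss_addr + (uint64_t)p->stride;
}

/* Prefetch with the first routine until the second is trained, then switch. */
uint64_t choose_prefetch_addr(const struct multimode_prefetcher *p, uint64_t miss_addr)
{
    if (p->confidence >= TRAINED_AT)
        return second_routine(p, miss_addr);
    return first_routine(miss_addr);
}
```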
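Applications 20140136870 and 20140136873 both score every candidate bank and act on the best score. A sketch of the shutdown side follows; the particular score (weighing recent utility against the cost of flushing dirty lines) is invented, since the abstract only says each score is based on a shutdown metric.

```c
#include <stdbool.h>

#define NUM_BANKS 8

struct bank {
    bool powered_up;
    int  recent_accesses;   /* utility proxy */
    int  dirty_lines;       /* cost proxy: data to flush before shutdown */
};

/* Lower score = cheaper to shut down. The 4:1 weighting is illustrative. */
int shutdown_score(const struct bank *b)
{
    return 4 * b->recent_accesses + b->dirty_lines;
}

/* Scan the powered-up banks and pick the one with the best (lowest)
   shutdown score; returns -1 if no bank is powered up. */
int pick_bank_to_power_down(const struct bank banks[NUM_BANKS])
{
    int best = -1, best_score = 0;
    for (int i = 0; i < NUM_BANKS; i++) {
        if (!banks[i].powered_up)
            continue;
        int s = shutdown_score(&banks[i]);
        if (best < 0 || s < best_score) {
            best = i;
            best_score = s;
        }
    }
    return best;
}
```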
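The forwarding flow of application 20140143492: when a load targets the stack, predict that a buffered store will cover it, query the store buffer first, and only on failure send a separate request to the cache. The lookup helpers below are stubs standing in for a store-buffer CAM and a data-cache port.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stub for the store-buffer CAM: would return true and fill *data when a
   buffered store covers addr. Always misses in this sketch. */
bool store_buffer_lookup(uint64_t addr, uint64_t *data)
{
    (void)addr;
    (void)data;
    return false;
}

/* Stub for a data-cache access. */
uint64_t cache_lookup(uint64_t addr)
{
    return addr;   /* placeholder data */
}

/* Stack accesses tend to hit recent stores, so predict forwarding for
   addresses inside the active stack region (stack grows downward). */
bool predict_forwarded(uint64_t addr, uint64_t stack_base, uint64_t stack_ptr)
{
    return addr >= stack_ptr && addr < stack_base;
}

/* Try the store buffer first when forwarding is predicted; if that attempt
   fails, separately send the request to the cache. */
uint64_t execute_load(uint64_t addr, uint64_t stack_base, uint64_t stack_ptr)
{
    uint64_t data;
    if (predict_forwarded(addr, stack_base, stack_ptr) &&
        store_buffer_lookup(addr, &data))
        return data;
    return cache_lookup(addr);
}
```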
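Application 20140149772 extrapolates the next idle-period duration linearly from previous ones and picks an idle state to match. The two-sample extrapolation and the break-even constant below are assumptions; the abstract only says the prediction is linear and based on one or more previous idle periods.

```c
#include <stdint.h>

#define DEEP_IDLE_BREAK_EVEN_US 500   /* assumed cost to enter/exit deep idle */

enum idle_state { IDLE_SHALLOW, IDLE_DEEP };

struct idle_history {
    uint32_t prev_us;   /* duration of the idle period before last */
    uint32_t last_us;   /* duration of the most recent idle period */
};

/* Linear prediction: continue the trend of the last two idle periods,
   clamped at zero. */
uint32_t predict_next_idle_us(const struct idle_history *h)
{
    int64_t next = (int64_t)h->last_us + ((int64_t)h->last_us - (int64_t)h->prev_us);
    return next > 0 ? (uint32_t)next : 0;
}

/* Deep idle only pays off when the predicted idle period is long enough
   to amortize the cost of entering and leaving the deep state. */
enum idle_state choose_idle_state(const struct idle_history *h)
{
    return predict_next_idle_us(h) >= DEEP_IDLE_BREAK_EVEN_US ? IDLE_DEEP
                                                              : IDLE_SHALLOW;
}
```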
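Application 20140181410 wakes up with the cache at a minimum size and grows it as demand appears. The sketch keys on the eviction-rate variant mentioned in the abstract; the way counts, growth step, and rate threshold are invented.

```c
#define MIN_WAYS           2
#define MAX_WAYS           16
#define EVICTIONS_TO_GROW  100   /* evictions per interval that trigger growth */

struct resizable_cache {
    int ways_enabled;             /* ways currently available for data */
    int evictions_this_interval;
};

/* On exit from a low-power state, start with the minimum size so fewer
   than all entries are powered. */
void on_exit_low_power(struct resizable_cache *c)
{
    c->ways_enabled = MIN_WAYS;
    c->evictions_this_interval = 0;
}

/* Called once per monitoring interval: a high eviction rate means the
   reduced cache is hurting performance, so enable more ways. */
void on_interval_tick(struct resizable_cache *c)
{
    if (c->evictions_this_interval >= EVICTIONS_TO_GROW &&
        c->ways_enabled < MAX_WAYS)
        c->ways_enabled *= 2;
    c->evictions_this_interval = 0;
}
```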
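Application 20140297961 skips the cache fill on a write miss when three checks all pass. The sketch transcribes the three determinations directly, modeling the write-history indicator as a single bit and assuming a downward-growing stack, so the newly allocated region lies between the current and previous stack pointers.

```c
#include <stdbool.h>
#include <stdint.h>

struct write_miss_ctx {
    bool     line_present;       /* determination 1: line already cached? */
    uint64_t prev_stack_ptr;     /* stack pointer before its last adjustment */
    uint64_t cur_stack_ptr;      /* current stack pointer (stack grows down) */
    bool     write_history_set;  /* determination 3: history indicator bit */
};

/* The write may proceed without waiting for a cache fill when the address
   misses, lies in the newly allocated stack region, and the write-history
   indicator has the specified value. */
bool can_skip_fill(const struct write_miss_ctx *c, uint64_t addr)
{
    bool in_new_stack_region =
        addr >= c->cur_stack_ptr && addr < c->prev_stack_ptr;
    return !c->line_present && in_new_stack_region && c->write_history_set;
}
```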
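Application 20150067264 tracks evictions within a group of correlated cache lines and, once an eviction criterion for the group is met, selects the remaining members for eviction too. The "half the group already gone" rule below is an assumed instance of such a criterion.

```c
#include <stdbool.h>

#define GROUP_SIZE 8

struct line_group {
    bool resident[GROUP_SIZE];   /* which group members are still cached */
};

int residents(const struct line_group *g)
{
    int n = 0;
    for (int i = 0; i < GROUP_SIZE; i++)
        n += g->resident[i];
    return n;
}

/* Called when one member of the group is evicted. Returns true when the
   criterion is met and the remaining members should be selected for
   eviction as well. */
bool on_group_member_evicted(struct line_group *g, int idx)
{
    g->resident[idx] = false;
    return residents(g) <= GROUP_SIZE / 2;
}
```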
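Finally, the decision rule of application 20150185801, which weighs cache dirtiness against a predicted idle duration. How the two inputs combine is not specified in the abstract; the conjunction below is one plausible reading.

```c
#include <stdint.h>

#define DIRTINESS_THRESHOLD 1000   /* dirty lines; illustrative value */
#define GATE_BREAK_EVEN_US  800    /* idle time needed to amortize gating */

enum component_state { STATE_IDLE, STATE_POWER_GATED };

/* Power gate when dirtiness exceeds the threshold and the component is
   predicted to stay idle long enough to amortize the gating cost;
   otherwise keep it in the cheaper-to-exit idle state. */
enum component_state decide_power_state(uint32_t dirty_lines,
                                        uint32_t predicted_idle_us)
{
    if (dirty_lines > DIRTINESS_THRESHOLD &&
        predicted_idle_us >= GATE_BREAK_EVEN_US)
        return STATE_POWER_GATED;
    return STATE_IDLE;
}
```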