Entries |
Document | Title | Date |
20080215817 | MEMORY MANAGEMENT SYSTEM AND IMAGE PROCESSING APPARATUS - A memory management system includes a plurality of processors, a shared memory that can be accessed from the plurality of processors, and cache memories provided between each processor of the plurality of processors and the shared memory; invalidation or write back of a specified region can be commanded from a program running on a processor. Programs running on each processor invalidate an input data region of a cache memory with an invalidation command immediately before execution of a program as a processing batch, and write back an output data region of a cache memory to the shared memory with a write back command immediately after execution of a program as a processing batch. | 09-04-2008 |
20080235456 | Shared Cache Eviction - Methods and systems for shared cache eviction in a multi-core processing environment having a cache shared by a plurality of processor cores are provided. Embodiments include receiving from a processor core a request to load a cache line in the shared cache; determining whether the shared cache is full; determining whether a cache line is stored in the shared cache that has been accessed by fewer than all the processor cores sharing the cache if the shared cache is full; and evicting a cache line that has been accessed by fewer than all the processor cores sharing the cache if a cache line is stored in the shared cache that has been accessed by fewer than all the processor cores sharing the cache. | 09-25-2008 |
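The eviction test in 20080235456 above lends itself to a compact illustration. Below is a minimal C sketch, not the patented implementation: each line carries a bitmask of the cores that have accessed it, and a full set prefers to evict a line that fewer than all cores have touched. All names and sizes are assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

#define WAYS      8
#define NUM_CORES 4
#define ALL_CORES ((1u << NUM_CORES) - 1u)

/* Hypothetical line state: a bitmask records which cores accessed it. */
typedef struct {
    uint64_t tag;
    uint32_t accessed_by; /* bit i set => core i has accessed this line */
    bool     valid;
} cache_line_t;

/* Record an access so later eviction decisions can see the sharer set. */
static void record_access(cache_line_t *line, int core_id)
{
    line->accessed_by |= 1u << core_id;
}

/* On a full set, prefer a victim that fewer than all cores have
 * accessed; if every line is fully shared, fall back to way 0. */
static int pick_victim(cache_line_t set[WAYS])
{
    for (int w = 0; w < WAYS; w++)
        if (set[w].valid && set[w].accessed_by != ALL_CORES)
            return w;
    return 0;
}
```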
20080235457 | Dynamic quality of service (QoS) for a shared cache - In one embodiment, the present invention includes a method for associating a first priority indicator with data stored in a first entry of a shared cache memory by a core to indicate a priority level of a first thread, and associating a second priority indicator with data stored in a second entry of the shared cache memory by a graphics engine to indicate a priority level of a second thread. Other embodiments are described and claimed. | 09-25-2008 |
20080244184 | In-memory caching of shared customizable multi-tenant data - In a multi-tenant data sharing environment with shared, customizable data, attributes are assigned to requested data and stored in a cache store along with the requested data. For non-customized data designated as system data, one copy is stored in the cache store for use by multiple tenants, allowing optimization of memory and performance for each data request/retrieval operation. A “delete sentinel” attribute may be assigned to non-existing data in the cache store, enabling notification of requesting tenant(s) without a need to access the tenant data store each time a request for the non-existing data is received. | 10-02-2008 |
20080288722 | Method for Optimization of the Management of a Server Cache Which May be Consulted by Client Terminals with Differing Characteristics - A method is provided for optimisation of the management of a server cache for dynamic pages, which may be consulted by client terminals with differing characteristics and which requires the provision of discrete versions of a dynamic page in the cache. When a terminal requests a dynamic page, a verification step for the presence of at least one version of the dynamic page in the cache is carried out, such that if the verification is positive the following complementary steps are carried out: procurement of a set of characteristics specific to the type of client terminal, determination of a subset of necessary characteristics from amongst the specific characteristics for the reproduction of the dynamic page on a client terminal, search among the version(s) of the dynamic page in the cache for a suitable version using the subset of necessary characteristics, and allocation of the suitable version to the client terminal. | 11-20-2008 |
20080320225 | SYSTEMS AND METHODS FOR CACHING AND SERVING DYNAMIC CONTENT - A web server and a shared caching server are described for serving dynamic content to users of at least two different types, where the different types of users receive different versions of the dynamic content. A version of the dynamic content includes a validation header, such as an ETag, that stores information indicative of the currency of the dynamic content and information indicative of a user type for which the version of the dynamic content is intended. In response to a user request for the dynamic content, the shared caching server sends a validation request to the web server with the validation header information. The web server determines, based on the user type of the requestor and/or on the currency of the cached dynamic content, whether to instruct the shared caching server to send the cached content or to send updated content for serving to the user. | 12-25-2008 |
20090013133 | CACHE LINE MARKING WITH SHARED TIMESTAMPS - Embodiments of the present invention provide a system that marks cache lines using shared timestamps. During operation, the system starts a transaction for a thread, wherein starting the transaction involves recording the value of an active timestamp and incrementing a transaction or overflow counter (TO_counter) corresponding to the recorded value. The system then places load-marks on cache lines which are loaded during the transaction. While placing the load-marks, the system writes the recorded value into metadata corresponding to the cache lines. Upon completing the transaction for the thread, the system decrements the TO_counter corresponding to the recorded value and resumes non-transactional execution for the thread without removing the load-marks from cache lines which were load-marked during the transaction. | 01-08-2009 |
20090024799 | Technique for preserving cached information during a low power mode - A technique to retain cached information during a low power mode, according to at least one embodiment. In one embodiment, information stored in a processor's local cache is saved to a shared cache before the processor is placed into a low power mode, such that other processors may access information from the shared cache instead of causing the low power mode processor to return from the low power mode to service an access to its local cache. | 01-22-2009 |
20090077320 | DIRECT ACCESS OF CACHE LOCK SET DATA WITHOUT BACKING MEMORY - Apparatus and system for quickly accessing data residing in a cache of one processor, by another processor, while avoiding lengthy accesses to main memory are provided. A portion of the cache may be placed in a lock set mode by the processor in which it resides. While in the lock set mode, this portion of the cache may be accessed directly by another processor without lengthy “backing” writes of the accessed data to main memory. | 03-19-2009 |
20090083490 | System to Improve Data Store Throughput for a Shared-Cache of a Multiprocessor Structure and Associated Methods - A system to improve data store throughput for a shared-cache of a multiprocessor structure that may include a controller to find and compare a last data store address for a last data store with a next data store address for a next data store. The system may also include a main pipeline to receive the last data store, and to receive the next data store if the next data store address differs substantially from the last data store address. The system may further include a store pipeline to receive the next data store if the next data store address is substantially similar to the last data store address. | 03-26-2009 |
20090106495 | FAST INTER-STRAND DATA COMMUNICATION FOR PROCESSORS WITH WRITE-THROUGH L1 CACHES - A method is disclosed that uses a non-coherent store instruction to reduce inter-thread communication latency between threads sharing a level one write-through cache. When a thread executes the non-coherent store instruction, the level one cache is immediately updated with the data value. The data value is immediately available to another thread sharing the level-one write-through cache. A computer system having reduced inter-thread communication latency is disclosed. The computer system includes a first plurality of processor cores, each processor core including a second plurality of processing engines sharing a level one write-through cache. The level one caches are connected to a level two cache via a crossbar switch. The computer system further implements a non-coherent store instruction that updates a data value in the level one cache prior to updating the corresponding data value in the level two cache. | 04-23-2009 |
20090119459 | LATE LOCK ACQUIRE MECHANISM FOR HARDWARE LOCK ELISION (HLE) - A method and apparatus for a late lock acquire mechanism is herein described. In response to detecting a late-lock acquire event, such as expiration of a timer, a full cache set, or an irrevocable event, a late-lock acquire may be initiated. Consecutive critical sections are stalled until a late-lock acquire is completed utilizing fields of access buffer entries associated with consecutive critical section operations. | 05-07-2009 |
20090138660 | Power-aware line intervention for a multiprocessor snoop coherency protocol - A snoop coherency method, system and program are provided for intervening a requested cache line from a plurality of candidate memory sources in a multiprocessor system on the basis of the sensed temperature or power dissipation value at each memory source. By providing temperature or power dissipation sensors in each of the candidate memory sources (e.g., at cores, cache memories, memory controller, etc.) that share a requested cache line, control logic may be used to determine which memory source should source the cache line by using the power sensor signals to signal only the memory source with acceptable power dissipation to provide the cache line to the requester. | 05-28-2009 |
20090157970 | METHOD AND SYSTEM FOR INTELLIGENT AND DYNAMIC CACHE REPLACEMENT MANAGEMENT BASED ON EFFICIENT USE OF CACHE FOR INDIVIDUAL PROCESSOR CORE - Determining and applying a cache replacement policy for a computer application running in a computer processing system is accomplished by receiving a processor core data request, adding bits on each cache line of a plurality of cache lines to identify a core ID of at least one processor core that provides each cache line in a shared cache, allocating a tag table for each processor core, where the tag table keeps track of an index of processor core miss rates, and setting a threshold to define a level of cache usefulness, depending on whether or not the index of processor core miss rates exceeds the threshold. The threshold is then checked: when it is not exceeded, a standard shared cache replacement policy is applied; when it is exceeded, the cache line from the processor core running the application is evicted from the shared cache. | 06-18-2009 |
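To make the threshold test in 20090157970 concrete, here is a hedged C sketch; the tag-table layout and the misses-per-1000-accesses index are illustrative assumptions, not the patent's definitions.

```c
#include <stdbool.h>

/* Hypothetical per-core tag-table summary; the patent's tag table
 * tracks an index of miss rates, encoded here as plain counters. */
typedef struct {
    unsigned misses;
    unsigned accesses;
} core_tag_table_t;

/* True when the core's miss-rate index (misses per 1000 accesses, an
 * invented scale) exceeds the usefulness threshold; the caller then
 * evicts the requesting core's line instead of applying the standard
 * shared-cache replacement policy. */
static bool exceeds_threshold(const core_tag_table_t *t, unsigned threshold)
{
    if (t->accesses == 0)
        return false;
    unsigned miss_index = (t->misses * 1000u) / t->accesses;
    return miss_index > threshold;
}
```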
20090164731 | SYSTEM AND METHOD FOR OPTIMIZING NEIGHBORING CACHE USAGE IN A MULTIPROCESSOR ENVIRONMENT - A method for managing data operates in a data processing system with a system memory and a plurality of processing units (PUs), each PU having a cache comprising a plurality of cache lines, each cache line having one of a plurality of coherency states, and each PU coupled to at least another one of the plurality of PUs. A first PU selects a castout cache line of a plurality of cache lines in a first cache of the first PU to be castout of the first cache. The first PU sends a request to a second PU, wherein the second PU is a neighboring PU of the first PU, and the request comprises a first address and first coherency state of the selected castout cache line. The second PU determines whether the first address matches an address of any cache line in the second PU. The second PU sends a response to the first PU based on a coherency state of each of a plurality of cache lines in the second cache and whether there is an address hit. The first PU determines whether to transmit the castout cache line to the second PU based on the response. And, in the event the first PU determines to transmit the castout cache line to the second PU, the first PU transmits the castout cache line to the second PU. | 06-25-2009 |
20090164732 | CACHE MEMORY SYSTEM AND CACHE MEMORY CONTROL METHOD - A cache memory system, which is individually connected to each of a plurality of arithmetic units that access a shared memory to carry out parallel processing, includes: a data array that has a plurality of blocks that are composed of a plurality of words; a storage unit that, with respect to a block, which stores data in at least one of the words, among the plurality of blocks, stores an address group of the shared memory that is placed in correspondence with that block; a write unit that, when an address from said arithmetic unit is not in the storage unit at the time of writing of data from the arithmetic unit, allocates any of the plurality of blocks as a block for writing, places any word in that block for writing in correspondence with the address, and writes the data from the arithmetic unit to the word; a word state storage unit that stores word state information for specifying a word, into which the data from the arithmetic unit have been written, in association with an address that has been placed in correspondence with the word; and a data transfer unit that, when the block for writing is replaced with a different block, refers to the word state storage unit, specifies one or a plurality of words, into which the data have been written, within the block for writing, and performs write-back of data in the one or plurality of specified words to a corresponding block in the shared memory. | 06-25-2009 |
20090187713 | UTILIZING CACHE INFORMATION TO MANAGE MEMORY ACCESS AND CACHE UTILIZATION - A method and system of managing data access in a shared memory cache of a processor are disclosed. The method includes probing one or more memory addresses that map to a subset of the shared memory cache and sensing a plurality of events in the one or more memory addresses. Cache utilization information is then obtained by reading a hardware performance counter of the processor. The hardware performance counter is incremented based on the occurrence of the plurality of events. Based upon the cache utilization information, an occurrence of one of the plurality of events is reduced. | 07-23-2009 |
20090222625 | CACHE MISS DETECTION IN A DATA PROCESSING APPARATUS - A data processing apparatus and method are provided for detecting cache misses. The data processing apparatus has processing logic for executing a plurality of program threads, and a cache for storing data values for access by the processing logic. When access to a data value is required while executing a first program thread, the processing logic issues an access request specifying an address in memory associated with that data value, and the cache is responsive to the address to perform a lookup procedure to determine whether the data value is stored in the cache. Indication logic is provided which in response to an address portion of the address provides an indication as to whether the data value is stored in the cache, this indication being produced before a result of the lookup procedure is available, and the indication logic only issuing an indication that the data value is not stored in the cache if that indication is guaranteed to be correct. Control logic is then provided which, if the indication indicates that the data value is not stored in the cache, uses that indication to control a process having an effect on a program thread other than the first program thread. | 09-03-2009 |
20100005244 | Device and Method for Storing Data and/or Instructions in a Computer System Having At Least Two Processing Units and At Least One First Memory or Memory Area for Data and/or Instructions - A device and method for storing data and/or instructions in a computer system having at least two processing units and at least one first memory or memory area for data and/or instructions, wherein a second memory or memory area is included in the device, the device being designed as a cache memory system and equipped with at least two separate ports, and the at least two processing units accessing via these ports the same or different memory cells of the second memory or memory area, the data and/or instructions from the first memory system being stored temporarily in blocks. | 01-07-2010 |
20100011167 | Heterogeneous processors sharing a common cache - A multi-core processor providing heterogeneous processor cores and a shared cache is presented. | 01-14-2010 |
20100049921 | Distributed Shared Caching for Clustered File Systems - Systems and methods for distributed shared caching in a clustered file system, wherein coordination between the distributed caches, their coherency and concurrency management, are all done based on the granularity of data segments rather than files. As a consequence, this new caching system and method provides enhanced performance in an environment of intensive access patterns to shared files. | 02-25-2010 |
20100077150 | ADVANCED PROCESSOR WITH CACHE COHERENCY - An advanced processor comprises a plurality of multithreaded processor cores each having a data cache and instruction cache. A data switch interconnect is coupled to each of the processor cores and configured to pass information among the processor cores. A messaging network is coupled to each of the processor cores and a plurality of communication ports. In one aspect of an embodiment of the invention, the data switch interconnect is coupled to each of the processor cores by its respective data cache, and the messaging network is coupled to each of the processor cores by its respective message station. Advantages of the invention include the ability to provide high bandwidth communications between computer systems and memory in an efficient and cost-effective manner. | 03-25-2010 |
20100088471 | FIELD DEVICE - Disclosed is a field device comprising: a storage section to store shared data which is shared between user modules; an interface section to obtain trigger information to access the shared data, and to output access information which includes an access content of the shared data and an access request of the shared data, based on the obtained trigger information; and a control section to access the shared data which is stored in the storage section, based on the access information which has been output by the interface section. | 04-08-2010 |
20100088472 | DATA PROCESSING SYSTEM AND CACHE CONTROL METHOD - A data processing system is provided. The data processing system includes a plurality of processors and a cache memory shared by the plurality of processors, in which memory a cache line is divided into a plurality of partial writable regions. The plurality of processors are given exclusive access rights to the partial writable regions. | 04-08-2010 |
20100095068 | CACHE MEMORY CONTROL DEVICE AND PIPELINE CONTROL METHOD - With a view to reducing the congestion of a pipeline for cache memory access in, for example, a multi-core system, a cache memory control device includes: a determination unit for determining whether or not a command provided from, for example, each core is to access cache memory during the execution of the command; and a path switch unit for putting a command determined as accessing the cache memory in pipeline processing, and outputting a command determined as not accessing the cache memory directly to an external unit without putting the command in the pipeline processing. | 04-15-2010 |
20100100686 | Cache controller and control method - In such a configuration that a port unit is provided which takes a form being shared among threads and has a plurality of entries for holding access requests, and the access requests for a cache shared by a plurality of threads being executed at the same time are controlled using the port unit, the access request issued from each thread is registered on a port section of the port unit which is assigned to the thread, thereby controlling the port unit to be divided for use in accordance with the thread configuration. In selecting the access request, the access requests are selected for each thread based on the specified priority control from among the access requests issued from the threads held in the port unit, and thereafter a final access request is selected in accordance with a thread selection signal from among those selected access requests. In accordance with such a configuration, the cache access processing can be carried out while reducing the amount of resources of the port unit and assuring effective use of such resources. | 04-22-2010 |
20100115204 | NON-UNIFORM CACHE ARCHITECTURE (NUCA) - In one embodiment, a cache memory includes a cache array including a plurality of entries for caching cache lines of data, where the plurality of entries are distributed between a first region implemented in a first memory technology and a second region implemented in a second memory technology. The cache memory further includes a cache directory of the contents of the cache array and a cache controller that controls operation of the cache memory. | 05-06-2010 |
20100131716 | CACHE MEMORY SHARING IN A MULTI-CORE PROCESSOR (MCP) - This invention describes an apparatus, computer architecture, memory structure, memory control, and cache memory operation method for multi-core processor. A logic core shares requests when faced with immediate cache memory units having low yield or deadly performance. The core mounts (multiple) cache unit(s) that might already be in use by other logic cores. Selected cache memory units serve multiple logic cores with the same contents. The shared cache memory unit(s) serves all the mounting cores with cache search, hit, miss, and write back functions. The method recovers a logic core whose cache memory block is not operational by sharing cache memory blocks which might already engage other logic cores. The method is used to improve reliability and performance of the remaining system. | 05-27-2010 |
20100138612 | SYSTEM AND METHOD FOR IMPLEMENTING CACHE SHARING - A system for implementing cache sharing includes a main control unit and a plurality of service processing units, and further includes a shared cache unit respectively connected with the main control unit and the service processing units for implementing high-speed data interaction among the service processing units. A method for cache sharing is also provided. In embodiments of the present invention, based on a reliable high-speed bus, a high-speed shared cache is provided. A mutual exclusion scheme is provided in the shared cache to ensure data consistency, which not only implements high-speed data sharing but also dramatically improves system performance. | 06-03-2010 |
20100153649 | SHARED CACHE MEMORIES FOR MULTI-CORE PROCESSORS - Embodiments of shared cache memories for multi-core processors are presented. In one embodiment, a cache memory comprises a group of sampling cache sets and a controller to determine a number of misses that occur in the group of sampling cache sets. The controller is operable to determine a victim cache line for a cache set based at least in part on the number of misses. | 06-17-2010 |
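Set sampling as described in 20100153649 can be sketched briefly in C. The stride, counter placement, and set count below are invented for illustration; only the idea of counting misses in a group of sampling sets comes from the abstract.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_SETS      1024
#define SAMPLE_STRIDE 32   /* assumption: every 32nd set is a sampling set */

static uint32_t sample_misses; /* running miss count the controller consults */

static bool is_sampling_set(uint32_t set_index)
{
    return (set_index % SAMPLE_STRIDE) == 0;
}

/* Called on every miss; only sampling sets feed the counter that the
 * victim-selection logic later uses to choose a victim cache line. */
static void on_cache_miss(uint32_t set_index)
{
    if (is_sampling_set(set_index))
        sample_misses++;
}
```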
20100174868 | Processor device having a sequential data processing unit and an arrangement of data processing elements - The design of a coupling between a traditional processor, in particular a sequential processor, and a reconfigurable field of data processing units, in particular a runtime-reconfigurable field of data processing units, is described. | 07-08-2010 |
20100185817 | Methods and Systems for Implementing Transcendent Page Caching - This disclosure describes, generally, methods and systems for implementing transcendent page caching. The method includes establishing a plurality of virtual machines on a physical machine. Each of the plurality of virtual machines includes a private cache, and a portion of each of the private caches is used to create a shared cache maintained by a hypervisor. The method further includes delaying the removal of at least one stored memory page, storing the at least one stored memory page in the shared cache, and requesting, by one of the plurality of virtual machines, the at least one stored memory page from the shared cache. Further, the method includes determining that the at least one stored memory page is stored in the shared cache, and transferring the at least one stored memory page to the one of the plurality of virtual machines. | 07-22-2010 |
20100185818 | RESOURCE POOL MANAGING SYSTEM AND SIGNAL PROCESSING METHOD - A resource pool managing system and a signal processing method are provided in embodiments of the present disclosure. On the basis of the resource pool, all filters on links share one set of operation resources and cached resources. The embodiments can be adapted to support application scenarios with unequal carrier rates while mixed modes are supported, as well as application scenarios with unequal carrier filter orders. The embodiments also allow each stage of filters of the mode-mixing system to share one set of multiply-add and cached resources, unifying the dispatching of resources in one resource pool and maximizing the utilization of resources, and support the parameterized configuration of the link's forward-backward stages, link parameters, carrier rate, and so on. | 07-22-2010 |
20100223431 | MEMORY ACCESS CONTROL SYSTEM, MEMORY ACCESS CONTROL METHOD, AND PROGRAM THEREOF - In a multi-core processor of a shared-memory type, deterioration in the data processing capability caused by competitions of memory accesses from a plurality of processors is suppressed effectively. In a memory access controlling system for controlling accesses to a cache memory in a data read-ahead process when the multi-core processor of a shared-memory type performs a task including a data read-ahead thread for executing data read-ahead and a parallel execution thread for performing an execution process in parallel with the data read-ahead, the system includes a data read-ahead controller which controls an interval between data read-ahead processes in the data read-ahead thread adaptive to a data flow which varies corresponding to an input value of the parallel process in the parallel execution thread. By controlling the interval between the data read-ahead processes, competitions of memory accesses in the multi-core processor are suppressed. | 09-02-2010 |
20100235581 | Cooperative Caching Technique - A method of caching data in a global cache distributed amongst a plurality of computing devices, comprising providing a global cache for caching data accessible to interconnected client devices, where each client contributes a portion of its main memory to the global cache. Each client also maintains an ordering of data that it has in its cache portion. When a remote reference for a cached datum is made, both the supplying client and the requesting client adjust their orderings to reflect the fact that a number of copies of the requested datum now likely exist in the global cache. | 09-16-2010 |
20100241809 | PROCESSOR, SERVER SYSTEM, AND METHOD FOR ADDING A PROCESSOR - A processor according to an exemplary embodiment of the invention includes a first initialization unit which reads a first program for checking a reliability of the processor into a cache memory and executes the first program when the processor is started up, and a second initialization unit which reads a second program for checking a reliability of the cache memory into a predetermined memory area and executes the second program when the second initialization unit receives a notification indicating the completion of the establishment of a communication path between the predetermined memory area and the processor from another processor which exists in a partition in which the processor is added. | 09-23-2010 |
20100241810 | Method and System for Dynamic Distributed Data Caching - A method and system for dynamic distributed data caching is presented. The method includes providing a cache community. | 09-23-2010 |
20100268890 | INFORMATION HANDLING SYSTEM WITH IMMEDIATE SCHEDULING OF LOAD OPERATIONS IN A DUAL-BANK CACHE WITH SINGLE DISPATCH INTO WRITE/READ DATA FLOW - An information handling system (IHS) includes a processor with a cache memory system. The processor includes a processor core with an L1 cache memory that couples to an L2 cache memory. The processor includes an arbitration mechanism that controls load and store requests to the L2 cache memory. The arbitration mechanism includes control logic that enables a load request to interrupt a store request that the L2 cache memory is currently servicing. The L2 cache memory includes dual data banks so that one bank may perform a load operation while the other bank performs a store operation. The cache system provides a single dispatch point into the data flow to the dual cache banks of the L2 cache memory. | 10-21-2010 |
20100268891 | Allocation of memory space to individual processor cores - Techniques are generally described for a multi-core processor with a plurality of processor cores. At least one cache is accessible to at least two of the plurality of processor cores. The multi-core processor can be configured for separately allocating a memory space within the cache to the individual processor cores accessing the cache. | 10-21-2010 |
20100274973 | DATA REORGANIZATION IN NON-UNIFORM CACHE ACCESS CACHES - Embodiments that dynamically reorganize data of cache lines in non-uniform cache access (NUCA) caches are contemplated. Various embodiments comprise a computing device, having one or more processors coupled with one or more NUCA cache elements. The NUCA cache elements may comprise one or more banks of cache memory, wherein ways of the cache are horizontally distributed across multiple banks. To improve access latency of the data by the processors, the computing devices may dynamically propagate cache lines into banks closer to the processors using the cache lines. To accomplish such dynamic reorganization, embodiments may maintain “direction” bits for cache lines. The direction bits may indicate to which processor the data should be moved. Further, embodiments may use the direction bits to make cache line movement decisions. | 10-28-2010 |
20100281220 | Predictive ownership control of shared memory computing system data - A method, circuit arrangement, and design structure utilize a lock prediction data structure to control ownership of a cache line in a shared memory computing system. In a first node among the plurality of nodes, lock prediction data in a hardware-based lock prediction data structure for a cache line associated with a first memory request is updated in response to that first memory request, wherein at least a portion of the lock prediction data is predictive of whether the cache line is associated with a release operation. The lock prediction data is then accessed in response to a second memory request associated with the cache line and issued by a second node and a determination is made as to whether to transfer ownership of the cache line from the first node to the second node based at least in part on the accessed lock prediction data. | 11-04-2010 |
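A rough C sketch of the lock prediction idea in 20100281220 follows, under the assumption that each entry holds a small saturating counter tracking whether the line has been associated with release operations; all field names and thresholds are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical lock prediction entry for one cache line. */
typedef struct {
    uint64_t line_addr;    /* cache line this entry predicts for */
    uint8_t  release_conf; /* saturating counter: higher => likely lock */
    bool     valid;
} lock_pred_entry_t;

/* First node's memory request updates the prediction; observing a
 * release operation strengthens the belief the line guards a lock. */
static void update_on_access(lock_pred_entry_t *e, bool is_release)
{
    if (is_release && e->release_conf < 3)
        e->release_conf++;
    else if (!is_release && e->release_conf > 0)
        e->release_conf--;
}

/* Second node's request: keep ownership at the first node while the
 * line looks lock-protected, otherwise transfer ownership. */
static bool should_transfer_ownership(const lock_pred_entry_t *e)
{
    return !e->valid || e->release_conf < 2;
}
```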
20100332763 | APPARATUS, SYSTEM, AND METHOD FOR CACHE COHERENCY ELIMINATION - An apparatus, system, and method are disclosed for improving cache coherency processing. The method includes determining that a first processor in a multiprocessor system receives a cache miss. The method also includes determining whether an application associated with the cache miss is running on a single processor core and/or whether the application is running on two or more processor cores that share a cache. A cache coherency algorithm is executed in response to determining that the application associated with the cache miss is running on two or more processor cores that do not share a cache, and is skipped in response to determining that the application associated with the cache miss is running on one of a single processor core and two or more processor cores that share a cache. | 12-30-2010 |
20110004729 | Block Caching for Cache-Coherent Distributed Shared Memory - Methods, apparatuses, and systems directed to the caching of blocks of lines of memory in a cache-coherent, distributed shared memory system. Block caches used in conjunction with line caches can be used to store more data with less tag memory space compared to the use of line caches alone and can therefore reduce memory requirements. In one particular embodiment, the present invention manages this caching using a DSM-management chip, after the allocation of the blocks by software, such as a hypervisor. An example embodiment provides processing relating to block caches in cache-coherent distributed shared memory. | 01-06-2011 |
20110029736 | STORAGE CONTROLLER AND METHOD OF CONTROLLING STORAGE CONTROLLER - The storage controller of the present invention is able to reduce the amount of purge message communication and increase the processing performance of the storage controller. Each microprocessor creates and saves a purge message every time control information in the shared memory is updated. After a series of update processes is complete, the saved purge messages are transmitted to each microprocessor. An attribute corresponding to its characteristics is set for the control information, and cache control and purge control are executed depending on the attribute. | 02-03-2011 |
20110055487 | OPTIMIZING MEMORY COPY ROUTINE SELECTION FOR MESSAGE PASSING IN A MULTICORE ARCHITECTURE - In one embodiment, the present invention includes a method to obtain topology information regarding a system including at least one multicore processor, provide the topology information to a plurality of parallel processes, generate a topological map based on the topology information, access the topological map to determine a topological relationship between a sender process and a receiver process, and select a given memory copy routine to pass a message from the sender process to the receiver process based at least in part on the topological relationship. Other embodiments are described and claimed. | 03-03-2011 |
20110072217 | Distributed Consistent Grid of In-Memory Database Caches - A plurality of mid-tier databases form a single, consistent cache grid for data in one or more backend data sources, such as a database system. The mid-tier databases may be standard relational databases. Cache agents at each mid-tier database swap in data from the backend database as needed. Consistency in the cache grid is maintained by ownership locks. Cache agents prevent database operations that will modify cached data in a mid-tier database unless and until ownership of the cached data can be acquired for the mid-tier database. Cache groups define what backend data may be cached, as well as a general structure in which the backend data is to be cached. Metadata for cache groups is shared to ensure that data is cached in the same form throughout the entire grid. Ownership of cached data can then be tracked through a mapping of cached instances of data to particular mid-tier databases. | 03-24-2011 |
20110113199 | PREFETCH OPTIMIZATION IN SHARED RESOURCE MULTI-CORE SYSTEMS - An apparatus and method is described herein for optimization of prefetch throttling, which potentially enhances performance, reduces power consumption, and maintains positive gain for workloads that benefit from prefetching. More specifically, the optimizations described herein allow for bandwidth congestion and prefetch accuracy to be taken into account as feedbacks for throttling at the source of prefetch generation. As a result, when there is low congestion, full prefetch generation is allowed, even if the prefetch is inaccurate, since there is available bandwidth. However, when congestion is high, the determination of throttling falls to prefetch accuracy. If accuracy is high (the miss rate is low), then less throttling is needed, because the prefetches are being utilized and performance is being enhanced. Yet, if prefetch accuracy is low (the miss rate is high), then more prefetch throttling is needed to save power, because the prefetches are not being utilized and performance is not being enhanced by the large number of prefetches. | 05-12-2011 |
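The two-feedback policy in 20110113199 reduces to a small decision function. The C sketch below is illustrative only; the percentage thresholds and the four-level throttle scale are assumptions, not values from the patent.

```c
/* Hypothetical feedback inputs to the prefetch throttle. */
typedef struct {
    unsigned bus_occupancy_pct; /* bandwidth congestion feedback */
    unsigned useless_pct;       /* fraction of prefetches never used */
} prefetch_feedback_t;

/* Returns a throttle level: 0 = full prefetch generation,
 * 3 = heavily throttled. Low congestion always permits full
 * generation; under high congestion, accuracy decides. */
static int prefetch_throttle_level(const prefetch_feedback_t *f)
{
    if (f->bus_occupancy_pct < 50)
        return 0;              /* bandwidth to spare: do not throttle */
    if (f->useless_pct < 25)
        return 1;              /* accurate prefetches: throttle lightly */
    if (f->useless_pct < 50)
        return 2;
    return 3;                  /* congested and inaccurate: clamp hard */
}
```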
20110113200 | METHODS AND APPARATUSES FOR CONTROLLING CACHE OCCUPANCY RATES - Embodiments of an apparatus for controlling cache occupancy rates are presented. In one embodiment, an apparatus comprises a controller and monitor logic. The monitor logic determines a monitored occupancy rate associated with a first program class. The controller regulates a first allocation probability corresponding to the first program class, based at least on the difference between a requested occupancy rate and the monitored occupancy rate. | 05-12-2011 |
20110119447 | METHOD AND APPARATUS FOR MANAGING MEMORY IN A MOBILE ELECTRONIC DEVICE - According to embodiments described in the specification, a method and apparatus for managing memory in a mobile electronic device are provided. The method comprises: receiving a request to install an application; receiving at least one indication of data intended to be maintained in a shared cache; determining, based on the at least one indication, whether data corresponding to the intended data exists in the shared cache; upon a negative determination, writing the intended data to the shared cache; and repeating the receiving at least one indication, the determining and the writing for at least one additional application. | 05-19-2011 |
20110125971 | Shared Upper Level Cache Architecture - Various implementations of shared upper level cache architectures are disclosed. | 05-26-2011 |
20110138128 | Technique for tracking shared data in a multi-core processor or multi-processor system - A technique to track shared information in a multi-core processor or multi-processor system. In one embodiment, core identification information (“core IDs”) are used to track shared information among multiple cores in a multi-core processor or multiple processors in a multi-processor system. | 06-09-2011 |
20110145505 | Assigning Cache Priorities to Virtual/Logical Processors and Partitioning a Cache According to Such Priorities - Mechanisms are provided, for implementation in a data processing system having at least one physical processor and at least one associated cache memory, for allocating cache resources of the at least one cache memory to virtual processors of the data processing system. The mechanisms identify a plurality of high priority virtual processors in the data processing system. The mechanisms further determine a percentage of cache lines of the at least one cache memory to be assigned to high priority virtual processors. Moreover, the mechanisms mark a portion of the cache lines in the at least one cache memory as being evictable by only high priority virtual processors based on the determined percentage of cache lines to be assigned to high priority virtual processors. The marked portion of the cache lines cannot be evicted by lower priority virtual processors having a priority lower than the high priority virtual processors. | 06-16-2011 |
20110145506 | Replacing Cache Lines In A Cache Memory - In one embodiment, the present invention includes a cache memory including cache lines that each have a tag field including a state portion to store a cache coherency state of data stored in the line and a weight portion to store a weight corresponding to a relative importance of the data. In various implementations, the weight can be based on the cache coherency state and a recency of usage of the data. Other embodiments are described and claimed. | 06-16-2011 |
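A minimal C sketch of the weighted replacement described in 20110145506: a weight is derived from the coherency state and recency, and the minimum-weight line is the victim. The particular weighting is an assumption for illustration, not the claimed formula.

```c
#include <stdint.h>

typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } coh_state_t;

/* Hypothetical line state: coherency state plus an LRU-style recency. */
typedef struct {
    coh_state_t state;
    uint8_t     recency; /* 0 = least recently used */
} weighted_line_t;

/* Invented weighting: dirty (MODIFIED) data is costlier to lose, so it
 * adds to the recency-derived weight. */
static uint8_t line_weight(const weighted_line_t *l)
{
    uint8_t state_bonus = (l->state == MODIFIED)  ? 4 :
                          (l->state == EXCLUSIVE) ? 2 : 0;
    return (uint8_t)(l->recency + state_bonus);
}

/* The replacement victim is the line with the smallest weight. */
static int pick_min_weight_victim(const weighted_line_t set[], int ways)
{
    int victim = 0;
    for (int w = 1; w < ways; w++)
        if (line_weight(&set[w]) < line_weight(&set[victim]))
            victim = w;
    return victim;
}
```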
20110153948 | SYSTEMS, METHODS, AND APPARATUS FOR MONITORING SYNCHRONIZATION IN A DISTRIBUTED CACHE - Systems, apparatus, and methods of monitoring synchronization in a distributed cache are described. In an exemplary embodiment, a first and second processing core process a first and second thread respectively. First and second distributed cache slices store data for either or both of the first and second processing cores. A first and second core interface co-located with the first and second processing cores respectively maintain a finite state machine (FSM) to be executed in response to receiving a request from a thread of its co-located processing core to monitor a cache line in the distributed cache. | 06-23-2011 |
20110161596 | DIRECTORY-BASED COHERENCE CACHING - Techniques are generally described for methods, systems, data processing devices and computer readable media related to multi-core parallel processing directory-based cache coherence. Example systems may include one multi-core processor or multiple multi-core processors. An example multi-core processor includes a plurality of processor cores, each of the processor cores having a respective cache. The system may further include a main memory coupled to each multi-core processor. A directory descriptor cache may be associated with the plurality of the processor cores, where the directory descriptor cache may be configured to store a plurality of directory descriptors. Each of the directory descriptors may provide an indication of the cache sharing status of a respective cache-line-sized row of the main memory. | 06-30-2011 |
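The directory descriptor of 20110161596 can be illustrated with a conventional sharer bit vector; the C sketch below uses that common encoding as an assumption, since the abstract does not fix a format.

```c
#include <stdint.h>
#include <stdbool.h>

#define MAX_CORES 16

/* Hypothetical descriptor for one cache-line-sized row of main memory. */
typedef struct {
    uint64_t row_addr; /* which row of main memory this entry tracks */
    uint16_t sharers;  /* bit i set => core i caches this row */
    bool     dirty;    /* one core holds a modified copy */
} dir_descriptor_t;

/* On a read miss by `core`, record it as a sharer so a later write
 * knows which caches hold copies. */
static void dir_add_sharer(dir_descriptor_t *d, int core)
{
    d->sharers |= (uint16_t)(1u << core);
}

/* On a write by `core`, every other sharer must be invalidated. */
static uint16_t dir_invalidate_mask(const dir_descriptor_t *d, int core)
{
    return d->sharers & (uint16_t)~(1u << core);
}
```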
20110191542 | SYSTEM-WIDE QUIESCENCE AND PER-THREAD TRANSACTION FENCE IN A DISTRIBUTED CACHING AGENT - Methods and apparatus relating to system-wide quiescence and per-thread transaction fence in a distributed caching agent are described. Some embodiments utilize messages, counters, and/or state machines that support system-wide quiescence and per-thread transaction fence flows. Other embodiments are also disclosed. | 08-04-2011 |
20110197031 | Update Handler For Multi-Channel Cache - Disclosed herein is a miss handler for a multi-channel cache memory, and a method that includes determining a need to update a multi-channel cache memory due at least to one of an occurrence of a cache miss or a data prefetch being needed. The method further includes operating a multi-channel cache miss handler to update at least one cache channel storage of the multi-channel cache memory from a main memory. | 08-11-2011 |
20110208916 | SHARED CACHE CONTROLLER, SHARED CACHE CONTROL METHOD AND INTEGRATED CIRCUIT - A monitoring section | 08-25-2011 |
20110219191 | READER SET ENCODING FOR DIRECTORY OF SHARED CACHE MEMORY IN MULTIPROCESSOR SYSTEM - In a parallel processing system with speculative execution, conflict checking occurs in a directory lookup of a cache memory that is shared by all processors. In each case, the same physical memory address will map to the same set of that cache, no matter which processor originated that access. The directory includes a dynamic reader set encoding, indicating what speculative threads have read a particular line. This reader set encoding is used in conflict checking. A bitset encoding is used to specify particular threads that have read the line. | 09-08-2011 |
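Since 20110219191 explicitly names a bitset encoding for reader sets, a short C sketch is natural: one bit per speculative thread, with a write conflicting when any other thread's bit is set. Field widths are assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

/* Reader set for one cache line, kept in the shared-cache directory. */
typedef struct {
    uint64_t readers; /* bit t set => speculative thread t read this line */
} reader_set_t;

/* Record that `thread_id` has read the line. */
static void mark_reader(reader_set_t *r, unsigned thread_id)
{
    r->readers |= 1ull << thread_id;
}

/* A speculative write by `thread_id` conflicts when any *other*
 * thread appears in the line's reader set. */
static bool write_conflicts(const reader_set_t *r, unsigned thread_id)
{
    return (r->readers & ~(1ull << thread_id)) != 0;
}
```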
20110246721 | METHOD AND APPARATUS FOR PROVIDING AUTOMATIC SYNCHRONIZATION APPLIANCE - A method and apparatus for data backup are disclosed. Embodiments of the method comprise receiving a set of data from a local computer, caching the received data locally on the storage appliance in a buffer module, uploading the cached data to a remote computer, and accessing the set of data using the storage device. Embodiments of the apparatus comprise a network interface module for establishing connection of the storage appliance with at least one computer in a local network and at least one remote computer in a cloud network, a buffer module for receiving data to be backed up from the at least one computer; and a processor. | 10-06-2011 |
20110296113 | RECOVERY IN SHARED MEMORY ENVIRONMENT - A method, system, and computer usable program product for recovery in a shared memory environment are provided in the illustrative embodiments. A core in a multi-core processor is designated as a user level core (ULC), which executes an instruction to modify a memory while executing an application. A second core is designated as an operating system core (OSC), which manages checkpointing of several segments of the shared memory. A set of flags is accessible to a memory controller to manage a shared memory. A flag in the set of flags corresponds to one segment in the segments of the shared memory. A message or instruction for modification of a segment is received. A cache line tracking determination is made whether a cache line used for the modification has already been used for a similar modification. If not, a part of the segment is checkpointed. The modification proceeds after checkpointing. | 12-01-2011 |
20110307665 | PERSISTENT MEMORY FOR PROCESSOR MAIN MEMORY - Subject matter disclosed herein relates to a system of one or more processors that includes persistent memory. | 12-15-2011 |
20110314227 | Horizontal Cache Persistence In A Multi-Compute Node, Symmetric Multiprocessing Computer - Horizontal cache persistence in a multi-compute node, SMP computer, including, responsive to a determination to evict a cache line on a first one of the compute nodes, broadcasting by the first compute node an eviction notice for the cache line; transmitting, by the receiving compute nodes, the state of the cache line, including, if the cache line is missing from a compute node, an indication whether that compute node has cache storage space available for the cache line; determining by the first compute node, according to the states of the cache line and space available, whether the first compute node can evict the cache line without writing the cache line to main memory; and updating by each compute node the state of the cache line in each compute node, in dependence upon one or more of the states of the cache line in all the compute nodes. | 12-22-2011 |
20110320726 | STORAGE APPARATUS AND METHOD FOR CONTROLLING STORAGE APPARATUS - A storage apparatus has a channel board | 12-29-2011 |
20110320727 | DYNAMIC CACHE QUEUE ALLOCATION BASED ON DESTINATION AVAILABILITY - An apparatus for controlling operation of a cache includes a first command queue, a second command queue and an input controller configured to receive requests having a first command type and a second command type and to assign a first request having the first command type to the first command queue and a second request having the first command type to the second command queue in the event that the first command queue has not received an indication that a first dedicated buffer is available. | 12-29-2011 |
20110320728 | PERFORMANCE OPTIMIZATION AND DYNAMIC RESOURCE RESERVATION FOR GUARANTEED COHERENCY UPDATES IN A MULTI-LEVEL CACHE HIERARCHY - A cache includes a cache pipeline, a request receiver configured to receive off chip coherency requests from an off chip cache, and a plurality of state machines coupled to the request receiver. The cache also includes an arbiter coupled between the plurality of state machines and the cache pipeline that is configured to give priority to off chip coherency requests, as well as a counter configured to count the number of coherency requests sent from the cache pipeline to a lower level cache. The cache pipeline is halted from sending coherency requests when the counter exceeds a predetermined limit. | 12-29-2011 |
20110320729 | CACHE BANK MODELING WITH VARIABLE ACCESS AND BUSY TIMES - Various embodiments of the present invention manage access to a cache memory. In one embodiment, a set of cache bank availability vectors are generated based on a current set of cache access requests currently operating on a set of cache banks and at least a variable busy time of a cache memory comprising the set of cache banks. The set of cache bank availability vectors indicate an availability of the set of cache banks. A set of cache access requests for accessing a set of given cache banks within the set of cache banks is received. At least one cache access request in the set of cache access requests is selected to access a given cache bank based on the cache bank availability vectors associated with the given cache bank and the set of access request parameters associated with the at least one cache access request that has been selected. | 12-29-2011 |
20120005430 | STORAGE SYSTEM AND OWNERSHIP CONTROL METHOD FOR STORAGE SYSTEM - Access to various types of resources is controlled efficiently, thereby enhancing the throughput. A storage system includes: a disk device for providing a volume for storing data to a host system; a channel adapter for writing data from the host system to the disk device via a cache memory; a disk adapter for transferring data to and from the disk device; and at least one processor package including a plurality of processors for controlling the channel adapter and the disk adapter; wherein any one of the processor packages includes a processor for incorporatively transferring related types of ownership based on specific control information for managing the plurality of types of ownership for each of the plurality of types of resources. | 01-05-2012 |
20120005431 | Network with Distributed Shared Memory - A computer network with distributed shared memory, including a clustered memory cache aggregated from and comprised of physical memory locations on a plurality of physically distinct computing systems. The clustered memory cache is accessible by a plurality of clients on the computer network and is configured to perform page caching of data items accessed by the clients. The network also includes a policy engine operatively coupled with the clustered memory cache, where the policy engine is configured to control where data items are cached in the clustered memory cache. | 01-05-2012 |
20120023295 | HYBRID ADDRESS MUTEX MECHANISM FOR MEMORY ACCESSES IN A NETWORK PROCESSOR - Described embodiments provide arbitration for a cache of a network processor. Processing modules of the network processor generate memory access requests including a requested address and an ID value corresponding to the requesting processing module. Each request is either a locked request or a simple request. An arbiter determines whether the received requests are locked requests. For each locked request, the arbiter determines whether two or more of the requests are conflicted based on the requested address of each received memory requests. If one or more of the requests are non-conflicted, the arbiter determines, for each non-conflicted request, whether the requested addresses are locked out by prior memory requests based on a lock table. If one or more of the non-conflicted memory requests are locked-out by prior memory requests, the arbiter queues the locked-out memory requests. The arbiter grants any non-conflicted memory access requests that are not locked-out. | 01-26-2012 |
20120079204 | Cache with Multiple Access Pipelines - Parallel pipelines are used to access a shared memory. The shared memory is accessed via a first pipeline by a processor to access cached data from the shared memory. The shared memory is accessed via a second pipeline by a memory access unit to access the shared memory. A first set of tags is maintained for use by the first pipeline to control access to the cache memory, while a second set of tags is maintained for use by the second pipeline to access the shared memory. Arbitrating for access to the cache memory for a transaction request in the first pipeline and for a transaction request in the second pipeline is performed after each pipeline has checked its respective set of tags. | 03-29-2012 |
20120110268 | DATA PROCESSING APPARATUS AND DATA PROCESSING METHOD - The data processing apparatus according to an embodiment of the present invention includes: a first processor; a second processor; and an external RAM to/from which the first processor writes/reads data, the first processor including a cache memory for storing data used in the first processor in association with an address on the external RAM, and the data being written to the cache memory by the second processor without passing through the external RAM. | 05-03-2012 |
20120137078 | Multiple Critical Word Bypassing in a Memory Controller - In one embodiment, a memory controller may be configured to transmit two or more critical words (or beats) corresponding to two or more different read requests prior to returning the remaining beats of the read requests. Such an embodiment may reduce latency to the sources of the memory requests, which may be stalled awaiting the critical words. The remaining words may fill a cache block or other buffer, but may not be required by the sources as quickly as the critical words in order to support higher performance. In some embodiments, once a remaining beat of a block is transmitted, all of the remaining beats may be transmitted contiguously. In other embodiments, additional critical words may be forwarded between remaining beats of a block. | 05-31-2012 |
20120144122 | METHOD AND APPARATUS FOR ACCELERATED SHARED DATA MIGRATION - A method and apparatus for accelerated shared data migration between cores is disclosed. | 06-07-2012 |
20120159077 | METHOD AND APPARATUS FOR OPTIMIZING THE USAGE OF CACHE MEMORIES - A method and apparatus to reduce unnecessary write backs of cached data to a main memory and to optimize the usage of a cache memory tag directory. In one embodiment of the invention, the power consumption of a processor can be saved by eliminating write backs of cache memory lines that have information that has reached its end-of-life. In one embodiment of the invention, when a processing unit is required to clear one or more cache memory lines, it uses a write-zero command to clear the one or more cache memory lines. The processing unit does not perform a write operation to move or pass data values of zero to the one or more cache memory lines. By doing so, it reduces the power consumption of the processing unit. | 06-21-2012 |
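The write-zero idea in 20120159077 can be sketched as state-bit manipulation: the line is marked zeroed and clean, so no zero data moves through the pipeline and no write back of end-of-life data occurs. The C below is a hedged illustration with invented flag names, assuming the cleared data will never be read back from main memory.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-line state bits in the tag directory. */
typedef struct {
    uint64_t tag;
    bool     valid;
    bool     dirty;  /* needs write back to main memory on eviction */
    bool     zeroed; /* contents are defined to be all zeros */
} cl_state_t;

/* write_zero: clear the line by flipping state bits only. No data
 * movement occurs, and clearing `dirty` means the end-of-life
 * contents are never written back (assumption: no consumer will
 * read the stale copy from main memory). */
static void write_zero(cl_state_t *line)
{
    line->zeroed = true;
    line->dirty  = false;
}
```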
20120166731 | COMPUTING PLATFORM POWER MANAGEMENT WITH ADAPTIVE CACHE FLUSH - In some embodiments, an adaptive break-even time, based on the load level of the cache, may be employed. | 06-28-2012 |
20120210071 | Remote Core Operations In A Multi-Core Computer - A multi-core processor with a shared physical memory is described. In an embodiment a sending core sends a memory write request to a destination core so that the request may be acted upon by the destination core as if it originated from the destination core. In an example, a data structure is configured in the shared physical memory and mapped to be accessible to the sending and destination cores. In an example, the shared data structure is used as a message channel between the sending and destination cores to carry data using the memory write request. In an embodiment a notification mechanism is enabled using the shared physical memory in order to notify the destination core of events by updating a notification data structure. In an example, the notification mechanism triggers a notification process at the destination core to inform a receiving process of a notification. | 08-16-2012 |
20120215984 | HETEROGENEOUS PROCESSORS SHARING A COMMON CACHE - A multi-core processor providing heterogeneous processor cores and a shared cache is presented. | 08-23-2012 |
20120221795 | SHARED MEMORY SYSTEM AND CONTROL METHOD THEREFOR - A shared memory system provides an access monitoring mechanism | 08-30-2012 |
20120226870 | RECOVERY IN SHARED MEMORY ENVIRONMENT - A method for recovery in a shared memory environment is provided in the illustrative embodiments. A core in a multi-core processor is designated as a user level core (ULC), which executes an instruction to modify a memory while executing an application. A second core is designated as an operating system core (OSC), which manages checkpointing of several segments of the shared memory. A set of flags is accessible to a memory controller to manage a shared memory. A flag in the set of flags corresponds to one segment in the segments of the shared memory. A message or instruction for modification of a segment is received. A cache line tracking determination is made whether a cache line used for the modification has already been used for a similar modification. If not, a part of the segment is checkpointed. The modification proceeds after checkpointing. | 09-06-2012 |
20120254546 | USING A MIGRATION CACHE TO CACHE TRACKS DURING MIGRATION - Provided are a method, system, and computer program product for using a migration cache to cache tracks during migration. In response to a migration operation, a determination is made of a first set of tracks in the source storage indicated in an extent list and of a second set of tracks in the extent. The tracks in the source storage in the first set are copied to a migration cache. The tracks in the second set are copied directly from the source storage to the destination storage without buffering in the migration cache. The tracks in the first set are copied from the migration cache to the destination storage. The migration operation is completed in response to copying the first set of tracks from the migration cache to the destination storage and copying the second set of tracks from the source storage to the destination storage. | 10-04-2012 |
20120265939 | Cache memory structure and method - The invention relates to a cache memory and method for controlling access to data. According to the invention, a control area which is advantageously formed separate from a data area is provided for controlling the access to data stored in the cache and to be read by applicative processes. The control area includes at least one release area with offsets and data version definition sections. | 10-18-2012 |
20120265940 | TRANSACTIONAL PROCESSING FOR CLUSTERED FILE SYSTEMS - Systems and methods for transactional processing within a clustered file system wherein user-defined transactions operate on data segments of the file system data. The users are provided with an interface for using a transactional mechanism, namely services for opening, writing and rolling-back transactions. A distributed shared memory technology is utilized to facilitate efficient and coherent cache management within the clustered file system based on the granularity of data segments (rather than files). | 10-18-2012 |
20120290794 | REQUEST TO OWN CHAINING IN MULTI-SOCKETED SYSTEMS - A method including: receiving multiple local requests to access the cache line; inserting, into an address chain, multiple entries corresponding to the multiple local requests; identifying a first entry at a head of the address chain; initiating, in response to identifying the first entry and in response to the first entry corresponding to a request to own the cache line, a traversal of the address chain; setting, during the traversal of the address chain, a state element identified in a second entry; receiving a foreign request to access the cache line; inserting, in response to setting the state element, a third entry corresponding to the foreign request into the address chain after the second entry; and relinquishing, in response to inserting the third entry after the second entry in the address chain, the cache line to a foreign thread after executing the multiple local requests. | 11-15-2012 |
20120317362 | SYSTEMS, METHODS, AND DEVICES FOR CACHE BLOCK COHERENCE - Systems, methods, and devices for efficient cache coherence between memory-sharing devices are provided. In particular, snoop traffic may be suppressed based at least partly on a table of block tracking entries (BTEs). Each BTE may indicate whether groups of one or more cache lines of a block of memory could potentially be in use by another memory-sharing device. By way of example, a memory-sharing device may employ a table of BTEs that each has several cache status entries. When a cache status entry indicates that none of a group of one or more cache lines could possibly be in use by another memory-sharing device, a snoop request for any cache lines of that group may be suppressed without jeopardizing cache coherence. | 12-13-2012 |
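A small sketch may make the block tracking entries (BTEs) of 20120317362 concrete. The C below is a hypothetical model; the group size, the table layout, and the one-BTE-per-64-lines geometry are all assumptions, not the patent's design.

```c
/* Hypothetical BTE sketch: one bit per group of lines says whether any line
 * in the group could be cached remotely; a clear bit lets us skip the snoop. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LINES_PER_GROUP  4
#define GROUPS_PER_BLOCK 16  /* one BTE covers 64 lines here (assumed) */

typedef struct {
    uint64_t block_addr;      /* which memory block this BTE covers */
    uint16_t remote_possible; /* bit g set => group g may be cached remotely */
} bte_t;

static bool need_snoop(const bte_t *bte, unsigned line_in_block) {
    unsigned group = (line_in_block / LINES_PER_GROUP) % GROUPS_PER_BLOCK;
    return (bte->remote_possible >> group) & 1u;
}

int main(void) {
    /* only group 2 (lines 8-11) might be shared by another device */
    bte_t bte = { .block_addr = 0x1000, .remote_possible = 0x0004 };
    for (unsigned line = 0; line < 12; line++)
        printf("line %2u: %s\n", line,
               need_snoop(&bte, line) ? "send snoop" : "suppress snoop");
    return 0;
}
```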
20120317363 | Memory Caching for Browser Processes - There is disclosed a method in which a process is initiated to handle a set of information, which includes one or more resources. In the method the set of information is examined to determine whether the set of information includes a resource stored as a shareable cache element in a memory. If the determination indicates that the set of information includes a resource stored as a shareable cache element, the shareable cache element is used as the resource of the set of information. | 12-13-2012 |
20120331232 | WRITE-THROUGH CACHE OPTIMIZED FOR DEPENDENCE-FREE PARALLEL REGIONS - An apparatus and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon an update of a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and a subsequent update of a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller. | 12-27-2012 |
20130013864 | MEMORY ACCESS MONITOR - For each access request received at a shared cache of the data processing device, a memory access pattern (MAP) monitor predicts which of the memory banks, and corresponding row buffers, would be accessed by the access request if the requesting thread were the only thread executing at the data processing device. By recording predicted accesses over time for a number of access requests, the MAP monitor develops a pattern of predicted memory accesses by executing threads. The pattern can be employed to assign resources at the shared cache, thereby managing memory more efficiently. | 01-10-2013 |
20130019064 | POWER REDUCTION USING UNMODIFIED INFORMATION IN EVICTED CACHE LINES - Embodiments of the present disclosure describe techniques and configurations to reduce power consumption using unmodified information in evicted cache lines. A method includes identifying unmodified information of a cache line stored in a cache of a processor, tracking the unmodified information using a bit vector comprising one or more bits to indicate the unmodified information of the cache line, and selectively suppressing a write operation or send operation for the unmodified information of the cache line that is evicted from the cache to an input/output (I/O) component coupled to the cache, the selective suppressing being based on the one or more bits, and the I/O component being an outer component external to the cache. Other embodiments may be described and/or claimed. | 01-17-2013 |
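The bit-vector tracking in 20130019064 can be illustrated with a few lines of C. This is a sketch only: it tracks modified words and suppresses the rest at eviction, which is the complement of the abstract's unmodified-information vector; word counts and field names are invented.

```c
/* Sketch: per-word modified bits let eviction send only the touched words. */
#include <stdint.h>
#include <stdio.h>

#define WORDS_PER_LINE 8

typedef struct {
    uint64_t word[WORDS_PER_LINE];
    uint8_t  modified;   /* bit i set => word i was written since fill */
} line_t;

static void store_word(line_t *l, unsigned i, uint64_t v) {
    l->word[i] = v;
    l->modified |= (uint8_t)(1u << i);
}

/* On eviction, write or send only the modified words. */
static void evict(const line_t *l) {
    for (unsigned i = 0; i < WORDS_PER_LINE; i++) {
        if (l->modified & (1u << i))
            printf("write word %u = %llu\n", i, (unsigned long long)l->word[i]);
        else
            printf("skip word %u (unmodified)\n", i);
    }
}

int main(void) {
    line_t l = {0};
    store_word(&l, 1, 42);
    store_word(&l, 6, 7);
    evict(&l);
    return 0;
}
```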
20130031311 | INTERFACE APPARATUS, CALCULATION PROCESSING APPARATUS, INTERFACE GENERATION APPARATUS, AND CIRCUIT GENERATION APPARATUS - There is provided an interface apparatus including: a stream converter receiving write-addresses and write-data, storing the received data in a buffer, and sorting the stored write-data in the order of the write-addresses to output the write-data as stream-data; a cache memory storing received stream-data if a load-signal indicates that the stream-data need to be loaded and outputting data stored in a storage device corresponding to an input cache-address as cache-data; a controller determining whether or not data allocated with a read-address have already been loaded, outputting the load-signal instructing the loading on the cache memory if not loaded, and outputting a load-address indicating a load-completed-address of the cache memory; and at least one address converter calculating which one of the storage devices the allocated data are stored in, by using the load-address, outputting the calculated value as the cache-address to the cache memory, and outputting the cache-data as read-data. | 01-31-2013 |
20130042070 | Shared cache memory control - A data processing system | 02-14-2013 |
20130042071 | Video Object Placement for Cooperative Caching - A method, an apparatus and an article of manufacture for placing at least one object at at least one cache of a set of cooperating caching nodes with limited inter-node communication bandwidth. The method includes transmitting information from the set of cooperating caching nodes regarding object accesses to a placement computation component, determining object popularity distribution based on the object access information, and instructing the set of cooperating caching nodes of at least one object to cache, the at least one node at which each object is to be cached, and a manner in which the at least one cached object is to be shared among the at least one caching node based on the object popularity distribution and cache and object sizes such that a cumulative hit rate at the at least one cache is increased while a constraint on inter-node communication bandwidth is not violated. | 02-14-2013 |
20130042072 | ELECTRONIC SYSTEM AND METHOD FOR SELECTIVELY ALLOWING ACCESS TO A SHARED MEMORY - An electronic system, an integrated circuit and a method for display are disclosed. The electronic system contains a first device, a memory and a video/audio compression/decompression device such as a decoder/encoder. The electronic system is configured to allow the first device and the video/audio compression/decompression device to share the memory. The electronic system may be included in a computer in which case the memory is a main memory. Memory access is accomplished by one or more memory interfaces, direct coupling of the memory to a bus, or direct coupling of the first device and decoder/encoder to a bus. An arbiter selectively provides access for the first device and/or the decoder/encoder to the memory based on priority. The arbiter may be monolithically integrated into a memory interface. The decoder may be a video decoder configured to comply with the MPEG-2 standard. The memory may store predicted images obtained from a preceding image. | 02-14-2013 |
20130091330 | Early Cache Eviction in a Multi-Flow Network Processor Architecture - Described embodiments provide an input/output interface of a network processor that generates a request to store received packets to a system cache. If an entry associated with the received packet does not exist in the system cache, the system cache determines whether a backpressure indicator of the system cache is set. If the backpressure indicator is set, the received packet is written to the shared memory. If the backpressure indicator is not set, the system cache determines whether to evict data from the system cache in order to store the received packet. If an eviction rate of the system cache has reached a threshold, the system cache sets a backpressure indicator and writes the received packet to the shared memory. If the eviction rate has not reached the threshold, the system cache determines an available entry and writes the received packet to the available entry in the system cache. | 04-11-2013 |
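The backpressure decision in 20130091330 is a simple state machine, sketched below in C. The threshold value, the counters, and the control flow are assumptions for illustration, not the patent's logic.

```c
/* Sketch of eviction-rate backpressure with an invented threshold. */
#include <stdbool.h>
#include <stdio.h>

#define EVICTION_RATE_THRESHOLD 100 /* evictions per interval (assumed) */

static unsigned eviction_rate;
static bool backpressure;

static const char *place_packet(bool cache_hit) {
    if (cache_hit) return "update existing cache entry";
    if (backpressure) return "write to shared memory (backpressure set)";
    if (eviction_rate >= EVICTION_RATE_THRESHOLD) {
        backpressure = true;              /* stop filling the cache */
        return "write to shared memory (threshold just reached)";
    }
    eviction_rate++;                      /* evict a victim to make room */
    return "evict a victim, store packet in system cache";
}

int main(void) {
    eviction_rate = 99;
    puts(place_packet(false)); /* one more eviction is allowed */
    puts(place_packet(false)); /* threshold reached: set backpressure */
    puts(place_packet(false)); /* subsequent misses bypass the cache */
    return 0;
}
```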
20130103905 | Optimizing Memory Copy Routine Selection For Message Passing In A Multicore Architecture - In one embodiment, the present invention includes a method to obtain topology information regarding a system including at least one multicore processor, provide the topology information to a plurality of parallel processes, generate a topological map based on the topology information, access the topological map to determine a topological relationship between a sender process and a receiver process, and select a given memory copy routine to pass a message from the sender process to the receiver process based at least in part on the topological relationship. Other embodiments are described and claimed. | 04-25-2013 |
20130111141 | MULTI-CORE INTERCONNECT IN A NETWORK PROCESSOR | 05-02-2013 |
20130111142 | METHOD FOR ACCESSING CACHE AND PSEUDO CACHE AGENT | 05-02-2013 |
20130132678 | INFORMATION PROCESSING SYSTEM - An information processing system has a plurality of nodes which use a snoop cache memory in each of the plurality of nodes. A directory, which maintains a cache coherence of the snoop cache memory of the plurality of nodes, has a first directory and a second directory, which has a different format from that of the first directory and is used only for a shared state. The node searches the first and second directories, and determines the other node to which to transmit a snoop. | 05-23-2013 |
20130132679 | STORAGE SYSTEM, CONTROL PROGRAM AND STORAGE SYSTEM CONTROL METHOD - There is provided a storage system including one or more LDEVs, one or more processors, a local memory or memories corresponding to the processor or processors, and a shared memory, which is shared by the processors, wherein control information on I/O processing or application processing is stored in the shared memory, and the processor caches a part of the control information in different storage areas on a type-by-type basis in the local memory or memories corresponding to the processor or processors in referring to the control information stored in the shared memory. | 05-23-2013 |
20130151782 | Providing Common Caching Agent For Core And Integrated Input/Output (IO) Module - In one embodiment, the present invention includes a multicore processor having a plurality of cores, a shared cache memory, an integrated input/output (IIO) module to interface between the multicore processor and at least one IO device coupled to the multicore processor, and a caching agent to perform cache coherency operations for the plurality of cores and the IIO module. Other embodiments are described and claimed. | 06-13-2013 |
20130151783 | INTERFACE AND METHOD FOR INTER-THREAD COMMUNICATION - The interface for inter-thread communication between a plurality of threads including a number of producer threads for producing data objects and a number of consumer threads for consuming the produced data objects includes a specifier and a provider. The specifier is configured to specify a certain relationship between a certain producer thread of the number of producer threads which is adapted to produce a certain data object and a consumer thread of the number of consumer threads which is adapted to consume the produced certain data object. Further, the provider is configured to provide direct cache line injection of a cache line of the produced certain data object to a cache allocated to the certain consumer thread related to the certain producer thread by the specified certain relationship. | 06-13-2013 |
20130198459 | SYSTEMS AND METHODS FOR A DE-DUPLICATION CACHE - A de-duplication cache is configured to cache data for access by a plurality of different storage clients, such as virtual machines. A virtual machine may comprise a virtual machine de-duplication module configured to identify data for admission into the de-duplication cache. Data admitted into the de-duplication cache may be accessible by two or more storage clients. Metadata pertaining to the contents of the de-duplication cache may be persisted and/or transferred with respective storage clients such that the storage clients may access the contents of the de-duplication cache after rebooting, being power cycled, and/or being transferred between hosts. | 08-01-2013 |
20130205091 | MULTI-BANK CACHE MEMORY - In general, this disclosure describes techniques for increasing the throughput of multi-bank cache memory systems accessible by multiple clients. Requests for data from a client may be stored in a pending buffer associated with the client for a first cache memory bank. For each of the requests for data, a determination may be made as to whether the request can be fulfilled by a cache memory within the first cache memory bank regardless of a status of requests by the client for data at a second cache memory bank. Data requested from the cache memory by the client may be stored in a read data buffer associated with the client according to an order of receipt of the requests for data in the pending buffer. | 08-08-2013 |
20130205092 | MULTICORE COMPUTER SYSTEM WITH CACHE USE BASED ADAPTIVE SCHEDULING - An example multicore environment generally described herein may be adapted to improve use of a shared cache by a plurality of processing cores in a multicore processor. For example, where a producer task associated with a first core of the multicore processor places data in a shared cache at a faster rate than a consumer task associated with a second core of the multicore processor, relative task execution rates can be adapted to prevent eventual increased cache misses by the consumer task. | 08-08-2013 |
20130212332 | SELECTIVELY READING DATA FROM CACHE AND PRIMARY STORAGE - Techniques are provided for using an intermediate cache to provide some of the items involved in a scan operation, while other items involved in the scan operation are provided from primary storage. Techniques are also provided for determining whether to service an I/O request for an item with a copy of the item that resides in the intermediate cache based on factors such as a) an identity of the user for whom the I/O request was submitted, b) an identity of a service that submitted the I/O request, c) an indication of a consumer group to which the I/O request maps, or d) whether the intermediate cache is overloaded. Techniques are also provided for determining whether to store items in an intermediate cache in response to the items being retrieved, based on logical characteristics associated with the requests that retrieve the items. | 08-15-2013 |
20130212333 | INFORMATION PROCESSING APPARATUS, METHOD OF CONTROLLING MEMORY, AND MEMORY CONTROLLING APPARATUS - An information processing apparatus provided with a plurality of nodes each including at least one processor, a system controller, and a main memory, includes a status storage unit that stores statuses of a plurality of cache lines and that is capable of reading statuses of a plurality of cache lines by one reading operation, a recording unit that is provided in a system controller in at least one node and that records all or part of the statuses stored in the status storage unit, wherein the system controller records obtained statuses in the recording unit on a condition that all of the statuses of the plurality of cache lines obtained by reading the status storage unit are invalid statuses or shared statuses in different nodes when the system controller has read the status storage unit in response to a request. | 08-15-2013 |
20130254488 | SYSTEM AND METHOD FOR SIMPLIFYING CACHE COHERENCE USING MULTIPLE WRITE POLICIES - System and methods for cache coherence in a multi-core processing environment having a local/shared cache hierarchy. The system includes multiple processor cores, a main memory, and a local cache memory associated with each core for storing cache lines accessible only by the associated core. Cache lines are classified as either private or shared. A shared cache memory is coupled to the local cache memories and main memory for storing cache lines. The cores follow a write-back to the local memory for private cache lines, and a write-through to the shared memory for shared cache lines. Shared cache lines in local cache memory enter a transient dirty state when written by the core. Shared cache lines transition from a transient dirty to a valid state with a self-initiated write-through to the shared memory. The write-through to shared memory can include only data that was modified in the transient dirty state. | 09-26-2013 |
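The dual write policy of 20130254488 reduces to a small per-line state machine. The C sketch below is a rough model under stated assumptions: the state names and the classification field are invented, and the self-initiated write-through is reduced to a print statement.

```c
/* Sketch: private lines use write-back; shared lines go transient-dirty on a
 * store and later self-write-through to the shared cache. */
#include <stdio.h>

typedef enum { INVALID, VALID, DIRTY_PRIVATE, TRANSIENT_DIRTY } lstate_t;
typedef enum { CLASS_PRIVATE, CLASS_SHARED } lclass_t;

typedef struct { lstate_t st; lclass_t cls; } line_t;

static void on_store(line_t *l) {
    if (l->cls == CLASS_PRIVATE)
        l->st = DIRTY_PRIVATE;      /* write-back policy: defer to eviction */
    else
        l->st = TRANSIENT_DIRTY;    /* write-through pending to shared cache */
}

/* Self-initiated write-through returns a shared line to VALID. */
static void self_write_through(line_t *l) {
    if (l->st == TRANSIENT_DIRTY) {
        puts("write-through only the modified data to the shared cache");
        l->st = VALID;
    }
}

int main(void) {
    line_t priv = { VALID, CLASS_PRIVATE }, shar = { VALID, CLASS_SHARED };
    on_store(&priv); on_store(&shar);
    printf("private line state: %d, shared line state: %d\n", priv.st, shar.st);
    self_write_through(&shar);
    printf("shared line state after write-through: %d\n", shar.st);
    return 0;
}
```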
20130282985 | MANAGING CONCURRENT ACCESSES TO A CACHE - Various embodiments of the present invention allow concurrent accesses to a cache. A request to update an object stored in a cache is received. A first data structure comprising a new value for the object is created in response to receiving the request. A cache pointer is atomically modified to point to the first data structure. A second data structure comprising an old value for the cached object is maintained until a process, which holds a pointer to the old value of the cached object, at least one of ends and indicates that the old value is no longer needed. | 10-24-2013 |
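The atomic pointer swap described in 20130282985 (and in the continuation entry 20140215159 further down) resembles the familiar read-copy-update pattern. The C11 sketch below is a minimal single-threaded demonstration, not the patent's mechanism: the reference count stands in for however the patent keeps the old value alive, and all names are invented.

```c
/* RCU-like sketch: readers follow an atomic pointer; an updater installs a
 * new version with one atomic exchange and retires the old one when unused. */
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int value;
    atomic_int refs;   /* readers (plus the cache itself) holding this version */
} version_t;

static _Atomic(version_t *) current;

/* NOTE: under real concurrency the load+increment window below needs hazard
 * pointers or RCU grace periods; this single-threaded demo sidesteps that. */
static version_t *reader_acquire(void) {
    version_t *v = atomic_load(&current);
    atomic_fetch_add(&v->refs, 1);
    return v;
}

static void reader_release(version_t *v) {
    if (atomic_fetch_sub(&v->refs, 1) == 1)
        free(v);  /* last user of a retired version reclaims it */
}

static void update(int newval) {
    version_t *n = malloc(sizeof *n);
    n->value = newval;
    atomic_init(&n->refs, 1);                       /* the cache's own reference */
    version_t *old = atomic_exchange(&current, n);  /* single atomic pointer swap */
    reader_release(old);                            /* drop the cache's reference */
}

int main(void) {
    version_t *init = malloc(sizeof *init);
    init->value = 1;
    atomic_init(&init->refs, 1);
    atomic_store(&current, init);

    version_t *seen = reader_acquire();   /* reader pins the old version */
    update(2);                            /* writer swaps in a new version */
    printf("reader still sees %d; current is %d\n",
           seen->value, atomic_load(&current)->value);
    reader_release(seen);
    reader_release(atomic_load(&current));
    return 0;
}
```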
20130326146 | INFORMATION PROCESSING APPARATUS, MEMORY APPARATUS, AND DATA MANAGEMENT METHOD - An information processing apparatus that appropriately manages data of an auxiliary memory apparatus is provided to prevent data from leaking. The information processing apparatus includes a first memory apparatus, a second memory apparatus, and a caching unit. The caching unit stores write data to be written on the second memory apparatus in a cache area ensured on the first memory apparatus. When a first event occurs, the caching unit initializes a management information table, in which the address of the cache area in which the write data is stored is associated with the address of the second memory apparatus in which the write data is to be stored, and restores the second memory apparatus to a state previous to a state in which data is written. | 12-05-2013 |
20130326147 | SHORT CIRCUIT OF PROBES IN A CHAIN - A multi-core processing apparatus may provide a cache probe and data retrieval method. The method may comprise sending a memory request from a requester to a record keeping structure. The memory request may have a memory address of a memory that stores requested data. The method may further comprise determining that a local last accessor of the memory address may have a copy of the requested data up to date with the memory. The local last accessor may be within a local domain that the requester belongs to. The method may further comprise sending a cache probe to the local last accessor and retrieving a latest value of the requested data from the local last accessor to the requester. | 12-05-2013 |
20130339614 | MITIGATING CONFLICTS FOR SHARED CACHE LINES - A computer program product for mitigating conflicts for shared cache lines between an owning core currently owning a cache line and a requestor core. The computer program product includes a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method. The method includes determining whether the owning core is operating in a transactional or non-transactional mode and setting a hardware-based reject threshold at a first or second value with the owning core determined to be operating in the transactional or non-transactional mode, respectively. The method further includes taking first or second actions to encourage cache line sharing between the owning core and the requestor core in response to a number of rejections of requests by the requestor core reaching the reject threshold set at the first or second value, respectively. | 12-19-2013 |
20130339615 | MANAGING TRANSACTIONAL AND NON-TRANSACTIONAL STORE OBSERVABILITY - Embodiments relate to controlling observability of transactional and non-transactional stores. An aspect includes receiving one or more store instructions. The one or more store instructions are initiated within an active transaction and include store data. The active transaction effectively delays committing stores to memory until successful completion of the active transaction. The store data is stored in a local storage buffer causing alterations to the local storage buffer from a first state to a second state. A signal is received that the active transaction has terminated. If the active transaction has terminated abnormally then: the local storage buffer is reverted back to the first state if the store data was stored by a transactional store instruction, and is propagated to a shared cache if the store instruction is non-transactional. | 12-19-2013 |
20130339616 | MANAGING TRANSACTIONAL AND NON-TRANSACTIONAL STORE OBSERVABILITY - Embodiments relate to controlling observability of transactional and non-transactional stores. An aspect includes receiving one or more store instructions. The one or more store instructions are initiated within an active transaction and include store data. The active transaction effectively delays committing stores to memory until successful completion of the active transaction. The store data is stored in a local storage buffer causing alterations to the local storage buffer from a first state to a second state. A signal is received that the active transaction has terminated. If the active transaction has terminated abnormally then: the local storage buffer is reverted back to the first state if the store data was stored by a transactional store instruction, and is propagated to a shared cache if the store instruction is non-transactional. | 12-19-2013 |
20140006716 | DATA CONTROL USING LAST ACCESSOR INFORMATION | 01-02-2014 |
20140025897 | METHOD AND SYSTEM FOR CACHE REPLACEMENT FOR SHARED MEMORY CACHES - A method for managing objects stored in a shared memory cache. The method includes accessing data from the shared memory cache using at least a plurality of cache readers. A system updates data in the shared memory cache using a cache writer. The system maintains a cache replacement process collocated with a cache writer. The cache replacement process makes a plurality of decisions on objects to store in the shared memory cache. Each of the plurality of cache readers maintains information on frequencies with which it accesses cached objects. Each of the plurality of cache readers communicates the maintained information to the cache replacement process. The cache replacement process uses the communicated information on frequencies to make at least one decision on replacing at least one object currently stored in the shared memory cache. | 01-23-2014 |
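The reader-reported frequencies of 20140025897 (and the companion entry 20140025898 just below) suggest a simple fold-and-pick loop. The following C is a toy sketch: the arrays stand in for whatever shared-memory messages carry the reports, and the lowest-total rule is an assumed policy.

```c
/* Toy sketch: each reader counts its own accesses per object; the
 * replacement process folds the reports and evicts the coldest object. */
#include <stdio.h>

#define N_OBJECTS 4
#define N_READERS 3

/* per-reader access counts, maintained locally by each cache reader */
static unsigned counts[N_READERS][N_OBJECTS] = {
    {9, 1, 4, 0},
    {7, 0, 5, 1},
    {8, 2, 3, 0},
};

/* replacement process: combine reader reports, pick the coldest object */
static int pick_victim(void) {
    unsigned total[N_OBJECTS] = {0};
    for (int r = 0; r < N_READERS; r++)
        for (int o = 0; o < N_OBJECTS; o++)
            total[o] += counts[r][o];
    int victim = 0;
    for (int o = 1; o < N_OBJECTS; o++)
        if (total[o] < total[victim])
            victim = o;
    return victim;
}

int main(void) {
    printf("evict object %d\n", pick_victim()); /* object 3: coldest overall */
    return 0;
}
```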
20140025898 | CACHE REPLACEMENT FOR SHARED MEMORY CACHES - An information processing system and computer program storage product for managing objects stored in a shared memory cache. The system includes at least a plurality of cache readers accessing data from the shared memory cache. The system updates data in the shared memory cache using a cache writer. The system maintains a cache replacement process collocated with a cache writer. The cache replacement process makes a plurality of decisions on objects to store in the shared memory cache. Each of the plurality of cache readers maintains information on frequencies with which it accesses cached objects. Each of the plurality of cache readers communicates the maintained information to the cache replacement process. The cache replacement process uses the communicated information on frequencies to make at least one decision on replacing at least one object currently stored in the shared memory cache. | 01-23-2014 |
20140032848 | Sharing Pattern-Based Directory Coherence for Multicore Scalability ("SPACE") - A method and directory system that recognizes and represents the subset of sharing patterns present in an application is provided. As used herein, the term sharing pattern refers to a group of processors accessing a single memory location in an application. The sharing pattern is decoupled from each cache line and held in a separate directory table. The sharing pattern of a cache block is the bit vector representing the processors that share the block. Multiple cache lines that have the same sharing pattern point to a common entry in the directory table. In addition, when the table capacity is exceeded, patterns that are similar to each other are dynamically collated into a single entry. | 01-30-2014 |
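The decoupled pattern table of 20140032848 ("SPACE") can be sketched as an interning table of sharer bit vectors. The C below is hypothetical: table size and naming are invented, and the collation of similar patterns on overflow is deliberately omitted.

```c
/* Sketch: cache lines store only an index into a table of sharing bit
 * vectors, so lines with the same sharer set share one directory entry. */
#include <stdint.h>
#include <stdio.h>

#define MAX_PATTERNS 8

static uint32_t pattern_table[MAX_PATTERNS]; /* bit p set => processor p shares */
static unsigned pattern_refs[MAX_PATTERNS];  /* lines pointing at each entry */
static unsigned n_patterns;

/* Find or insert a sharing pattern; return its table index. */
static int intern_pattern(uint32_t sharers) {
    for (unsigned i = 0; i < n_patterns; i++)
        if (pattern_table[i] == sharers) { pattern_refs[i]++; return (int)i; }
    if (n_patterns == MAX_PATTERNS)
        return -1; /* the real design collates similar patterns here */
    pattern_table[n_patterns] = sharers;
    pattern_refs[n_patterns]  = 1;
    return (int)n_patterns++;
}

int main(void) {
    /* three lines shared by cores {0,2}, one by core {1}: two table entries */
    int a = intern_pattern(0x5), b = intern_pattern(0x5);
    int c = intern_pattern(0x5), d = intern_pattern(0x2);
    printf("indices: %d %d %d %d; entries: %u; refs[0]=%u\n",
           a, b, c, d, n_patterns, pattern_refs[0]);
    return 0;
}
```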
20140040554 | Protecting Large Regions without Operating-System Support - A system and method for providing very large read-sets for hardware transactional memory with limited hardware support by monitoring metadata such as page-table entries. The system and method include a Hardware-based Transactional Memory (HTM) mechanism that tracks metadata such as page-table entries (PTE) rather than all the data itself. The HTM mechanism protects large regions of memory by providing conflict detection so that regions of memory can be located within a local read or write set. | 02-06-2014 |
20140040555 | DATA PROCESSING, METHOD, DEVICE, AND SYSTEM FOR PROCESSING REQUESTS IN A MULTI-CORE SYSTEM - The present disclosure provides a method, device, and system for processing a request in a multi-core system. The method comprises steps of: receiving a request for data by a filter from a requesting unit; comparing an indicator indicative of a logical partition in the request with an indicator indicative of the logical partition in a record of the filter; searching in a unit where the filter is located based on the request and returning a search result to the requesting unit if a comparison result matches; and returning a NONE response to the requesting unit from the filter if the comparison result does not match. | 02-06-2014 |
20140040556 | Dynamic Multithreaded Cache Allocation - Apparatus and method embodiments for dynamically allocating cache space in a multi-threaded execution environment are disclosed. In some embodiments, a processor includes a cache shared by each of a plurality of processor cores and/or each of a plurality of threads executing on the processor. The processor further includes a cache allocation circuit configured to dynamically allocate space in the cache provided to each of the plurality of processor cores based on their respective usage patterns. The cache allocation unit may track cache usage by each of the processor cores/threads using subsets of usage bits and counters configured to update states of the usage bits. The cache allocation circuit may track the usage of cache space by the processor cores/threads and may allocate more space to those that exhibit more usage of the cache. | 02-06-2014 |
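A possible reading of the usage-driven allocation in 20140040556 is sketched below. The proportional-ways rule is an invented stand-in for the usage bits and counters in the abstract; the counts, way total, and remainder handling are all assumptions.

```c
/* Sketch: give each thread cache ways roughly proportional to observed use. */
#include <stdio.h>

#define N_THREADS  3
#define TOTAL_WAYS 16

static unsigned usage[N_THREADS] = {900, 300, 200}; /* observed accesses */

static void allocate_ways(unsigned ways_out[N_THREADS]) {
    unsigned total = 0;
    for (int t = 0; t < N_THREADS; t++) total += usage[t];
    unsigned given = 0;
    for (int t = 0; t < N_THREADS; t++) {
        ways_out[t] = (usage[t] * TOTAL_WAYS) / total;
        if (ways_out[t] == 0) ways_out[t] = 1;  /* keep every thread runnable */
        given += ways_out[t];
    }
    ways_out[0] += TOTAL_WAYS - given;  /* leftover ways to thread 0 (heaviest here) */
}

int main(void) {
    unsigned ways[N_THREADS];
    allocate_ways(ways);
    for (int t = 0; t < N_THREADS; t++)
        printf("thread %d: %u ways\n", t, ways[t]);
    return 0;
}
```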
20140040557 | NESTED REWIND ONLY AND NON REWIND ONLY TRANSACTIONS IN A DATA PROCESSING SYSTEM SUPPORTING TRANSACTIONAL STORAGE ACCESSES - In a multiprocessor data processing system having a distributed shared memory system, first and second nested memory transactions are executed, where the first memory transaction is a rewind-only transaction (ROT) and the second memory transaction is a non-ROT memory transaction. The first memory transaction has a transaction body including the second memory transaction and an additional plurality of transactional memory access instructions. In response to execution of the transactional memory access instructions, memory accesses are performed to the distributed shared memory system. Conflicts between memory accesses not within the first memory transaction and at least a load footprint of any of the transactional memory access instructions preceding the second memory transaction are not tracked. However, conflicts between memory accesses not within the first memory transaction and store and load footprints of any of the transactional memory access instructions that follow initiation of the second memory transaction are tracked. | 02-06-2014 |
20140040558 | INFORMATION PROCESSING APPARATUS, PARALLEL COMPUTER SYSTEM, AND CONTROL METHOD FOR ARITHMETIC PROCESSING UNIT - An information processing apparatus included in a parallel computer system has a memory that holds data and a processor including a cache memory that holds a part of the data held on the memory and a processor core that performs arithmetic operations using the data held on the memory or the cache memory. Moreover, the information processing apparatus has a communication device that determines whether data received from a different information processing apparatus is data that the processor core waits for. When the communication device determines that the received data is data that the processor core waits for, the communication device stores the received data on the cache memory. When the communication device determines that the received data is data that the processor core does not wait for, the communication device stores the received data on the memory. | 02-06-2014 |
20140052923 | PROCESSOR AND CONTROL METHOD FOR PROCESSOR - A processor includes a plurality of nodes arranged two dimensionally in the X-axis direction and in the Y-axis direction, and each of the nodes includes a processor core and a distributed shared cache memory. The processor also includes a first connecting unit and a second connecting unit. The first connecting unit connects adjacent nodes in the X-axis direction among the nodes, in a ring shape. The second connecting unit connects adjacent nodes in the Y-axis direction among the nodes, in a ring shape. The cache memories included in the respective nodes are divided into banks in the Y-axis direction. Coherency of the cache memories in the X-axis direction is controlled by a snoop system. The cache memories are shared by the nodes. | 02-20-2014 |
20140075121 | Selective Delaying of Write Requests in Hardware Transactional Memory Systems - Techniques for conflict detection in hardware transactional memory (HTM) are provided. In one aspect, a method for detecting conflicts in HTM includes the following steps. Conflict detection is performed eagerly by setting read and write bits in a cache as transactions having read and write requests are made. A given one of the transactions is stalled when a conflict is detected whereby more than one of the transactions are accessing data in the cache in a conflicting way. An address of the conflicting data is placed in a predictor. The predictor is queried whenever the write requests are made to determine whether they correspond to entries in the predictor. A copy of the data corresponding to entries in the predictor is placed in a store buffer. The write bits in the cache are set and the copy of the data in the store buffer is merged in at transaction commit. | 03-13-2014 |
20140082291 | SPECULATIVE PERMISSION ACQUISITION FOR SHARED MEMORY - In a processor, a method for speculative permission acquisition for access to a shared memory. The method includes receiving a store from a processor core to modify a shared cache line, and in response to receiving the store, marking the cache line as speculative. The cache line is then modified in accordance with the store. Upon receiving a modification permission, the modified cache line is subsequently committed. | 03-20-2014 |
20140089591 | SUPPORTING TARGETED STORES IN A SHARED-MEMORY MULTIPROCESSOR SYSTEM - The present embodiments provide a system for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor in the shared-memory multiprocessor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. The system includes an interface, such as an application programming interface (API), and a system call interface or an instruction-set architecture (ISA) that provides access to a number of mechanisms for supporting targeted stores. These mechanisms include a thread-location mechanism that determines a location near where a thread is executing in the shared-memory multiprocessor, and a targeted-store mechanism that targets a store to a location (e.g., cache memory) in the shared-memory multiprocessor. | 03-27-2014 |
20140095796 | PERFORMANCE-DRIVEN CACHE LINE MEMORY ACCESS - According to one aspect of the present disclosure, a method and technique for performance-driven cache line memory access is disclosed. The method includes: receiving, by a memory controller of a data processing system, a request for a cache line; dividing the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request; servicing the high priority data request; and delaying servicing of the low priority data request until a low priority condition has been satisfied. | 04-03-2014 |
20140143496 | Self-Sizing Dynamic Cache for Virtualized Environments - A method and system for self-sizing dynamic cache for virtualized environments is disclosed. The preferred embodiment self-sizes unequal portions of the total amount of cache and allocates them to a plurality of active virtualized machines (VMs) according to VM requirements and administrative standards. As a new VM may emerge and request an amount of cache, the cache controller reclaims currently used cache from the active VMs and reallocates the unequal portions of cache required by each VM. To ensure cache availability, a quick reclamation amount of cache is immediately available to each new VM as it makes the request and begins operation. After reallocation, the newly created VM may rely on a guaranteed minimum quota of cache to ensure performance. | 05-22-2014 |
20140156938 | CACHE REGION CONCEPT - A method to store objects in a memory cache is disclosed. A request is received from an application to store an object in a memory cache associated with the application. The object is stored in a cache region of the memory cache based on an identification that the object has no potential for storage in a shared memory cache and a determination that the cache region is associated with a storage policy that specifies that objects to be stored in the cache region are to be stored in a local memory cache and that a garbage collector is not to remove objects stored in the cache region from the local memory cache. | 06-05-2014 |
20140156939 | METHODOLOGY FOR FAST DETECTION OF FALSE SHARING IN THREADED SCIENTIFIC CODES - A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code. | 06-05-2014 |
20140164707 | MITIGATING CONFLICTS FOR SHARED CACHE LINES - A computer program product for mitigating conflicts for shared cache lines between an owning core currently owning a cache line and a requestor core. The computer program product includes a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method. The method includes determining whether the owning core is operating in a transactional or non-transactional mode and setting a hardware-based reject threshold at a first or second value with the owning core determined to be operating in the transactional or non-transactional mode, respectively. The method further includes taking first or second actions to encourage cache line sharing between the owning core and the requestor core in response to a number of rejections of requests by the requestor core reaching the reject threshold set at the first or second value, respectively. | 06-12-2014 |
20140173213 | RAPID VIRTUAL MACHINE SUSPEND AND RESUME - A method of enabling “fast” suspend and “rapid” resume of virtual machines (VMs) employs a cache that is able to perform input/output operations at a faster rate than a storage device provisioned for the VMs. The cache may be local to a computer system that is hosting the VMs or may be shared cache commonly accessible to VMs hosted by different computer systems. The method includes the steps of saving the state of the VM to a checkpoint file stored in the cache and locking the checkpoint file so that data blocks of the checkpoint file are maintained in the cache and are not evicted, and resuming execution of the VM by reading into memory the data blocks of the checkpoint file stored in the cache. | 06-19-2014 |
20140181408 | MANAGING GLOBAL CACHE COHERENCY IN A DISTRIBUTED SHARED CACHING FOR CLUSTERED FILE SYSTEMS - Systems, methods, and computer program products are provided for managing global cache coherency in distributed shared caching for clustered file systems (CFS). The CFS manages access permissions to an entire space of data segments by using a distributed shared memory (DSM) module. In response to receiving a request to access one of the data segments, a calculation operation is performed for obtaining the most recent contents of one of the data segments. The calculation operation performs one of providing the most recent contents via communication with a remote DSM module which obtains the one of the data segments from an associated external cache memory, instructing by the DSM module to read from storage the one of the data segments, and determining that any existing contents of the one of the data segments in the local external cache are the most recent contents. | 06-26-2014 |
20140195738 | I/O Write Request Handling in a Storage System - An improved method for I/O write request handling in a storage system comprising at least one normal storage device and at least one cache device. An I/O write request created by an external device is received. Two parallel threads are created for each write operation. A first thread attempts to execute the write operation using the at least one normal storage device without using the at least one cache device. A second thread monitors the first thread and is triggered to execute the write operation using the at least one cache device if the first thread has not finished the write operation within a given time threshold. In either case, an I/O write completion response is provided to the external device in order to avoid timing out of the write operation. The at least one cache device is freed from data written by the second thread if the first thread completes the write operation after the given time threshold. | 07-10-2014 |
20140201451 | METHOD, APPARATUS AND COMPUTER PROGRAMS PROVIDING CLUSTER-WIDE PAGE MANAGEMENT - A data processing system includes a plurality of virtual machines each having associated memory pages; a shared memory page cache that is accessible by each of the plurality of virtual machines; and a global hash map that is accessible by each of the plurality of virtual machines. The data processing system is configured such that, for a particular memory page stored in the shared memory page cache that is associated with two or more of the plurality of virtual machines, there is a single key stored in the global hash map that identifies at least a storage location in the shared memory page cache of the particular memory page. The system can be embodied at least partially in a cloud computing system. | 07-17-2014 |
20140201452 | FILL PARTITIONING OF A SHARED CACHE - Fill partitioning of a shared cache is described. In an embodiment, all threads running in a processor are able to access any data stored in the shared cache; however, in the event of a cache miss, a thread may be restricted such that it can only store data in a portion of the shared cache. The restrictions to storing data may be implemented for all cache miss events or for only a subset of those events. For example, the restrictions may be implemented only when the shared cache is full and/or only for particular threads. The restrictions may also be applied dynamically, for example, based on conditions associated with the cache. Different portions may be defined for different threads (e.g. in a multi-threaded processor) and these different portions may, for example, be separate and non-overlapping. Fill partitioning may be applied to any on-chip cache, for example, a L1 cache. | 07-17-2014 |
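Fill partitioning, as 20140201452 describes it, separates hit policy from fill policy. The C sketch below models one plausible mechanism under stated assumptions: per-thread way masks (the mask values, way count, and round-robin victim choice are invented for illustration).

```c
/* Sketch: any thread may hit anywhere, but on a miss a thread may only
 * allocate (fill) into the ways its mask permits. */
#include <stdint.h>
#include <stdio.h>

#define N_WAYS 8

static const uint8_t fill_mask[2] = {
    0x0F,  /* thread 0 may fill ways 0-3 */
    0xF0,  /* thread 1 may fill ways 4-7 */
};

/* Pick a victim way for a miss by `thread`; round-robin within its mask. */
static int pick_fill_way(int thread, unsigned *rr) {
    for (int tries = 0; tries < N_WAYS; tries++) {
        int way = (int)((*rr)++ % N_WAYS);
        if (fill_mask[thread] & (1u << way))
            return way;
    }
    return -1; /* unreachable while the mask is nonzero */
}

int main(void) {
    unsigned rr = 0;
    for (int i = 0; i < 3; i++)
        printf("thread 0 fills way %d\n", pick_fill_way(0, &rr));
    for (int i = 0; i < 3; i++)
        printf("thread 1 fills way %d\n", pick_fill_way(1, &rr));
    return 0;
}
```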
20140208034 | System And Method for Efficient Paravirtualized OS Process Switching - The exemplary embodiments described herein relate to systems and methods for improved process switching of a paravirtualized guest with a software-based memory management unit (“MMU”). One embodiment relates to a non-transitory computer readable storage medium including a set of instructions executable by a processor, the set of instructions, when executed, resulting in a performance of the following: create a plurality of new processes for each of a plurality of virtual environments, each of the virtual environments assigned one of a plurality of address space identifiers (“ASIDs”) stored in a cache memory, perform a process switch to one of the virtual environments thereby designating the one of the virtual environments as the active virtual environment, determine whether the active virtual environment has exhausted each of the ASIDs, and flush a cache memory when it is determined that the active virtual environment has exhausted each of the ASIDs. | 07-24-2014 |
20140208035 | CACHE CIRCUIT HAVING A TAG ARRAY WITH SMALLER LATENCY THAN A DATA ARRAY - A method is described that includes alternating cache requests sent to a tag array between data requests and dataless requests. | 07-24-2014 |
20140215159 | MANAGING CONCURRENT ACCESSES TO A CACHE - Various embodiments of the present invention allow concurrent accesses to a cache. A request to update an object stored in a cache is received. A first data structure comprising a new value for the object is created in response to receiving the request. A cache pointer is atomically modified to point to the first data structure. A second data structure comprising an old value for the cached object is maintained until a process, which holds a pointer to the old value of the cached object, at least one of ends and indicates that the old value is no longer needed. | 07-31-2014 |
20140223103 | PERSISTENT MEMORY FOR PROCESSOR MAIN MEMORY - Subject matter disclosed herein relates to a system of one or more processors that includes persistent memory. | 08-07-2014 |
20140258629 | SPECIFIC PREFETCH ALGORITHM FOR A CHIP HAVING A PARENT CORE AND A SCOUT CORE - Embodiments relate to a method, system, and computer program product for prefetching data on a chip having at least one scout core and a parent core. The method includes saving a prefetch code start address by the parent core. The prefetch code start address indicates where a prefetch code is stored. The prefetch code is specifically configured for monitoring the parent core based on a specific application being executed by the parent core. The method includes sending a broadcast interrupt signal by the parent core to the at least one scout core. The broadcast interrupt signal being sent based on the prefetch code start address being saved. The method includes monitoring the parent core by the prefetch code executed by at least one scout core. The scout core executes the prefetch code based on receiving the broadcast interrupt signal. | 09-11-2014 |
20140258630 | PREFETCHING FOR MULTIPLE PARENT CORES IN A MULTI-CORE CHIP - Embodiments relate to a method, system, and computer program product for prefetching data on a chip. The chip has at least one scout core, multiple parent cores that cooperate together to execute various tasks, and a shared cache that is common between the scout core and the multiple parent cores. An aspect of the embodiments includes monitoring the multiple parent cores by the at least one scout core through the shared cache for a shared cache access occurring in a base parent core. The method includes saving a fetch address by the at least one scout core based on the shared cache access occurring. The fetch address indicates a location of a specific line of cache requested by the base parent core. | 09-11-2014 |
20140258631 | Allocating Enclosure Cache In A Computing System - Allocating enclosure cache in a computing system that includes an enclosure and a plurality of enclosure attached servers, including: receiving, by the enclosure, memory access information from each of the plurality of enclosure attached servers; determining, by the enclosure in dependence upon the memory access information, an amount of enclosure cache to allocate as shared cache that can be accessed by two or more of the enclosure attached servers; and determining, by the enclosure in dependence upon the memory access information, an amount of enclosure cache to allocate to each enclosure attached server for exclusive use by the enclosure attached server. | 09-11-2014 |
20140258632 | Sharing Cache In A Computing System - Sharing cache in a computing system that includes a plurality of enclosure attached servers, including: identifying, by an enclosure, a first enclosure attached server that is not meeting a first predetermined performance threshold; identifying, by the enclosure, a second enclosure attached server that is meeting a second predetermined performance threshold; blocking, by the enclosure, access to a predetermined amount of cache on the second enclosure attached server by the second enclosure attached server; determining, by the enclosure, whether the second enclosure attached server is meeting the second predetermined performance threshold; responsive to determining that the second enclosure attached server is meeting the second predetermined performance threshold, lending, by the enclosure, the predetermined amount of cache on the second enclosure attached server to the first enclosure attached server. | 09-11-2014 |
20140258633 | Sharing Cache In A Computing System - Sharing cache in a computing system that includes a plurality of enclosure attached servers, including: identifying, by an enclosure, a first enclosure attached server that is not meeting a first predetermined performance threshold; identifying, by the enclosure, a second enclosure attached server that is meeting a second predetermined performance threshold; blocking, by the enclosure, access to a predetermined amount of cache on the second enclosure attached server by the second enclosure attached server; determining, by the enclosure, whether the second enclosure attached server is meeting the second predetermined performance threshold; responsive to determining that the second enclosure attached server is meeting the second predetermined performance threshold, lending, by the enclosure, the predetermined amount of cache on the second enclosure attached server to the first enclosure attached server. | 09-11-2014 |
20140258634 | Allocating Enclosure Cache In A Computing System - Allocating enclosure cache in a computing system that includes an enclosure and a plurality of enclosure attached servers, including: receiving, by the enclosure, memory access information from each of the plurality of enclosure attached servers; determining, by the enclosure in dependence upon the memory access information, an amount of enclosure cache to allocate as shared cache that can be accessed by two or more of the enclosure attached servers; and determining, by the enclosure in dependence upon the memory access information, an amount of enclosure cache to allocate to each enclosure attached server for exclusive use by the enclosure attached server. | 09-11-2014 |
20140297960 | MULTI-CORE SYSTEM AND METHOD OF DATA CONSISTENCY - A system comprises a plurality of cores and a communication bus enabling the cores to communicate with one another, each core having a processor and at least one cache memory area. At least one core comprises a table of patterns storing a set of patterns, a pattern corresponding to a series of memory addresses associated with a digital data item made up of binary words stored at these addresses. This core also comprises means for mapping one of the memory addresses AdB of a digital data item to a pattern that is associated with it when said core needs to access this data item and means for transmitting a unique message for access to a digital data item located in the cache memory of at least one other core of the system, said message including the memory addresses that make up the pattern of the data item sought. | 10-02-2014 |
20140310474 | METHODS AND SYSTEMS FOR IMPLEMENTING TRANSCENDENT PAGE CACHING - A method of implementing a shared cache between a plurality of virtual machines may include maintaining the plurality of virtual machines on one or more physical machines. Each of the plurality of virtual machines may include a private cache. The method may also include determining portions of the private caches that are idle and maintaining a shared cache that comprises the portions of the private caches that are idle. The method may additionally include storing data associated with the plurality of virtual machines in the shared cache and load balancing use of the shared cache between the plurality of virtual machines. | 10-16-2014 |
20140310475 | ATOMIC EXECUTION OVER ACCESSES TO MULTIPLE MEMORY LOCATIONS IN A MULTIPROCESSOR SYSTEM - A method and central processing unit supporting atomic access of shared data by a sequence of memory access operations. A processor status flag is reset. A processor executes, subsequent to the resetting of the processor status flag, a sequence of program instructions with instructions accessing a subset of shared data contained within its local cache. During execution of the sequence of program instructions and in response to a modification by another processor of the subset of shared data, the processor status flag is set. Subsequent to the execution of the sequence of program instructions and based upon the state of the processor status flag, either a first program processing or a second program processing is executed. In some examples the first program processing includes storing results data into the local cache and the second program processing includes discarding the results data. | 10-16-2014 |
20140317353 | Method and Apparatus for Managing Write Back Cache - A network services processor includes an input/output bridge that avoids unnecessary updates to memory when cache blocks storing processed packet data are no longer required. The input/output bridge monitors requests to free buffers in memory received from cores and IO units in the network services processor. Instead of writing the cache block back to the buffer in memory that will be freed, the input/output bridge issues don't write back commands to a cache controller to clear the dirty bit for the selected cache block, thus avoiding wasteful write-backs from cache to memory. After the dirty bit is cleared, the buffer in memory is freed, that is, made available for allocation to store data for another packet. | 10-23-2014 |
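The don't-write-back flow of 20140317353 is easy to show in miniature. The C below is an illustrative sketch with invented names: freeing a packet buffer first clears the dirty bits on its cache blocks so no stale packet data is written back.

```c
/* Sketch: clear dirty bits ("don't write back") before freeing a buffer. */
#include <stdbool.h>
#include <stdio.h>

typedef struct { bool valid, dirty; } cache_block_t;

/* "Don't write back": drop the dirty state so eviction is silent. */
static void dont_write_back(cache_block_t *b) {
    b->dirty = false;
}

static void free_packet_buffer(cache_block_t *blocks, int n) {
    for (int i = 0; i < n; i++)
        dont_write_back(&blocks[i]);   /* issued before the buffer is freed */
    puts("buffer freed; no stale packet data will reach memory");
}

int main(void) {
    cache_block_t pkt[4] = {
        {true, true}, {true, true}, {true, false}, {true, true}
    };
    free_packet_buffer(pkt, 4);
    for (int i = 0; i < 4; i++)
        printf("block %d dirty=%d\n", i, pkt[i].dirty);
    return 0;
}
```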
20140325158 | MANAGING GLOBAL CACHE COHERENCY IN A DISTRIBUTED SHARED CACHING FOR CLUSTERED FILE SYSTEMS - Systems, methods, and computer program products are provided for managing global cache coherency in distributed shared caching for clustered file systems (CFS). The CFS manages access permissions to an entire space of data segments by using a distributed shared memory (DSM) module. In response to receiving a request to access one of the data segments, a calculation operation is performed for obtaining the most recent contents of one of the data segments. The calculation operation performs one of providing the most recent contents via communication with a remote DSM module which obtains the one of the data segments from an associated external cache memory, instructing by the DSM module to read from storage the one of the data segments, and determining that any existing contents of the one of the data segments in the local external cache are the most recent contents. | 10-30-2014 |
20140344523 | System and Method of Selective READ Cache Retention for a Rebooted Node of a Multiple-Node Storage Cluster - The disclosure is directed to a system and method for managing READ cache memory of at least one node of a multiple-node storage cluster. According to various embodiments, a cache data and a cache metadata are stored for data transfers between a respective node (hereinafter “first node”) and regions of a storage cluster. When the first node is disabled, data transfers are tracked between one or more active nodes of the plurality of nodes and cached regions of the storage cluster. When the first node is rebooted, at least a portion of valid cache data is retained based upon the tracked data transfers. Accordingly, local cache memory does not need to be entirely rebuilt each time a respective node is rebooted. | 11-20-2014 |
20140351523 | System and Method of Rebuilding READ Cache for a Rebooted Node of a Multiple-Node Storage Cluster - The disclosure is directed to a system and method for managing cache memory of at least one node of a multiple-node storage cluster. According to various embodiments, a first cache data and a first cache metadata are stored for data transfers between a respective node and regions of a storage cluster receiving at least a first selected number of data transfer requests. When the node is rebooted, a second (new) cache data is stored to replace the first (old) cache data. The second cache data is compiled utilizing the first cache metadata to identify previously cached regions of the storage cluster receiving at least a second selected number of data transfer requests after the node is rebooted. The second selected number of data transfer requests is less than the first selected number of data transfer requests to enable a rapid build of the second cache data. | 11-27-2014 |
20140351524 | DEAD BLOCK PREDICTORS FOR COOPERATIVE EXECUTION IN THE LAST LEVEL CACHE - A cache memory eviction method includes maintaining thread-aware cache access data per cache block in a cache memory, wherein the cache access data is indicative of a number of times a cache block is accessed by a first thread, associating a cache block with one of a plurality of bins based on cache access data values of the cache block, and selecting a cache block to evict from a plurality of cache block candidates based, at least in part, upon the bins with which the cache block candidates are associated. | 11-27-2014 |
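The bin-based eviction of 20140351524 can be sketched in a few lines. This is a toy version under assumptions: the bin boundaries, the candidate set, and the pick-from-coldest-bin rule are invented for illustration.

```c
/* Sketch: per-block access counts map blocks to bins; evict from the
 * coldest bin among the candidates. */
#include <stdio.h>

#define N_CANDIDATES 4

typedef struct { int id; unsigned accesses; } block_t;

static int bin_of(unsigned accesses) {
    if (accesses == 0) return 0;        /* likely dead */
    if (accesses < 4)  return 1;        /* lukewarm */
    return 2;                           /* hot */
}

static int pick_victim(const block_t *cand, int n) {
    int victim = 0;
    for (int i = 1; i < n; i++)
        if (bin_of(cand[i].accesses) < bin_of(cand[victim].accesses))
            victim = i;
    return cand[victim].id;
}

int main(void) {
    block_t cand[N_CANDIDATES] = {{10, 6}, {11, 0}, {12, 2}, {13, 9}};
    printf("evict block %d\n", pick_victim(cand, N_CANDIDATES)); /* block 11 */
    return 0;
}
```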
20140359225 | MULTI-CORE PROCESSOR AND MULTI-CORE PROCESSOR SYSTEM - Disclosed herein is a multi-core processor including: a plurality of processor cores; a shared data cache storing cache data previously accessed by at least one of the plurality of processor cores; and an address decoder comparing an address value of data required by at least one of the plurality of processor cores with a set address register value, and allowing at least one of the plurality of processor cores to access either the shared data cache or a separate memory storing non-cacheable data that is not stored in the shared data cache. | 12-04-2014 |
20150012710 | CACHE STICKINESS INDEX FOR CONTENT DELIVERY NETWORKING SYSTEMS - Various embodiments of the present disclosure relate to a cache stickiness index for providing measurable metrics associated with caches of a content delivery networking system. In one embodiment, a method for generating a cache stickiness index, including a cluster stickiness index and a region stickiness index, is disclosed. In embodiments, the cluster stickiness index is generated by comparing cache keys shared among a plurality of front-end clusters. In embodiments, the region stickiness index is generated by comparing cache keys shared among a plurality of data centers. In one embodiment, a system comprising means for generating a stickiness index is disclosed. | 01-08-2015 |
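The abstract above does not define the index formula, so the following is only one plausible reading: treat the keys cached by two front-end clusters as sets and report their overlap (shared keys over the union). Everything here, including the ratio itself, is an assumption for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Both key arrays are assumed sorted ascending and duplicate-free. */
static double stickiness_index(const uint64_t *a, size_t na,
                               const uint64_t *b, size_t nb) {
    size_t i = 0, j = 0, shared = 0;
    while (i < na && j < nb) {
        if (a[i] == b[j]) { shared++; i++; j++; }
        else if (a[i] < b[j]) i++;
        else j++;
    }
    size_t total = na + nb - shared;              /* size of the union */
    return total ? (double)shared / (double)total : 0.0;
}

int main(void) {
    uint64_t cluster1[] = {1, 2, 3, 5, 8};
    uint64_t cluster2[] = {2, 3, 5, 7};
    /* 3 shared keys out of 6 distinct keys -> 0.50 */
    printf("stickiness = %.2f\n", stickiness_index(cluster1, 5, cluster2, 4));
    return 0;
}
```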
20150012711 | SYSTEM AND METHOD FOR ATOMICALLY UPDATING SHARED MEMORY IN MULTIPROCESSOR SYSTEM - A system for operating a shared memory of a multiprocessor system includes a set of processor cores and a corresponding set of core local caches, a set of I/O devices and a corresponding set of I/O device local caches. Read and write operations performed on a core local cache, an I/O device local cache, and the shared memory are governed by a cache coherence protocol (CCP) that ensures that the shared memory is updated atomically. | 01-08-2015 |
20150019819 | PREFETCHING FOR MULTIPLE PARENT CORES IN A MULTI-CORE CHIP - Embodiments relate to a method and computer program product for prefetching data on a chip. The chip has at least one scout core, multiple parent cores that cooperate to execute various tasks, and a shared cache that is common between the scout core and the multiple parent cores. An aspect of the embodiments includes monitoring the multiple parent cores by the at least one scout core through the shared cache for a shared cache access occurring in a base parent core. The method includes saving a fetch address by the at least one scout core based on the shared cache access occurring. The fetch address indicates the location of a specific line of cache requested by the base parent core. | 01-15-2015 |
20150019820 | PREFETCHING FOR A PARENT CORE IN A MULTI-CORE CHIP - Embodiments of the invention relate to prefetching data on a chip having at least one scout core, at least one parent core, and a shared cache that is common between the at least one scout core and the at least one parent core. A prefetch code is executed by the scout core for monitoring the parent core. The prefetch code executes independently from the parent core. The scout core determines that at least one specified data pattern has occurred in the parent core based on monitoring the parent core. A prefetch request is sent from the scout core to the shared cache. The prefetch request is sent based on the at least one specified pattern being detected by the scout core. A data set indicated by the prefetch request is sent to the parent core by the shared cache. | 01-15-2015 |
20150019821 | SPECIFIC PREFETCH ALGORITHM FOR A CHIP HAVING A PARENT CORE AND A SCOUT CORE - Embodiments relate to a method and computer program product for prefetching data on a chip having at least one scout core and a parent core. The method includes saving a prefetch code start address by the parent core. The prefetch code start address indicates where a prefetch code is stored. The prefetch code is specifically configured for monitoring the parent core based on a specific application being executed by the parent core. The method includes sending a broadcast interrupt signal by the parent core to the at least one scout core; the broadcast interrupt signal is sent based on the prefetch code start address being saved. The method includes monitoring the parent core by the prefetch code executed by the at least one scout core. The scout core executes the prefetch code based on receiving the broadcast interrupt signal. | 01-15-2015 |
20150039834 | SHARING LOCAL CACHE FROM A FAILOVER NODE - Sharing local cache from a failover node, including: determining, by a managing compute node, whether a first compute node and a second compute node each have a local cache, where the second compute node is a mirrored copy of the first compute node; responsive to determining that the first compute node and the second compute node each have a local cache, combining, by the managing compute node, the local cache on the first compute node and the local cache on the second compute node into a unified logical cache; receiving, by the managing compute node, a memory access request; and sending, by the managing compute node, the memory access request to the appropriate local cache in the unified logical cache. | 02-05-2015 |
20150052311 | MANAGEMENT OF TRANSACTIONAL MEMORY ACCESS REQUESTS BY A CACHE MEMORY - In a data processing system having a processor core and a shared memory system including a cache memory that supports the processor core, a transactional memory access request is issued by the processor core in response to execution of a memory access instruction in a memory transaction undergoing execution by the processor core. In response to receiving the transactional memory access request, dispatch logic of the cache memory evaluates the transactional memory access request for dispatch, where the evaluation includes determining whether the memory transaction has a failing transaction state. In response to determining the memory transaction has a failing transaction state, the dispatch logic refrains from dispatching the memory access request for service by the cache memory and refrains from updating at least replacement order information of the cache memory in response to the transactional memory access request. | 02-19-2015 |
20150052312 | PROTECTING THE FOOTPRINT OF MEMORY TRANSACTIONS FROM VICTIMIZATION - A processing unit includes a processor core and a cache memory. Entries in the cache memory are grouped in multiple congruence classes. The cache memory includes tracking logic that tracks a transaction footprint including cache line(s) accessed by transactional memory access request(s) of a memory transaction. The cache memory, responsive to receiving a memory access request that specifies a target cache line having a target address that maps to a congruence class, forms a working set of ways in the congruence class containing cache line(s) within the transaction footprint and updates a replacement order of the cache lines in the congruence class. Based on membership of the at least one cache line in the working set, the update promotes at least one cache line that is not the target cache line to a replacement order position in which the at least one cache line is less likely to be replaced. | 02-19-2015 |
20150067263 | SERVICE PROCESSOR PATCH MECHANISM - A microprocessor includes a plurality of processing cores, a service processing unit and a memory accessible by both the service processing unit and the plurality of processing cores. At least one of the plurality of processing cores is configured to write a patch to the memory. The patch comprises one or more instructions to be fetched from the memory and executed by the service processing unit after being written to the memory by the at least one of the plurality of processing cores. | 03-05-2015 |
20150081976 | CACHING FOR HETEROGENEOUS PROCESSORS - A multi-core processor providing heterogeneous processor cores and a shared cache is presented. | 03-19-2015 |
20150081977 | EXTENDING A CACHE COHERENCY SNOOP BROADCAST PROTOCOL WITH DIRECTORY INFORMATION - In one embodiment, a method includes receiving a read request for a memory location from a first caching agent, determining whether a directory entry associated with the memory location indicates that the information is not present in a remote caching agent, and if so, transmitting the information from the memory location to the first caching agent before snoop processing with respect to the read request is completed. Other embodiments are described and claimed. | 03-19-2015 |
20150089145 | MULTIPLE CORE PROCESSING WITH HIGH THROUGHPUT ATOMIC MEMORY OPERATIONS - A processor comprising multiple processor cores and a bus for exchanging data between the multiple processor cores is disclosed. Each of the multiple processor cores includes: at least one processor register; a cache for storing at least one cache line of memory; a load store unit for executing a memory command to exchange data between the cache and the at least one processor register; an atomic memory operation unit for executing an atomic memory operation on the at least one cache line of memory; and a high throughput register for storing a status indicating a high throughput or a normal status. If the atomic memory operation status is the high throughput status, the load store unit is operable to transfer the atomic memory operation, using the bus, to the atomic memory operation unit of a designated processor core. | 03-26-2015 |
20150095580 | SCALABLE MECHANISM TO IMPLEMENT AN INSTRUCTION THAT MONITORS FOR WRITES TO AN ADDRESS - A processor includes a cache-side address monitor unit corresponding to a first cache portion of a distributed cache that has a total number of cache-side address monitor storage locations less than a total number of logical processors of the processor. Each cache-side address monitor storage location is to store an address to be monitored. A core-side address monitor unit corresponds to a first core and has the same number of core-side address monitor storage locations as the number of logical processors of the first core. Each core-side address monitor storage location is to store an address and a monitor state for a different corresponding logical processor of the first core. A cache-side address monitor storage overflow unit corresponds to the first cache portion and is to enforce an address monitor storage overflow policy when no unused cache-side address monitor storage location is available to store an address to be monitored. | 04-02-2015 |
20150095581 | DATA CACHING POLICY IN MULTIPLE TENANT ENTERPRISE RESOURCE PLANNING SYSTEM - A cache manager application provides a data caching policy in a multiple tenant enterprise resource planning (ERP) system. The cache manager application manages multiple tenant caches in a single process. The application applies the caching policy. The caching policy optimizes system performance compared to local cache optimization. As a result, tenants with high cache consumption receive a larger portion of caching resources. | 04-02-2015 |
20150143051 | Providing Common Caching Agent For Core And Integrated Input/Output (IO) Module - In one embodiment, the present invention includes a multicore processor having a plurality of cores, a shared cache memory, an integrated input/output (IIO) module to interface between the multicore processor and at least one IO device coupled to the multicore processor, and a caching agent to perform cache coherency operations for the plurality of cores and the IIO module. Other embodiments are described and claimed. | 05-21-2015 |
20150301943 | METHOD AND DEVICE FOR PROCESSING DATA - A method and device are provided for processing data. The method includes, after receiving data input by a data bus, according to a destination indication of the data and a valid bit field indication of the data, writing the data input by the data bus into an uplink side shared cache, polling the uplink side shared cache according to a fixed timeslot order, reading out the data in the uplink side shared cache, and outputting the data to respective corresponding channels. The method and device enable effective saving of cache resources, reduction of pressure on area and timing, and improvement of cache utilization while reliably achieving data cache and bit width conversion. | 10-22-2015 |
20150301954 | SYSTEMS AND METHODS FOR ACCESSING A UNIFIED TRANSLATION LOOKASIDE BUFFER - Systems and methods for accessing a unified translation lookaside buffer (TLB) are disclosed. A method includes receiving an indicator of a level one translation lookaside buffer (L1TLB) miss corresponding to a request for a virtual address to physical address translation, searching a cache that includes virtual addresses and page sizes that correspond to translation table entries (TTEs) that have been evicted from the L1TLB, where a page size is identified, and searching a second level TLB and identifying a physical address that is contained in the second level TLB. Access is provided to the identified physical address. | 10-22-2015 |
20150309931 | PERSISTENT MEMORY FOR PROCESSOR MAIN MEMORY - Subject matter disclosed herein relates to a system of one or more processors that includes persistent memory. | 10-29-2015 |
20150324288 | SYSTEM AND METHOD FOR IMPROVING SNOOP PERFORMANCE - The present disclosure is directed to hardware hash tables, and more specifically, to generation of a cache coherent system such as in a Network on Chip (NoC). The present disclosure is further directed to a directory structure that includes a new field, referred to, for instance, as an encoded value, which indicates the original owner of a dirty line. Because the original holder may have held or modified the original line, tracking the original holder lets example implementations track the agents that are potentially dirty: the encoded value identifies the agent holding the most recent unique copy of the line, which can then be shared with the other agents. | 11-12-2015 |
20150331795 | MEMORY ACCESS TRACING METHOD - A method for identifying, in a system including two or more computing devices that are able to communicate with each other, each computing device having a cache and being connected to a corresponding memory, a computing device accessing one of the memories. The method includes monitoring memory access to any of the memories; monitoring cache coherency commands between computing devices; and identifying the computing device accessing one of the memories by using information related to the memory accesses and cache coherency commands. | 11-19-2015 |
20150339229 | APPARATUS AND METHOD FOR DETERMINING A SECTOR DIVISION RATIO OF A SHARED CACHE MEMORY - An apparatus includes a shared cache memory and a controller. The shared cache memory is configured to be divided into sectors by assigning one or more ways to each sector in accordance with a reusability level of data. The controller changes a sector division ratio indicating the ratio between the way counts of the divided sectors of the shared cache memory, where a way count is the number of ways assigned to a sector. When first and second jobs are being executed in parallel, in response to a designation of a program of the second job, the controller calculates the sector division ratio based on a data access amount, including the size and access count of data accessed by the first and second jobs, and the volume of the shared cache memory, and changes the sector division ratio of the shared cache memory to the calculated ratio. | 11-26-2015 |
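One way to picture the ratio calculation in the entry above is a proportional split of the cache's ways between two sectors, weighted by each job's measured access volume. The proportional rule and the one-way floor below are assumptions; the patent's actual calculation also weighs data sizes against cache volume.

```c
#include <stdio.h>

/* Return the number of ways (out of total_ways) to give sector 1; the
 * remainder goes to sector 2. Each sector keeps at least one way. */
static int ways_for_sector1(long access1, long access2, int total_ways) {
    long total = access1 + access2;
    if (total == 0) return total_ways / 2;        /* no data: split evenly */
    int w = (int)((access1 * total_ways) / total); /* proportional share */
    if (w < 1) w = 1;
    if (w > total_ways - 1) w = total_ways - 1;
    return w;
}

int main(void) {
    /* Job 1 makes 3x the accesses of job 2 across a 16-way cache. */
    int w1 = ways_for_sector1(300000, 100000, 16);
    printf("sector1: %d ways, sector2: %d ways\n", w1, 16 - w1);
    return 0;
}
```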
20150339230 | MANAGING OUT-OF-ORDER MEMORY COMMAND EXECUTION FROM MULTIPLE QUEUES WHILE MAINTAINING DATA COHERENCY - Responsive to selecting a particular queue from among at least two queues in which to place an incoming event, within a particular entry from among multiple entries that are ordered upon arrival and that each comprise a separate collision vector, a memory address for the incoming event is compared with each queued memory address for each queued event in the entries of the at least one other queue. Responsive to the memory address for the incoming event matching at least one particular queued memory address for at least one particular queued event in the at least one other queue, at least one particular bit is set in the particular collision vector for the particular entry, in at least one bit position corresponding to the row entry position of the at least one matching queued memory address within the other entries. | 11-26-2015 |
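The collision-vector mechanism in the entry above reduces to a bit-per-entry address match, sketched below in C: when an event enters one queue, its cache-line address is compared against every valid entry of the other queue, and matching positions are recorded. Field names and the 64-byte line size are assumptions.

```c
#include <stdint.h>
#include <stddef.h>

#define QUEUE_DEPTH 8
#define LINE_MASK   (~(uint64_t)63)   /* assume 64-byte cache lines */

typedef struct {
    uint64_t addr;
    uint8_t  valid;
    uint8_t  collision;   /* one bit per entry position of the other queue */
} mem_event_t;

/* Build the collision vector for a newly arriving event by comparing its
 * line address against every valid entry in the other queue. The event may
 * not issue until all of these bits have cleared, preserving coherency. */
static uint8_t build_collision_vector(uint64_t new_addr,
                                      const mem_event_t *other, size_t n) {
    uint8_t vec = 0;
    for (size_t i = 0; i < n && i < QUEUE_DEPTH; i++) {
        if (other[i].valid &&
            (other[i].addr & LINE_MASK) == (new_addr & LINE_MASK))
            vec |= (uint8_t)(1u << i);
    }
    return vec;
}
```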
20150347300 | SYNCHRONIZING UPDATES OF PAGE TABLE STATUS INDICATORS IN A MULTIPROCESSING ENVIRONMENT - A synchronization capability to synchronize updates to page tables by forcing updates in cached entries to be made visible in memory (i.e., in in-memory page table entries). A synchronization instruction is used that ensures after the instruction has completed that updates to the cached entries that occurred prior to the synchronization instruction are made visible in memory. Synchronization may be used to facilitate memory management operations, such as bulk operations used to change a large section of memory to read-only, operations to manage a free list of memory pages, and/or operations associated with terminating processes. | 12-03-2015 |
20150347322 | SPECULATIVE QUERYING OF THE MAIN MEMORY OF A MULTIPROCESSOR SYSTEM - A method of accessing data in a multiprocessor system, wherein the system includes a plurality of processors, each processor being associated with a respective cache memory, a cache memory management module, a main memory and a main memory management module. The method includes: receiving, by the cache memory management module, an initial request for access to data by a processor; first transmitting, by the cache memory management module, a first request with respect to the data to at least one cache memory; second transmitting, in parallel with the first transmitting, by the cache memory management module, a second request with respect to the data to the main memory management module; checking, by the main memory management module, whether to initiate querying of the main memory; and querying or not querying, by the main memory management module, the main memory in accordance with the checking. | 12-03-2015 |
20150363312 | ELECTRONIC SYSTEM WITH MEMORY CONTROL MECHANISM AND METHOD OF OPERATION THEREOF - An electronic system includes: a second memory module; a first memory module coupled to the second memory module; and a multicast controller for managing a cache on the first memory module for the second memory module. | 12-17-2015 |
20150378631 | TRANSACTIONAL MEMORY OPERATIONS WITH READ-ONLY ATOMICITY - Execution of a transaction mode setting instruction causes a computer processor to be in an atomic read-only mode ignoring conflicts to certain write-sets of a transaction during transactional execution. Read-set conflicts may still cause a transactional abort. Absent any aborting, the transaction's execution may complete, by committing transactional stores to memory and updating architecture states. | 12-31-2015 |
20150378632 | TRANSACTIONAL MEMORY OPERATIONS WITH WRITE-ONLY ATOMICITY - Execution of a transaction mode setting instruction causes a computer processor to be in an atomic write-only mode ignoring conflicts to certain read-sets of a transaction during transactional execution. Write-set conflicts may still cause a transactional abort. Absent any aborting, the transaction's execution may complete, by committing transactional stores to memory and updating architecture states. | 12-31-2015 |
20150378895 | DETECTING CACHE CONFLICTS BY UTILIZING LOGICAL ADDRESS COMPARISONS IN A TRANSACTIONAL MEMORY - A processor in a multi-processor configuration is configured to perform dynamic address translation from logical addresses to real addresses and to detect memory conflicts for shared logical memory in transactional memory based on logical (virtual) address comparisons. | 12-31-2015 |
20150378896 | COLLECTING MEMORY OPERAND ACCESS CHARACTERISTICS DURING TRANSACTIONAL EXECUTION - A transactional execution of a set of instructions in a transaction of a program may be initiated to collect memory operand access characteristics of a set of instructions of a transaction during the transactional execution. The memory operand access characteristics may be stored upon a termination of the transactional execution of the set of instructions. The memory operand access characteristics may include: an address of an accessed storage location; a count of a number of times the storage location is accessed; a purpose value indicating whether the storage location is accessed for a fetch, store, or update operation; a count of a number of times the storage location is accessed for one or more of a fetch, store, or update operation; a translation mode in which the storage location is accessed; and an addressing mode. | 12-31-2015 |
20150378897 | SPECULATION CONTROL FOR IMPROVING TRANSACTION SUCCESS RATE, AND INSTRUCTION THEREFOR - Throttling instruction execution in a transaction operating in a pipelined processor configured to execute memory instructions out-of-order, wherein memory instructions are instructions for accessing operands in memory, is provided. Included is executing, by the processor, instructions of a transaction, comprising determining whether the transaction is in throttling mode and, based on the transaction being in throttling mode, executing memory instructions in program order. Also included is, based on the transaction not being in throttling mode, executing memory instructions out of program order. | 12-31-2015 |
20150378898 | TRANSACTIONAL EXECUTION PROCESSOR HAVING A CO-PROCESSOR ACCELERATOR, BOTH SHARING A HIGHER LEVEL CACHE - A higher level shared cache of a hierarchical cache of a multi-processor system utilizes transaction identifiers to manage memory conflicts in corresponding transactions. The higher level cache is shared with two or more processors. A processor may have a corresponding accelerator that performs operations on behalf of the processor. Transaction indicators are set in the higher level cache corresponding to the cache lines being accessed. The transaction aborts if a memory conflict with the transaction's cache lines from another transaction is detected, and the corresponding cache lines are invalidated. For a successfully completing transaction, the corresponding cache lines are committed and the data from store operations is stored. | 12-31-2015 |
20150378899 | TRANSACTIONAL EXECUTION PROCESSOR HAVING A CO-PROCESSOR ACCELERATOR, BOTH SHARING A HIGHER LEVEL CACHE - A higher level shared cache of a hierarchical cache of a multi-processor system utilizes transaction identifiers to manage memory conflicts in corresponding transactions. The higher level cache is shared with two or more processors. A processor may have a corresponding accelerator that performs operations on behalf of the processor. Transaction indicators are set in the higher level cache corresponding to the cache lines being accessed. The transaction aborts if a memory conflict with the transaction's cache lines from another transaction is detected, and the corresponding cache lines are invalidated. For a successfully completing transaction, the corresponding cache lines are committed and the data from store operations is stored. | 12-31-2015 |
20150378905 | CO-PROCESSOR MEMORY ACCESSES IN A TRANSACTIONAL MEMORY - Monitoring, by a processor having a cache, addresses accessed by a co-processor associated with the processor during transactional execution of a transaction by the processor. The processor executes a transactional memory (TM) transaction, including receiving, by the processor, a memory address range of data that a co-processor may access to perform a co-processor operation. The processor saves the memory address range. Based on receiving, by the processor, a cache coherency request that conflicts with the saved address range, the processor aborts the TM transaction. | 12-31-2015 |
20150378912 | SPECULATION CONTROL FOR IMPROVING TRANSACTION SUCCESS RATE, AND INSTRUCTION THEREFOR - Throttling instruction execution in a transaction operating in a pipelined processor configured to execute memory instructions out-of-order, wherein memory instructions are instructions for accessing operands in memory, is provided. Included is executing, by the processor, instructions of a transaction, comprising determining whether the transaction is in throttling mode and, based on the transaction being in throttling mode, executing memory instructions in program order. Also included is, based on the transaction not being in throttling mode, executing memory instructions out of program order. | 12-31-2015 |
20150378915 | MEMORY PERFORMANCE WHEN SPECULATION CONTROL IS ENABLED, AND INSTRUCTION THEREFOR - Throttling execution in a transaction operating in a pipelined processor configured to execute memory instructions out-of-program-order, wherein memory instructions are instructions for accessing operands in memory. Included is executing instructions of a transaction. Also included is determining whether the transaction is in throttling mode and, based on determining that a transaction is in throttling mode, executing memory instructions in-program-order and dynamically prefetching memory operands of memory instructions. | 12-31-2015 |
20150378916 | MITIGATING BUSY TIME IN A HIGH PERFORMANCE CACHE - Various embodiments mitigate busy time in a hierarchical store-through memory cache structure including a cache directory associated with a memory cache. The cache directory is divided into a plurality of portions, each associated with a portion of the memory cache. A determination is made that a first subpipe of a shared cache pipeline comprises a non-store request; the shared pipeline is communicatively coupled to the plurality of portions of the cache directory. A store command is prevented from being placed in a second subpipe of the shared cache pipeline based on determining that the first subpipe of the shared cache pipeline comprises a non-store request. Simultaneous cache lookup operations on the plurality of portions of the cache directory and cache write operations are supported. Two or more store commands can be simultaneously processed in a shared cache pipeline communicatively coupled to the plurality of portions of the cache directory. | 12-31-2015 |
20160004640 | SALVAGING LOCK ELISION TRANSACTIONS - A transactional memory system salvages a hardware lock elision (HLE) transaction. A processor of the transactional memory system executes a lock-acquire instruction in an HLE environment and records information about a lock elided to begin HLE transactional execution of a code region. The processor detects a pending point of failure in the code region during the HLE transactional execution. The processor stops HLE transactional execution at the point of failure in the code region. The processor acquires the lock using the information, and based on acquiring the lock, commits the speculative state of the stopped HLE transactional execution. The processor starts non-transactional execution at the point of failure in the code region. | 01-07-2016 |
20160004641 | SALVAGING LOCK ELISION TRANSACTIONS - A transactional memory system salvages hardware lock elision (HLE) transactions. A computer system of the transactional memory system records information about locks elided to begin HLE transactional execution of first and second transactional code regions. The computer system detects a pending cache line conflict of a cache line, and based on the detecting stops execution of the first code region of the first transaction and the second code region of the second transaction. The computer system determines that the first lock and the second lock are different locks and uses the recorded information about locks elided to acquire the first lock of the first transaction and the second lock of the second transaction. The computer system commits speculative state of the first transaction and the second transaction and the computer system continues execution of the first code region and the second code region non-transactionally. | 01-07-2016 |
20160011975 | Dynamically Controlling Cache Size To Maximize Energy Efficiency | 01-14-2016 |
20160062887 | FLEXIBLE ARBITRATION SCHEME FOR MULTI ENDPOINT ATOMIC ACCESSES IN MULTICORE SYSTEMS - The MSMC (Multicore Shared Memory Controller) is a module designed to manage traffic between multiple processor cores, other mastering peripherals or DMA, and the EMIF (External Memory InterFace) in a multicore SoC. The invention unifies all transaction sizes belonging to a slave prior to arbitrating the transactions, in order to reduce the complexity of the arbitration process and to provide optimum bandwidth management among all masters. Two consecutive slots are assigned per cache line access to automatically guarantee the atomicity of all transactions within a single cache line. The need for synchronization among all the banks of a particular SRAM is eliminated, as synchronization is accomplished by assigning back-to-back slots. | 03-03-2016 |
20160070658 | MULTI-LEVEL, HARDWARE-ENFORCED DOMAIN SEPARATION USING A SEPARATION KERNEL ON A MULTICORE PROCESSOR WITH A SHARED CACHE - A separation kernel isolating memory domains within a shared system memory is executed on the cores of a multicore processor having hardware security enforcement for static virtual address mappings, to implement an efficient embedded multi-level security system. Shared caches are either disabled or constrained by the same static virtual address mappings using the hardware security enforcement available, to isolate domains accessible to select cores and reduce security risks from data co-mingling. | 03-10-2016 |
20160085460 | OPTIMIZED READ ACCESS TO SHARED DATA VIA MONITORING OF MIRRORING OPERATIONS - A method and system for optimized read access to shared data via monitoring of mirroring operations are described. A data storage system performs operations that include one controller in a dual-controller host storage appliance in an asymmetric active/active configuration receiving a request from the host for data on a logical unit number owned by the partner controller. The receiving controller, which has a mirror cache of the partner controller's memory for failure recovery, accesses the mirror cache using a data structure that was populated during previous mirror operations. If the data is found in the mirror cache, it is read from the cache and returned to the requesting host without having to contact the partner controller for the data. | 03-24-2016 |
20160085675 | Utilization of Processor Capacity at Low Operating Frequencies - In an embodiment, a processor includes one or more cores including a first core operable at an operating voltage between a minimum operating voltage and a maximum operating voltage. The processor also includes a power control unit including first logic to enable coupling of ancillary logic to the first core responsive to the operating voltage being less than or equal to a threshold voltage, and to disable the coupling of the ancillary logic to the first core responsive to the operating voltage being greater than the threshold voltage. Other embodiments are described and claimed. | 03-24-2016 |
20160092356 | INTEGRATED PAGE-SHARING CACHE - In an embodiment, a method can include storing a plurality of volumes on persistent media. A set of the volumes can store at least one portion of a same copy of data. The method can further include caching the set of the volumes as a single group. In an embodiment, the plurality of volumes can include at least one of drives, snapshots, clones and replicas. | 03-31-2016 |
20160110283 | ON-DEMAND EXPANSION OF SYNCHRONIZATION PRIMITIVES - Disclosed are techniques and systems for providing on-demand expansion of a non-cache-aware synchronization primitive to a cache-aware form. The expansion may occur on-demand when it becomes necessary to do so for performance and throughput purposes. Expansion of the synchronization primitive may be based at least in part on a level of cache-line contention resulting from operations on the non-cache-aware synchronization primitive. The synchronization primitive in the expanded (cache-aware) form may be represented by a data structure that allocates individual cache lines to respective processors of a multiprocessor system in which the synchronization primitive is implemented. Once expanded, the cache-aware synchronization primitive may be contracted to its non-cache-aware form. | 04-21-2016 |
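The expanded, cache-aware form described in the entry above is closely related to a well-known layout: give each processor its own cache-line-sized slot so contending CPUs stop bouncing a single line between caches. Below is a minimal sketch of the two forms, assuming 64-byte lines; it mirrors the general technique, not the patent's exact data structure.

```c
#include <stdatomic.h>

#define CACHE_LINE 64
#define MAX_CPUS   16

/* Compact (non-cache-aware) form: all CPUs contend on one cache line. */
typedef struct {
    atomic_int locked;
} compact_lock_t;

/* Expanded (cache-aware) form: one slot per CPU, each padded and aligned
 * to its own cache line, so each CPU spins locally. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_int busy;
    char pad[CACHE_LINE - sizeof(atomic_int)];
} percpu_slot_t;

typedef struct {
    percpu_slot_t slot[MAX_CPUS];
} expanded_lock_t;
```

Expansion would swap the compact form for the per-CPU array once cache-line contention on the single word is detected, and contraction would reverse the swap when the pressure subsides, matching the on-demand behavior the abstract describes.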
20160117246 | METHOD AND APPARATUS FOR CROSS-CORE COVERT CHANNEL - Passing messages between two virtual machines that share a single multicore processor with an inclusive cache includes using a cache-based covert channel. A message bit in a first machine is transmitted as a lowest-level cache flush. The cache flush in the first machine clears an L1 cache in the second machine because of the inclusiveness property of the multicore processor cache. The second machine reads its cache and records the access time. If the access time is long, the cache was previously cleared and a logical 1 was sent by the first machine; a short access time is interpreted as a logical 0 by the second machine. By sending many bits, a message can be sent from the first virtual machine to the second virtual machine via the cache-based covert channel without using non-cache memory as a covert channel. | 04-28-2016 |
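The receiver's timing test in the entry above is essentially flush-and-reload: time one load and classify it as a hit or a miss. The x86 sketch below shows a generic sender/receiver pair over a shared read-only line; the 150-cycle threshold is an assumption that would need calibration on real hardware, and serializing fences are omitted for brevity.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp, _mm_clflush */

#define MISS_THRESHOLD 150   /* assumed cycle cutoff between hit and miss */

/* Receiver: time one load of the shared line. A slow load means the line
 * was flushed out of the cache hierarchy, i.e., the sender sent a 1. */
static int read_bit(volatile const uint8_t *probe) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*probe;                        /* timed access to the shared line */
    uint64_t t1 = __rdtscp(&aux);
    return (t1 - t0) > MISS_THRESHOLD;
}

/* Sender: flush the line to send a 1, or touch it (keeping it cached)
 * to send a 0. */
static void send_bit(volatile const uint8_t *probe, int bit) {
    if (bit) _mm_clflush((const void *)probe);
    else     (void)*probe;
}
```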
20160124643 | DIRECT NON-VOLATILE CACHE ACCESS ACROSS DEVICES - A system and method of providing direct data access between a non-volatile cache and a set of storage devices in a computing system. A system is disclosed that includes: a processing core embedded in a controller card that controls a non-volatile cache system; and a direct access manager for directing the processing core, wherein the direct access manager includes: a switch configuration system that includes logic to control a switch for either a direct access mode or a CPU access mode, wherein the switch couples each of the storage devices, a local bus, and the non-volatile cache system; a command output system that includes logic to output data transfer commands; and a data transfer system that includes logic to manage the flow of data directly between the non-volatile memory and the set of storage devices; and an arbitrator that arbitrates data traffic flow through the switch. | 05-05-2016 |
20160124651 | METHOD FOR PERFORMING RANDOM READ ACCESS TO A BLOCK OF DATA USING PARALLEL LUT READ INSTRUCTION IN VECTOR PROCESSORS - This invention addresses the problem of parallelizing random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look-up tables, moves data from main memory into each parallel look-up table, and then employs a look-up table read instruction to simultaneously move data from each parallel look-up table into a corresponding part of a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. The vector destination register load can be repeated if the tables store further data to be used, and new data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory; the look-up tables are stored in the directly addressable memory. | 05-05-2016 |
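A scalar C model of the parallel-table data movement the entry above describes: one gather step fills each lane of the vector destination from its own per-lane table. A real implementation would use the vector processor's LUT read instruction in a single cycle; this loop only models what that instruction does. Lane count and table size are assumptions.

```c
#include <stdint.h>

#define LANES      8
#define TABLE_SIZE 256

/* One table per vector lane, pre-loaded from main memory into the
 * directly addressable portion of level-one memory. */
static uint16_t lut[LANES][TABLE_SIZE];

/* Model of the parallel LUT read instruction: each lane of the vector
 * destination register is filled from its own table, all lanes
 * conceptually in parallel. */
static void lut_read(const uint8_t idx[LANES], uint16_t dst[LANES]) {
    for (int lane = 0; lane < LANES; lane++)
        dst[lane] = lut[lane][idx[lane]];
}
```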
20160132431 | STORE CACHE FOR TRANSACTIONAL MEMORY - A method to merge one or more non-transactional stores and one or more thread-specific transactional stores into one or more cache line templates in a store buffer in a store cache. The method receives a thread-specific non-transactional store address and first data, maps the store address to a first cache line template, and merges the first data into the first cache line template, according to a store policy. The method further receives a thread-specific transactional store address and second data, maps the store address to a second cache line template, and merges the second data into the second cache line template, according to a store policy. The method further writes back a copy of a cache line template to a cache and invalidates a third cache line template, which frees the third cache line template from a store address mapping. | 05-12-2016 |
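A sketch of the merge step the entry above describes: byte stores accumulate into a cache-line-sized template, with a per-byte valid mask recording which bytes have been merged. The structure and names are assumptions, and the sketch also assumes a store never crosses a line boundary.

```c
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 64

typedef struct {
    uint64_t line_addr;          /* address of the cache line being built */
    uint8_t  data[LINE_BYTES];   /* merged store data */
    uint64_t byte_valid;         /* one valid bit per byte of the line */
    int      in_use;
} line_template_t;

/* Merge a store of `len` bytes at `addr` into its line template; assumes
 * the store lies entirely within one 64-byte line. */
static void merge_store(line_template_t *t, uint64_t addr,
                        const uint8_t *src, unsigned len) {
    unsigned off = (unsigned)(addr & (LINE_BYTES - 1));
    memcpy(&t->data[off], src, len);
    for (unsigned i = 0; i < len; i++)
        t->byte_valid |= 1ull << (off + i);   /* mark merged bytes valid */
}
```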
20160147654 | CACHE MEMORY WITH UNIFIED TAG AND SLICED DATA - A cache memory is shared by N cores of a processor. The cache memory includes a unified tag part and a sliced data part partitioned into N data slices. Each data slice of the N data slices is physically local to a respective one of the N cores and physically remote from the other N−1 cores. N is an integer greater than one. For each core of the N cores, the cache memory biases allocations caused by the core towards a physically local slice of the core. The physically local slice is one of the N data slices and is physically local to the core. | 05-26-2016 |
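The allocation bias in the entry above can be modeled as a slice-selection function: most fills triggered by a core land in its physically local data slice, with an occasional fill spread by address hash so data is not wholly private to one slice. The 7-of-8 ratio and the hash below are invented for illustration; the patent does not specify a ratio.

```c
#include <stdint.h>

#define N_SLICES 4

/* Choose which data slice receives a newly allocated line. `alloc_seq`
 * is a per-core allocation counter; 7 of every 8 allocations stay in the
 * requesting core's local slice, the 8th spreads by address hash. */
static unsigned choose_slice(unsigned core_id, uint64_t line_addr,
                             unsigned alloc_seq) {
    if ((alloc_seq & 7u) != 0)
        return core_id % N_SLICES;                 /* biased local fill */
    return (unsigned)((line_addr >> 6) % N_SLICES); /* occasional spread */
}
```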
20160147655 | GENERATING APPROXIMATE USAGE MEASUREMENTS FOR SHARED CACHE MEMORY SYSTEMS - Generating approximate usage measurements for shared cache memory systems is disclosed. In one aspect, a cache memory system is provided. The cache memory system comprises a shared cache memory system. A subset of the shared cache memory system comprises a Quality of Service identifier (QoSID) tracking tag configured to store a QoSID tracking indicator for a QoS class. The shared cache memory system further comprises a cache controller configured to receive a memory access request comprising a QoSID, and is configured to access a cache line corresponding to the memory access request. The cache controller is also configured to determine whether the QoSID of the memory access request corresponds to a cache line assigned to the QoSID. If so, the cache controller is additionally configured to update the QoSID tracking tag. | 05-26-2016 |
20160147656 | PROVIDING SHARED CACHE MEMORY ALLOCATION CONTROL IN SHARED CACHE MEMORY SYSTEMS - Providing shared cache memory allocation control in shared cache memory systems is disclosed. In one aspect, a cache controller of a shared cache memory system comprising a plurality of cache lines is provided. The cache controller comprises a cache allocation circuit providing a minimum mapping bitmask for mapping a Quality of Service (QoS) class to a minimum partition of the cache lines, and a maximum mapping bitmask for mapping the QoS class to a maximum partition of the cache lines. The cache allocation circuit receives a memory access request comprising a QoS identifier (QoSID) of the QoS class, and is configured to determine whether the memory access request corresponds to a cache line of the plurality of cache lines. If not, the cache allocation circuit selects, as a target partition, the minimum partition mapped to the QoS class or the maximum partition mapped to the QoS class. | 05-26-2016 |
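The two bitmasks in the entry above are straightforward to model: each QoS class carries a guaranteed minimum way mask and a permissive maximum way mask, and allocation chooses between them. The pressure heuristic and the way-picking placeholder below are assumptions for illustration.

```c
#include <stdint.h>

typedef struct {
    uint32_t min_ways;   /* ways this class may always allocate into */
    uint32_t max_ways;   /* ways this class may use under low pressure */
} qos_mask_t;

/* On a miss, pick the way mask to allocate from: use the generous maximum
 * partition under low pressure, fall back to the guaranteed minimum when
 * the set is contended by other QoS classes. */
static uint32_t target_ways(const qos_mask_t *q, int set_under_pressure) {
    return set_under_pressure ? q->min_ways : q->max_ways;
}

/* Placeholder victim choice: the lowest-numbered way the mask allows; a
 * real controller would run its replacement policy restricted to the mask. */
static int pick_way(uint32_t mask) {
    for (int w = 0; w < 32; w++)
        if (mask & (1u << w)) return w;
    return -1;   /* empty mask: no way may be allocated */
}
```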
20160147661 | CONFIGURATION BASED CACHE COHERENCY PROTOCOL SELECTION - The topology of clusters of processors of a computer configuration, configured to support any of a plurality of cache coherency protocols, is discovered at initialization time to determine which one of the plurality of cache coherency protocols is to be used to handle coherency requests of the configuration. | 05-26-2016 |
20160154734 | CACHE MEMORY DEVICE AND ELECTRONIC SYSTEM INCLUDING THE SAME | 06-02-2016 |
20160154735 | ELECTRONIC DEVICE AND METHOD FOR CONTROLLING SHAREABLE CACHE MEMORY THEREOF | 06-02-2016 |
20160170880 | MULTICAST TREE-BASED DATA DISTRIBUTION IN DISTRIBUTED SHARED CACHE | 06-16-2016 |
20160179669 | Method, Apparatus And Computer Programs Providing Cluster-Wide Page Management | 06-23-2016 |
20160179686 | MEMORY MANAGEMENT METHOD FOR SUPPORTING SHARED VIRTUAL MEMORIES WITH HYBRID PAGE TABLE UTILIZATION AND RELATED MACHINE READABLE MEDIUM | 06-23-2016 |
20160378541 | ADDRESS PROBING FOR TRANSACTION - Embodiments relate to address probing for a transaction. An aspect includes determining, before starting execution of a transaction, a plurality of addresses that will be used by the transaction during execution. Another aspect includes probing each address of the plurality of addresses to determine whether any of the plurality of addresses has an address conflict. Yet another aspect includes, based on determining that none of the plurality of addresses has an address conflict, starting execution of the transaction. | 12-29-2016 |
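The probe-then-execute pattern of the entry above maps naturally onto hardware transactional memory intrinsics: touch every address the transaction will need, then start the transaction only after the probes complete (a plain load also warms the cache, reducing conflict windows). Below is a sketch using Intel RTM intrinsics (requires compiling with -mrtm); realizing the probe as an ordinary load is an assumption, not the patent's stated mechanism.

```c
#include <stddef.h>
#include <immintrin.h>   /* _xbegin, _xend, _XBEGIN_STARTED */

/* Probe each address the transaction will use, then run the transaction
 * body transactionally. Returns 0 on commit, -1 on abort (caller retries). */
static int run_with_probing(void *const *addrs, size_t n, void (*body)(void)) {
    for (size_t i = 0; i < n; i++)
        (void)*(volatile char *)addrs[i];   /* probe/prefetch each line */

    unsigned status = _xbegin();
    if (status == _XBEGIN_STARTED) {
        body();                              /* transactional work */
        _xend();
        return 0;                            /* committed */
    }
    return -1;                               /* aborted */
}
```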
20170235753 | MASTER/SLAVE COMPRESSION ENGINE | 08-17-2017 |
20190146916 | CONCURRENT MODIFICATION OF SHARED CACHE LINE BY MULTIPLE PROCESSORS | 05-16-2019 |
20190146921 | IDENTIFICATION OF A COMPUTING DEVICE ACCESSING A SHARED MEMORY | 05-16-2019 |