Derek E. Williams, Austin US

Derek E. Williams, Austin, TX US

Patent application number	Description	Published
20080201533	REDUCING NUMBER OF REJECTED SNOOP REQUESTS BY EXTENDING TIME TO RESPOND TO SNOOP REQUEST - A cache, system and method for reducing the number of rejected snoop requests. An incoming snoop request is entered in the first available latch in a pipeline of latches in a stall/reorder unit if the stall/reorder unit is not full. The entered snoop request is dispatched to a selector upon entering a bottom latch in the pipeline. The stall/reorder unit is not informed as to whether the dispatched snoop request is accepted by an arbitration mechanism for several clock cycles after the dispatch occurred. A copy of the dispatched snoop request is stored in a top latch in an overrun pipeline of latches in the first unit upon dispatching the snoop request. By maintaining information about the snoop request, the snoop request may be dispatched again to the selector in case the dispatched snoop request was rejected thereby increasing the chance that the snoop request will ultimately be accepted.	08-21-2008
20080201534	REDUCING NUMBER OF REJECTED SNOOP REQUESTS BY EXTENDING TIME TO RESPOND TO SNOOP REQUEST - A cache, system and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. The snoop request is entered in the first available latch of the stall/reorder unit unless the stall/reorder unit is full in which case the new snoop request is transmitted to a second unit configured to transmit a request to retry resending the new snoop request. Snoop requests have a higher priority than requests from processors and snoop requests are selected by the arbitration mechanism over processor requests unless the arbitration mechanism requests otherwise (“stall request”) to the stall/reorder unit. By snoop requests having a higher priority than processor requests, the number of snoop requests rejected is reduced. By having the arbitration mechanism issue a stall request, the processor will not be starved.	08-21-2008
20080215824	CACHE MEMORY, PROCESSING UNIT, DATA PROCESSING SYSTEM AND METHOD FOR FILTERING SNOOPED OPERATIONS - A cache coherent data processing system includes at least a first cache memory supporting a first processing unit and a second cache memory supporting a second processing unit. The first cache memory includes a cache array and a cache directory of contents of the cache array. In response to the first cache memory detecting on an interconnect a broadcast operation that specifies a request address, the first cache memory determines from the operation a type of the operation and a coherency state associated with the request address. In response to determining the type and the coherency state, the first cache memory filters out the broadcast operation without accessing the cache directory.	09-04-2008
20080216044	METHOD, SYSTEM AND PROGRAM PRODUCT FOR SPECIFYING A CONFIGURATION FOR A DIGITAL SYSTEM UTILIZING DIAL BIASING WEIGHTS - In a method of data processing, a database defines a Dial entity and at least one instance of the Dial entity. Each instance of the Dial entity has an input having a plurality of different possible input values and one or more outputs, and each of the plurality of different possible input values has a different associated output value set for the one or more outputs. Each instance of the Dial entity determines a value of at least one of a plurality of configuration latches in a digital system separate from the database. The database also associates with the Dial entity at least one set of biasing weights that, when applied, determines a probability of each instance of the Dial entity having particular ones of the plurality of different possible input values. In response to a call to set the plurality of configuration latches, the database is accessed to apply the at least one set of biasing weights to select one of the plurality of different possible input values for the at least one instance of the Dial entity. The plurality of configuration latches in the digital system are set based upon the output value set for the one or more outputs of the at least one instance of the Dial entity.	09-04-2008
20080229193	PROGRAM PRODUCT SUPPORTING SPECIFICATION OF SIGNALS FOR SIMULATION RESULT VIEWING - According to a method of data processing, a data set including at least one entry specifying a signal group by a predetermined signal group name is received by a data processing system. In response to receipt of the data set, the entry in the data set is processed to identify the signal group name. Signal group information associated with an event trace file containing simulation results is accessed to determine signal names of multiple signals that are members of the signal group. Simulation results from the event trace file that are associated with instances of said multiple signals are then included within a presentation of simulation results.	09-18-2008
20080281571	METHOD, SYSTEM AND PROGRAM PRODUCT SUPPORTING SEQUENTIAL LOGIC IN SIMULATION INSTRUMENTATION OF AN ELECTRONIC SYSTEM - According to a method of simulation processing, a collection of files including one or more HDL source files describing design entities collectively representing a digital design to be simulated is received. The HDL source file(s) include a statement specifying inclusion of an instrumentation entity not forming a portion of the digital design but enabling observation of its operation during simulation. The instrumentation entity includes sequential logic containing at least one storage element, where the instrumentation entity has an output signal indicative of occurrence of a simulation event. The collection of files is processed to obtain an instrumented simulation executable model. The processing includes instantiating at least one instance of each of the plurality of design entities and instantiating the instrumentation entity. The processing further includes instantiating external instrumentation logic, logically coupled to each instance of the instrumentation entity, to record occurrences of the event.	11-13-2008
20080294413	PROGRAM PRODUCT SUPPORTING PHASE EVENTS IN A SIMULATION MODEL OF A DIGITAL SYSTEM - According to a method of simulation processing, an instrumented simulation executable model of a design is built by compiling one or more hardware description language (HDL) files specifying one or more design entities within the design and one or more instrumentation entities and instantiating instances of the one or more instrumentation entities within instances of the one or more design entities. Operation of the design is then simulated utilizing the instrumented simulation executable model. Simulating operation includes each of multiple instantiations of the one or more instrumentation entities generating a respective external phase signal representing an occurrence of a particular phase of operation and instrumentation combining logic generating from external phase signals of the multiple instantiations of the one or more instrumentation entities an aggregate phase signal representing an occurrence of the particular phase.	11-27-2008
20080301377	DATA PROCESSING SYSTEM, CACHE SYSTEM AND METHOD FOR UPDATING AN INVALID COHERENCY STATE IN RESPONSE TO SNOOPING AN OPERATION - A cache coherent data processing system includes at least first and second coherency domains. In a first cache memory within the first coherency domain of the data processing system, a coherency state field associated with a storage location and an address tag is set to a first data-invalid coherency state that indicates that the address tag is valid and that the storage location does not contain valid data. In response to snooping an exclusive access operation, the exclusive access request specifying a target address matching the address tag and indicating a relative domain location of a requestor that initiated the exclusive access operation, the first cache memory updates the coherency state field from the first data-invalid coherency state to a second data-invalid coherency state that indicates that the address tag is valid, that the storage location does not contain valid data, and whether a target memory block associated with the address tag is cached within the first coherency domain upon successful completion of the exclusive access operation based upon the relative location of the requestor.	12-04-2008
20090006766	DATA PROCESSING SYSTEM AND METHOD FOR PREDICTIVELY SELECTING A SCOPE OF BROADCAST OF AN OPERATION UTILIZING A HISTORY-BASED PREDICTION - According to a method of data processing, a predictor is maintained that indicates a historical scope of broadcast for one or more previous operations transmitted on an interconnect of a data processing system. A scope of broadcast of a subsequent operation is predictively selected by reference to the predictor.	01-01-2009
20090112552	Method, System and Program Product for Reporting Temporal Information Regarding Count Events of a Simulation - At a simulation client, a design is simulated utilizing a hardware description language (HDL) simulation model by stimulating the HDL simulation model with a testcase. The HDL simulation model includes instrumentation not forming a portion of the design that includes a plurality of count event counters that count occurrences of count events in the design during stimulation by the testcase. At multiple intervals during stimulation of the HDL simulation model by the testcase, the simulation client records count values of the plurality of count event counters. The simulation client determines, for each of the multiple intervals, a temporal statistic regarding the count values of the plurality of count event counters and outputs a report containing temporal statistics for the multiple intervals.	04-30-2009
20090112561	Method, System and Program Product for Defining and Recording Threshold-Qualified Count Events of a Simulation By Testcases - A design is simulated utilizing a hardware description language (HDL) simulation model by stimulating the HDL simulation model with a testcase. The HDL simulation model includes instrumentation not forming a portion of the design that includes a count event counter for a count event in the design, and the simulation includes counting occurrences of the count event in the count event counter to obtain a count event value. A threshold is also established for an aggregate count event value for the count event counter. After completion of the testcase, a determination is made whether addition of the count event value to the aggregate count event value for the count event counter would cause the aggregate count event value to exceed the threshold. If not, the count event value is recorded in a testcase data storage area, and the count event value is accumulated in the aggregate count event value. If so, the count event value is discarded without recording the count event value in the testcase data storage area.	04-30-2009
20090198865	DATA PROCESSING SYSTEM, PROCESSOR AND METHOD THAT PERFORM A PARTIAL CACHE LINE STORAGE-MODIFYING OPERATION BASED UPON A HINT - In at least one embodiment, a method of data processing in a data processing system having a memory hierarchy includes a processor core executing a storage-modifying memory access instruction to determine a memory address. The processor core transmits to a cache memory within the memory hierarchy a storage-modifying memory access request including the memory address, an indication of a memory access type, and, if present, a partial cache line hint signaling access to less than all granules of a target cache line of data associated with the memory address. In response to the storage-modifying memory access request, the cache memory performs a storage-modifying access to all granules of the target cache line of data if the partial cache line hint is not present and performs a storage-modifying access to less than all granules of the target cache line of data if the partial cache line hint is present.	08-06-2009
20100153083	SELECTIVE COMPILATION OF A SIMULATION MODEL IN VIEW OF UNAVAILABLE HIGHER LEVEL SIGNALS - In response to receiving HDL file(s) that specify a plurality of hierarchically arranged design entities defining a design to be simulated and that specify an instrumentation entity for monitoring simulated operation of the design, an instrumented simulation executable model of the design is built. Building the model includes compiling the HDL file(s) specifying the plurality of hierarchically arranged design entities defining the design and instantiating at least one instance of each of the plurality of hierarchically arranged design entities, and further includes instantiating an instance of the instrumentation entity within an instance of a particular design entity among the plurality of design entities and, based upon a reference in an instrumentation statement in the one or more HDL files, logically attaching an input of the instance of the instrumentation entity to an input source within the design that is outside the scope of the particular design entity.	06-17-2010
20100153647	Cache-To-Cache Cast-In - A data processing system includes a first processing unit and a second processing unit coupled by an interconnect fabric. The first processing unit has a first processor core and associated first upper and first lower level caches, and the second processing unit has a second processor core and associated second upper and lower level caches. In response to a data request, a victim cache line is selected for castout from the first lower level cache. The first processing unit issues on the interconnect fabric a lateral castout (LCO) command that identifies the victim cache line to be castout from the first lower level cache and indicates that a lower level cache is an intended destination. In response to a coherence response indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache.	06-17-2010
20100153898	MODEL BUILD IN THE PRESENCE OF A NON-BINDING REFERENCE - One or more hardware description language (HDL) files describe a plurality of hierarchically arranged design entities defining a digital design to be simulated and a plurality of configuration entities not belonging to the digital design that logically control settings of a plurality of configuration latches in the digital design. The HDL file(s) are compiled to obtain a simulation executable model of the digital design and an associated configuration database. The compiling includes parsing a configuration statement that specifies an association between an instance of a configuration entity and a specified configuration latch, determining whether or not the specified configuration latch is described in the HDL file(s), and if not, creating an indication in the configuration database that the instance of the configuration latch had a specified association to a configuration latch to which it failed to bind.	06-17-2010
20100235576	Handling Castout Cache Lines In A Victim Cache - A victim cache memory includes a cache array, a cache directory of contents of the cache array, and a cache controller that controls operation of the victim cache memory. The cache controller, responsive to receiving a castout command identifying a victim cache line castout from another cache memory, causes the victim cache line to be held in the cache array. If the other cache memory is a higher level cache in the cache hierarchy of the processor core, the cache controller marks the victim cache line in the cache directory so that it is less likely to be evicted by a replacement policy of the victim cache, and otherwise, marks the victim cache line in the cache directory so that it is more likely to be evicted by the replacement policy of the victim cache.	09-16-2010
20100235577	VICTIM CACHE LATERAL CASTOUT TARGETING - A data processing system includes a plurality of processing units coupled by an interconnect fabric. In response to a data request, a victim cache line is selected for castout from a first lower level cache of a first processing unit, and a target lower level cache of one of the plurality of processing units is selected based upon architectural proximity of the target lower level cache to a home system memory to which the address of the victim cache line is assigned. The first processing unit issues on the interconnect fabric a lateral castout (LCO) command that identifies the victim cache line to be castout from the first lower level cache and indicates that the target lower level cache is an intended destination. In response to a coherence response indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache.	09-16-2010
20100235584	Lateral Castout (LCO) Of Victim Cache Line In Data-Invalid State - A victim cache line having a data-invalid coherence state is selected for castout from a first lower level cache of a first processing unit. The first processing unit issues on an interconnect fabric a lateral castout (LCO) command identifying the victim cache line to be castout from the first lower level cache, indicating the data-invalid coherence state, and indicating that a lower level cache is an intended destination of the victim cache line. In response to a coherence response to the LCO command indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in a second lower level cache of a second processing unit in the data-invalid coherence state.	09-16-2010
20100262786	Barriers Processing in a Multiprocessor System Having a Weakly Ordered Storage Architecture Without Broadcast of a Synchronizing Operation - A data processing system employing a weakly ordered storage architecture includes first and second sets of processing units coupled to each other and data storage by an interconnect fabric. Each processing unit has a processor core having an associated cache hierarchy including at least a level one, level two and level three cache memories. In response to a request to perform an update to a portion of a first image of memory contained in the level three cache memory of a first processing unit while at last one kill-type command is pending at the first processing unit, the cache hierarchy of the first processing unit permitting the update to be exposed to any first processor core only after the at least one kill-type command is complete.	10-14-2010
20100268884	Updating Partial Cache Lines in a Data Processing System - A processing unit for a data processing system includes a processor core having one or more execution units for processing instructions and a register file for storing data accessed in processing of the instructions. The processing unit also includes a multi-level cache hierarchy coupled to and supporting the processor core. The multi-level cache hierarchy includes at least one upper level of cache memory having a lower access latency and at least one lower level of cache memory having a higher access latency. The lower level of cache memory, responsive to receipt of a memory access request that hits only a partial cache line in the lower level cache memory, sources the partial cache line to the at least one upper level cache memory to service the memory access request. The at least one upper level cache memory services the memory access request without caching the partial cache line.	10-21-2010
20110047352	MEMORY COHERENCE DIRECTORY SUPPORTING REMOTELY SOURCED REQUESTS OF NODAL SCOPE - A data processing system includes at least a first through third processing nodes coupled by an interconnect fabric. The first processing node includes a master, a plurality of snoopers capable of participating in interconnect operations, and a node interface that receives a request of the master and transmits the request of the master to the second processing unit with a nodal scope of transmission limited to the second processing node. The second processing node includes a node interface having a directory. The node interface of the second processing node permits the request to proceed with the nodal scope of transmission if the directory does not indicate that a target memory block of the request is cached other than in the second processing node and prevents the request from succeeding if the directory indicates that the target memory block of the request is cached other than in the second processing node.	02-24-2011
20110153943	Aggregate Data Processing System Having Multiple Overlapping Synthetic Computers - A first SMP computer has first and second processing units and a first system memory pool, a second SMP computer has third and fourth processing units and a second system memory pool, and a third SMP computer has at least fifth and sixth processing units and third, fourth and fifth system memory pools. The fourth system memory pool is inaccessible to the third, fourth and sixth processing units and accessible to at least the second and fifth processing units, and the fifth system memory pool is inaccessible to the first, second and sixth processing units and accessible to at least the fourth and fifth processing units. A first interconnect couples the second processing unit for load-store coherent, ordered access to the fourth system memory pool, and a second interconnect couples the fourth processing unit for load-store coherent, ordered access to the fifth system memory pool.	06-23-2011
20110161587	PROACTIVE PREFETCH THROTTLING - According to a method of data processing, a memory controller receives a plurality of data prefetch requests from multiple processor cores in the data processing system, where the plurality of prefetch load requests include a data prefetch request issued by a particular processor core among the multiple processor cores. In response to receipt of the data prefetch request, the memory controller provides a coherency response indicating an excess number of data prefetch requests. In response to the coherency response, the particular processor core reduces a rate of issuance of data prefetch requests.	06-30-2011
20110161588	FORMATION OF AN EXCLUSIVE OWNERSHIP COHERENCE STATE IN A LOWER LEVEL CACHE - In response to a memory access request of a processor core that targets a target cache line, the lower level cache of a vertical cache hierarchy associated with the processor core supplies a copy of the target cache line to an upper level cache in the vertical cache hierarchy and retains a copy in a shared coherence state. The upper level cache holds the copy of the target cache line in a private shared ownership coherence state indicating that each cached copy of the target memory block is cached within the vertical cache hierarchy associated with the processor core. In response to the upper level cache signaling replacement of the copy of the target cache line in the private shared ownership coherence state, the lower level cache updates its copy of the target cache line to the exclusive ownership coherence state without coherency messaging with other vertical cache hierarchies.	06-30-2011
20110161589	SELECTIVE CACHE-TO-CACHE LATERAL CASTOUTS - A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.	06-30-2011
20110161590	SYNCHRONIZING ACCESS TO DATA IN SHARED MEMORY VIA UPPER LEVEL CACHE QUEUING - A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.	06-30-2011
20120179876	Cache-Based Speculation of Stores Following Synchronizing Operations - A method of processing store requests in a data processing system includes enqueuing a store request in a store queue of a cache memory of the data processing system. The store request identifies a target memory block by a target address and specifies store data. While the store request and a barrier request older than the store request are enqueued in the store queue, a read-claim machine of the cache memory is dispatched to acquire coherence ownership of target memory block of the store request. After coherence ownership of the target memory block is acquired and the barrier request has been retired from the store queue, a cache array of the cache memory is updated with the store data.	07-12-2012
20120198167	SYNCHRONIZING ACCESS TO DATA IN SHARED MEMORY VIA UPPER LEVEL CACHE QUEUING - A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.	08-02-2012
20120203973	SELECTIVE CACHE-TO-CACHE LATERAL CASTOUTS - A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.	08-09-2012
20120203976	MEMORY COHERENCE DIRECTORY SUPPORTING REMOTELY SOURCED REQUESTS OF NODAL SCOPE - A data processing system includes at least a first through third processing nodes coupled by an interconnect fabric. The first processing node includes a master, a plurality of snoopers capable of participating in interconnect operations, and a node interface that receives a request of the master and transmits the request of the master to the second processing unit with a nodal scope of transmission limited to the second processing node. The second processing node includes a node interface having a directory. The node interface of the second processing node permits the request to proceed with the nodal scope of transmission if the directory does not indicate that a target memory block of the request is cached other than in the second processing node and prevents the request from succeeding if the directory indicates that the target memory block of the request is cached other than in the second processing node.	08-09-2012
20120210072	CACHE-BASED SPECULATION OF STORES FOLLOWING SYNCHRONIZING OPERATIONS - A method of processing store requests in a data processing system includes enqueuing a store request in a store queue of a cache memory of the data processing system. The store request identifies a target memory block by a target address and specifies store data. While the store request and a barrier request older than the store request are enqueued in the store queue, a read-claim machine of the cache memory is dispatched to acquire coherence ownership of target memory block of the store request. After coherence ownership of the target memory block is acquired and the barrier request has been retired from the store queue, a cache array of the cache memory is updated with the store data.	08-16-2012
20120265938	PERFORMING A PARTIAL CACHE LINE STORAGE-MODIFYING OPERATION BASED UPON A HINT - Analyzing pre-processed code includes identifying at least one storage-modifying construct specifying a storage-modifying memory access to a memory hierarchy of a data processing system and determining if more than one granule of a cache line of data containing multiple granules that is targeted by the storage-modifying construct is subsequently referenced by said pre-processed code. Post-processed code including a storage-modifying instruction corresponding to the at least one storage-modifying construct in the pre-processed code is generated and stored. Generating the post-processed code includes marking the storage-modifying instruction with a partial cache line hint indicating that said storage-modifying instruction targets less than a full cache line of data within a memory hierarchy if the analyzing indicates only one granule of the target cache line will be accessed while the cache line is held in the cache memory and otherwise refraining from marking the storage-modifying instruction with the partial cache line hint.	10-18-2012
20120296877	FACILITATING DATA COHERENCY USING IN-MEMORY TAG BITS AND TAG TEST INSTRUCTIONS - Fine-grained detection of data modification of original data is provided by associating separate guard bits with granules of memory storing original data from which translated data has been obtained. The guard bits indicating whether the original data stored in the associated granule is protected for data coherency. The guard bits are set and cleared by special-purpose instructions. Responsive to attempting access to translated data obtained from the original data, the guard bit(s) associated with the original data is checked to determine whether the guard bit(s) fail to indicate coherency of the original data, and if so, discarding of the translated data is initiated to facilitate maintaining data coherency between the original data and the translated data.	11-22-2012
20120297109	FACILITATING DATA COHERENCY USING IN-MEMORY TAG BITS AND FAULTING STORES - Fine-grained detection of data modification of original data is provided by associating separate guard bits with granules of memory storing the original data from which translated data has been obtained. The guard bits facilitate indicating whether the original data stored in the associated granule is indicated as protected. The guard bits are set and cleared by special-purpose instructions. Responsive to initiating a data store operation to modify the original data, the associated guard bit(s) are checked to determine whether the original data is indicated as protected. Responsive to the checking indicating that a guard bit is set for the associated original data, the data store operation to modify the original data is faulted and the translated data is discarded, thereby facilitating data coherency between the original data and the translated data.	11-22-2012
20120297146	FACILITATING DATA COHERENCY USING IN-MEMORY TAG BITS AND TAG TEST INSTRUCTIONS - A method is provided for fine-grained detection of data modification of original data by associating separate guard bits with granules of memory storing original data from which translated data has been obtained. The guard bits indicating whether the original data stored in the associated granule is protected for data coherency. The guard bits are set and cleared by special-purpose instructions. Responsive to attempting access to translated data obtained from the original data, the guard bit(s) associated with the original data is checked to determine whether the guard bit(s) fail to indicate coherency of the original data, and if so, discarding of the translated data is initiated to facilitate maintaining data coherency between the original data and the translated data.	11-22-2012
20130013899	Using Hardware Transaction Primitives for Implementing Non-Transactional Escape Actions Inside Transactions - Mechanisms are provided for performing escape actions within transactions. These mechanisms execute a transaction comprising a transactional section and an escape action. The transactional section is comprised of one or more instructions that are to be executed in an atomic manner as part of the transaction. The escape action is comprised of one or more instructions to be executed in a non-transactional manner. These mechanisms further populate at least one actions list data structure, associated with a thread of the data processing system that is executing the transaction, with one or more actions associated with the escape action. Moreover, these mechanisms execute one or more actions in the actions list data structure based upon whether the transaction commits successfully or is aborted.	01-10-2013
20130205087	FORWARD PROGRESS MECHANISM FOR STORES IN THE PRESENCE OF LOAD CONTENTION IN A SYSTEM FAVORING LOADS - A multiprocessor data processing system includes a plurality of cache memories including a cache memory. In response to the cache memory detecting a storage-modifying operation specifying a same target address as that of a first read-type operation being processed by the cache memory, the cache memory provides a retry response to the storage-modifying operation. In response to completion of the read-type operation, the cache memory enters a referee mode. While in the referee mode, the cache memory temporarily dynamically increases priority of any storage-modifying operation targeting the target address in relation to any second read-type operation targeting the target address.	08-08-2013
20130205096	FORWARD PROGRESS MECHANISM FOR STORES IN THE PRESENCE OF LOAD CONTENTION IN A SYSTEM FAVORING LOADS BY STATE ALTERATION - A multiprocessor data processing system includes a plurality of cache memories including a cache memory. The cache memory issues a read-type operation for a target cache line. While waiting for receipt of the target cache line, the cache memory monitors to detect a competing store-type operation for the target cache line. In response to receiving the target cache line, the cache memory installs the target cache line in the cache memory, and sets a coherency state of the target cache line installed in the cache memory based on whether the competing store-type operation is detected.	08-08-2013
20130205098	FORWARD PROGRESS MECHANISM FOR STORES IN THE PRESENCE OF LOAD CONTENTION IN A SYSTEM FAVORING LOADS - A multiprocessor data processing system includes a plurality of cache memories including a cache memory. In response to the cache memory detecting a storage-modifying operation specifying a same target address as that of a first read-type operation being processed by the cache memory, the cache memory provides a retry response to the storage-modifying operation. In response to completion of the read-type operation, the cache memory enters a referee mode. While in the referee mode, the cache memory temporarily dynamically increases priority of any storage-modifying operation targeting the target address in relation to any second read-type operation targeting the target address.	08-08-2013
20130205099	FORWARD PROGRESS MECHANISM FOR STORES IN THE PRESENCE OF LOAD CONTENTION IN A SYSTEM FAVORING LOADS BY STATE ALTERATION - A multiprocessor data processing system includes a plurality of cache memories including a cache memory. The cache memory issues a read-type operation for a target cache line. While waiting for receipt of the target cache line, the cache memory monitors to detect a competing store-type operation for the target cache line. In response to receiving the target cache line, the cache memory installs the target cache line in the cache memory, and sets a coherency state of the target cache line installed in the cache memory based on whether the competing store-type operation is detected.	08-08-2013
20130205120	PROCESSOR PERFORMANCE IMPROVEMENT FOR INSTRUCTION SEQUENCES THAT INCLUDE BARRIER INSTRUCTIONS - A technique for processing an instruction sequence that includes a barrier instruction, a load instruction preceding the barrier instruction, and a subsequent memory access instruction following the barrier instruction includes determining that the load instruction is resolved based upon receipt of an earliest of a good combined response for a read operation corresponding to the load instruction and data for the load instruction. The technique also includes if execution of the subsequent memory access instruction is not initiated prior to completion of the barrier instruction, initiating in response to determining the barrier instruction completed, execution of the subsequent memory access instruction. The technique further includes if execution of the subsequent memory access instruction is initiated prior to completion of the barrier instruction, discontinuing in response to determining the barrier instruction completed, tracking of the subsequent memory access instruction with respect to invalidation.	08-08-2013
20130205121	PROCESSOR PERFORMANCE IMPROVEMENT FOR INSTRUCTION SEQUENCES THAT INCLUDE BARRIER INSTRUCTIONS - A technique for processing an instruction sequence that includes a barrier instruction, a load instruction preceding the barrier instruction, and a subsequent memory access instruction following the barrier instruction includes determining, by a processor core, that the load instruction is resolved based upon receipt by the processor core of an earliest of a good combined response for a read operation corresponding to the load instruction and data for the load instruction. The technique also includes if execution of the subsequent memory access instruction is not initiated prior to completion of the barrier instruction, initiating by the processor core, in response to determining the barrier instruction completed, execution of the subsequent memory access instruction. The technique further includes if execution of the subsequent memory access instruction is initiated prior to completion of the barrier instruction, discontinuing by the processor core, in response to determining the barrier instruction completed, tracking of the subsequent memory access instruction with respect to invalidation.	08-08-2013
20130262769	DATA CACHE BLOCK DEALLOCATE REQUESTS - A data processing system includes a processor core supported by upper and lower level caches. In response to executing a deallocate instruction in the processor core, a deallocation request is sent from the processor core to the lower level cache, the deallocation request specifying a target address associated with a target cache line. In response to receipt of the deallocation request at the lower level cache, a determination is made if the target address hits in the lower level cache. In response to determining that the target address hits in the lower level cache, the target cache line is retained in a data array of the lower level cache and a replacement order field in a directory of the lower level cache is updated such that the target cache line is more likely to be evicted from the lower level cache in response to a subsequent cache miss.	10-03-2013
20130262770	DATA CACHE BLOCK DEALLOCATE REQUESTS IN A MULTI-LEVEL CACHE HIERARCHY - In response to executing a deallocate instruction, a deallocation request specifying a target address of a target cache line is sent from a processor core to a lower level cache. In response, a determination is made if the target address hits in the lower level cache. If so, the target cache line is retained in a data array of the lower level cache, and a replacement order field of the lower level cache is updated such that the target cache line is more likely to be evicted in response to a subsequent cache miss in a congruence class including the target cache line. In response to the subsequent cache miss, the target cache line is cast out to the lower level cache with an indication that the target cache line was a target of a previous deallocation request of the processor core.	10-03-2013
20130262777	DATA CACHE BLOCK DEALLOCATE REQUESTS - A data processing system includes a processor core supported by upper and lower level caches. In response to executing a deallocate instruction in the processor core, a deallocation request is sent from the processor core to the lower level cache, the deallocation request specifying a target address associated with a target cache line. In response to receipt of the deallocation request at the lower level cache, a determination is made if the target address hits in the lower level cache. In response to determining that the target address hits in the lower level cache, the target cache line is retained in a data array of the lower level cache and a replacement order field in a directory of the lower level cache is updated such that the target cache line is more likely to be evicted from the lower level cache in response to a subsequent cache miss.	10-03-2013
20130262778	DATA CACHE BLOCK DEALLOCATE REQUESTS IN A MULTI-LEVEL CACHE HIERARCHY - In response to executing a deallocate instruction, a deallocation request specifying a target address of a target cache line is sent from a processor core to a lower level cache. In response, a determination is made if the target address hits in the lower level cache. If so, the target cache line is retained in a data array of the lower level cache, and a replacement order field of the lower level cache is updated such that the target cache line is more likely to be evicted in response to a subsequent cache miss in a congruence class including the target cache line. In response to the subsequent cache miss, the target cache line is cast out to the lower level cache with an indication that the target cache line was a target of a previous deallocation request of the processor core.	10-03-2013
20140013055	ENSURING CAUSALITY OF TRANSACTIONAL STORAGE ACCESSES INTERACTING WITH NON-TRANSACTIONAL STORAGE ACCESSES - A data processing system implements a weak consistency memory model for a distributed shared memory system. The data processing system concurrently executes, on a plurality of processor cores, one or more transactional memory instructions within a memory transaction and one or more non-transactional memory instructions. The one or more non-transactional memory instructions include a non-transactional store instruction. The data processing system commits the memory transaction to the distributed shared memory system only in response to enforcement of causality of the non-transactional store instruction with respect to the memory transaction.	01-09-2014
20140013060	ENSURING CAUSALITY OF TRANSACTIONAL STORAGE ACCESSES INTERACTING WITH NON-TRANSACTIONAL STORAGE ACCESSES - A data processing system implements a weak consistency memory model for a distributed shared memory system. The data processing system concurrently executes, on a plurality of processor cores, one or more transactional memory instructions within a memory transaction and one or more non-transactional memory instructions. The one or more non-transactional memory instructions include a non-transactional store instruction. The data processing system commits the memory transaction to the distributed shared memory system only in response to enforcement of causality of the non-transactional store instruction with respect to the memory transaction.	01-09-2014
20140040551	REWIND ONLY TRANSACTIONS IN A DATA PROCESSING SYSTEM SUPPORTING TRANSACTIONAL STORAGE ACCESSES - In a multiprocessor data processing system having a distributed shared memory system, a memory transaction that is a rewind-only transaction (ROT) and that includes one or more transactional memory access instructions and a transactional abort instruction is executed. In response to execution of the one or more transactional memory access instructions, one or more memory accesses to the distributed shared memory system indicated by the one or more transactional memory access instructions are performed. In response to execution of the transactional abort instruction, execution results of the one or more transaction memory access instructions are discarded and control is passed to a fail handler.	02-06-2014
20140040557	NESTED REWIND ONLY AND NON REWIND ONLY TRANSACTIONS IN A DATA PROCESSING SYSTEM SUPPORTING TRANSACTIONAL STORAGE ACCESSES - In a multiprocessor data processing system having a distributed shared memory system, first and second nested memory transactions are executed, where the first memory transaction is a rewind-only transaction (ROT) and the second memory transaction is a non-ROT memory transaction. The first memory transaction has a transaction body including the second memory transaction and an additional plurality of transactional memory access instructions. In response to execution of the transactional memory access instructions, memory accesses are performed to the distributed shared memory system. Conflicts between memory accesses not within the first memory transaction and at least a load footprint of any of the transactional memory access instructions preceding the second memory transaction are not tracked. However, conflicts between memory accesses not within the first memory transaction and store and load footprints of any of the transactional memory access instructions that follow initiation the second memory transaction are tracked.	02-06-2014
20140047195	TRANSACTION CHECK INSTRUCTION FOR MEMORY TRANSACTIONS - A processing unit of a data processing system having a shared memory system executes a memory transaction including a transactional store instruction that causes a processing unit of the data processing system to make a conditional update to a target memory block of the shared memory system conditioned on successful commitment of the memory transaction. The memory transaction further includes a transaction check instruction. In response to executing the transaction check instruction, the processing unit determines, prior to conclusion of the memory transaction, whether the target memory block of the shared memory system was modified after the conditional update caused by execution of the transactional store instruction. In response to determining that the target memory block has been modified, a condition register within the processing unit is set to indicate a conflict for the memory transaction.	02-13-2014
20140047196	TRANSACTION CHECK INSTRUCTION FOR MEMORY TRANSACTIONS - A processing unit of a data processing system having a shared memory system executes a memory transaction including a transactional store instruction that causes a processing unit of the data processing system to make a conditional update to a target memory block of the shared memory system conditioned on successful commitment of the memory transaction. The memory transaction further includes a transaction check instruction. In response to executing the transaction check instruction, the processing unit determines, prior to conclusion of the memory transaction, whether the target memory block of the shared memory system was modified after the conditional update caused by execution of the transactional store instruction. In response to determining that the target memory block has been modified, a condition register within the processing unit is set to indicate a conflict for the memory transaction.	02-13-2014
20140047205	INTERACTION OF TRANSACTIONAL STORAGE ACCESSES WITH OTHER ATOMIC SEMANTICS - In a processor, an instruction sequence including, in order, a load-and-reserve instruction specifying a read access to a target memory block, an instruction delimiting transactional memory access instructions belonging to a memory transaction, and a store-conditional instruction specifying a conditional write access to the target memory block is detected. In response to detecting the instruction sequence, the processor causes the conditional write access to the target memory block to fail.	02-13-2014
20150052311	MANAGEMENT OF TRANSACTIONAL MEMORY ACCESS REQUESTS BY A CACHE MEMORY - In a data processing system having a processor core and a shared memory system including a cache memory that supports the processor core, a transactional memory access request is issued by the processor core in response to execution of a memory access instruction in a memory transaction undergoing execution by the processor core. In response to receiving the transactional memory access request, dispatch logic of the cache memory evaluates the transactional memory access request for dispatch, where the evaluation includes determining whether the memory transaction has a failing transaction state. In response to determining the memory transaction has a failing transaction state, the dispatch logic refrains from dispatching the memory access request for service by the cache memory and refrains from updating at least replacement order information of the cache memory in response to the transactional memory access request.	02-19-2015
20150052312	PROTECTING THE FOOTPRINT OF MEMORY TRANSACTIONS FROM VICTIMIZATION - A processing unit includes a processor core and a cache memory. Entries in the cache memory are grouped in multiple congruence classes. The cache memory includes tracking logic that tracks a transaction footprint including cache line(s) accessed by transactional memory access request(s) of a memory transaction. The cache memory, responsive to receiving a memory access request that specifies a target cache line having a target address that maps to a congruence class, forms a working set of ways in the congruence class containing cache line(s) within the transaction footprint and updates a replacement order of the cache lines in the congruence class. Based on membership of the at least one cache line in the working set, the update promotes at least one cache line that is not the target cache line to a replacement order position in which the at least one cache line is less likely to be replaced.	02-19-2015
20150052313	PROTECTING THE FOOTPRINT OF MEMORY TRANSACTIONS FROM VICTIMIZATION - A processing unit includes a processor core and a cache memory. Entries in the cache memory are grouped in multiple congruence classes. The cache memory includes tracking logic that tracks a transaction footprint including cache line(s) accessed by transactional memory access request(s) of a memory transaction. The cache memory, responsive to receiving a memory access request that specifies a target cache line having a target address that maps to a congruence class, forms a working set of ways in the congruence class containing cache line(s) within the transaction footprint and updates a replacement order of the cache lines in the congruence class. Based on membership of the at least one cache line in the working set, the update promotes at least one cache line that is not the target cache line to a replacement order position in which the at least one cache line is less likely to be replaced.	02-19-2015
20150052315	MANAGEMENT OF TRANSACTIONAL MEMORY ACCESS REQUESTS BY A CACHE MEMORY - In a data processing system having a processor core and a shared memory system including a cache memory that supports the processor core, a transactional memory access request is issued by the processor core in response to execution of a memory access instruction in a memory transaction undergoing execution by the processor core. In response to receiving the transactional memory access request, dispatch logic of the cache memory evaluates the transactional memory access request for dispatch, where the evaluation includes determining whether the memory transaction has a failing transaction state. In response to determining the memory transaction has a failing transaction state, the dispatch logic refrains from dispatching the memory access request for service by the cache memory and refrains from updating at least replacement order information of the cache memory in response to the transactional memory access request.	02-19-2015

Patent applications by Derek E. Williams, Austin, TX US

Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees

Derek E. Williams, Austin US

Derek E. Williams, Austin, TX US