Patent application number | Description | Published |
20120023271 | FULLY ASYNCHRONOUS DIRECT MEMORY ACCESS CONTROLLER AND PROCESSOR WORK - An apparatus generally having a processor and a direct memory access controller is disclosed. The processor may be configured to increment a task counter to indicate that a new one of a plurality of tasks is scheduled. The direct memory access controller may be configured to (i) execute the new task to transfer data between a plurality of memory locations in response to the task counter being incremented and (ii) decrement the task counter in response to the executing of the new task. | 01-26-2012 |
20120042152 | ELIMINATION OF READ-AFTER-WRITE RESOURCE CONFLICTS IN A PIPELINE OF A PROCESSOR - An apparatus having a processor and a circuit is disclosed. The processor generally has a pipeline. The circuit may be configured to (i) detect a first write instruction in the pipeline that writes to a resource, (ii) stall a read instruction in the pipeline where (a) a first read-after-write conflict exists between the first write instruction and the read instruction and (b) no other write instruction to the resource is scheduled between the first write instruction and the read instruction and (iii) not stall the read instruction due to the first read-after-write conflict where a second write instruction to the resource is scheduled between the first write instruction and the read instruction. | 02-16-2012 |
20120059998 | BIT MASK EXTRACT AND PACK FOR BOUNDARY CROSSING DATA - An apparatus having a first circuit and a second circuit is disclosed. The first circuit may be configured to generate a plurality of packed items by extracting-and-packing a plurality of input data words based on a bit mask. The second circuit may be configured to (i) receive the packed items from the first circuit, (ii) sequentially buffer the packed items in a plurality of registers, at least one of the packed items crossing a boundary between a current one of the registers and a next one of the registers, and (iii) write the packed items in the current register to a memory in response to the current register becoming full. | 03-08-2012 |
20120066456 | DIRECT MEMORY ACCESS CACHE PREFETCHING - An apparatus having a first cache and a controller is disclosed. The first cache may be configured to assert a first signal after receiving given information in response to being ready to receive additional information. The controller may be configured to (i) fetch the given information from a memory to the first cache and (ii) prefetch first information in a direct memory access transfer from the memory to the first cache in response to the assertion of the first signal. | 03-15-2012 |
20120096227 | CACHE PREFETCH LEARNING - An apparatus generally having a processor, a cache and a circuit is disclosed. The processor may be configured to generate (i) a plurality of access addresses and (ii) a plurality of program counter values corresponding to the access addresses. The cache may be configured to present in response to the access addresses (i) a plurality of data words and (ii) a plurality of address information corresponding to the data words. The circuit may be configured to record a plurality of events in a file in response to a plurality of cache misses. A first of the events in the file due to a first of the cache misses generally includes (i) a first of the program counter values, (ii) a first of the address information and (iii) a first time to prefetch a first of the data word from a memory to the cache. | 04-19-2012 |
20120110232 | MULTI-DESTINATION DIRECT MEMORY ACCESS TRANSFER - An apparatus generally including an internal memory and a direct memory access controller is disclosed. The direct memory access controller may be configured to (i) read first information from an external memory across an external bus, (ii) generate second information by processing the first information, (iii) write the first information across an internal bus to a first location in the internal memory during a direct memory access transfer and (iv) write the second information across the internal bus to a second location in the internal memory during the direct memory access transfer. The second location may be different from the first location. | 05-03-2012 |
20120151149 | Method and Apparatus for Caching Prefetched Data - A method is provided for performing caching in a processing system including at least one data cache. The method includes the steps of: determining whether each of at least a subset of cache entries stored in the data cache comprises data that has been loaded using fetch ahead (FA); associating an identifier with each cache entry in the subset of cache entries, the identifier indicating whether the cache entry comprises data that has been loaded using FA; and implementing a cache replacement policy for controlling replacement of at least a given cache entry in the data cache with a new cache entry as a function of the identifier associated with the given cache entry. | 06-14-2012 |
20120151150 | Cache Line Fetching and Fetch Ahead Control Using Post Modification Information - A method is provided for performing cache line fetching and/or cache fetch ahead in a processing system including at least one processor core and at least one data cache operatively coupled with the processor. The method includes the steps of: retrieving post modification information from the processor core and a memory address corresponding thereto; and the processing system performing, as a function of the post modification information and the memory address retrieved from the processor core, cache line fetching and/or cache fetch ahead control in the processing system. | 06-14-2012 |
20120324136 | REPRESENTATION OF DATA RELATIVE TO VARYING THRESHOLDS - An apparatus having first and second circuits is disclosed. The first circuit may be disposed on a first side of a bus and configured to store thresholds in a first memory. Each threshold generally represents a respective one of a plurality of regular bit patterns in first data. The first circuit may also be configured to generate second data by representing each respective first data as (i) an index to one of the thresholds and (ii) a difference between the one threshold and the respective first data. A width of the bus may be narrower than the respective first data. The second circuit may be disposed on a second side of the bus and configured to (i) store the thresholds and a plurality of items in a second memory and (ii) reconstruct the first data by adding the respective thresholds to the second data in response to the items. | 12-20-2012 |
20120324195 | ALLOCATION OF PRESET CACHE LINES - An apparatus generally having a cache memory and a circuit is disclosed. The circuit may be configured to (i) parse a single first command received from a processor into a first address and a first value and (ii) allocate a first one of a plurality of lines in the cache memory to a buffer in response to the first command. The first line (a) is generally associated with the first address and (b) may have a plurality of first words. The circuit may be further configured to (iii) preset each of the first words in the first line to the first value. | 12-20-2012 |
20130046961 | SPECULATIVE MEMORY WRITE IN A PIPELINED PROCESSOR - An apparatus generally having an interface circuit and a processor. The interface circuit may have a queue and a connection to a memory. The processor may have a pipeline. The processor is generally configured to (i) place an address in the queue in response to processing a first instruction in a first stage of the pipeline, (ii) generate a flag by processing a second instruction in a second stage of the pipeline, the second instruction may be processed in the second stage after the first instruction is processed in the first stage, and (iii) generate a signal based on the flag in a third stage of the pipeline. The third stage may be situated in the pipeline after the second stage. The interface circuit is generally configured to cancel the address from the queue without transferring the address to the memory in response to the signal having a disabled value. | 02-21-2013 |
20130077464 | ORTHOGONAL VARIABLE SPREADING FACTOR CODE SEQUENCE GENERATION - An apparatus generally having a first circuit and a second circuit is disclosed. The first circuit may be configured to generate (i) a plurality of first code bits in response to an index value and (ii) a plurality of first intermediate bits in response to the index value. The first code bits may be generated in parallel with the first intermediate bits. The second circuit may be configured to generate a plurality of second code bits in response to all of (i) the index value, (ii) the first code bits and (iii) the first intermediate bits. A combination of the first code bits and the second code bits generally forms one of a plurality of orthogonal codes. | 03-28-2013 |
20130080741 | HARDWARE CONTROL OF INSTRUCTION OPERANDS IN A PROCESSOR - An apparatus generally having a first circuit, a second circuit and a third circuit is disclosed. The first circuit may have a counter and may be configured to adjust at least one control signal in response to a current value of the counter. The first circuit may be implemented only in hardware. The counter generally counts a number of loops in which a plurality of instructions are executed. The second circuit may be configured to set the counter to an initial value. The third circuit may be configured to execute the instructions using a plurality of data items as a plurality of operands such that at least two of the instructions use different ones of the operands. The data items may be routed to the third circuit in response to the control signal. The apparatus generally forms a processor. | 03-28-2013 |
20130094567 | APPARATUS AND METHODS FOR PERFORMING BLOCK MATCHING ON A VIDEO STREAM - A data processing system for processing a video stream comprises memory array circuitry, memory access circuitry, and video processing circuitry. The memory array circuitry is characterized by a width and a height. The memory access circuitry is operative to cause, through a series of write operations, a series of two-dimensional data representations of different respective regions in a frame of the video stream to be stored in the memory array circuitry. The write operations occur such that only data missing from the memory array circuitry is written to the memory array circuitry during each write operation and such that the data is written modulo at least one of the width and the height of the memory array circuitry. Lastly, the video processing circuitry is operative to perform block matching on the video stream at least in part utilizing the series of two-dimensional data representations stored in the memory array circuitry. | 04-18-2013 |
20130094586 | Direct Memory Access With On-The-Fly Generation of Frame Information For Unrestricted Motion Vectors - A method for performing motion estimation based on at least a first VOP stored in a memory includes the steps of: receiving a request to read a data block indicative of at least a portion of the first VOP for predicting a second VOP that is temporally adjacent to the first VOP; utilizing a DMA module for determining whether the data block is a UMV block; translating a block address for retrieving at least a portion of the data block from the memory as a function of one or more parameters generated by the DMA module; and generating a complete data block as a function of the portion of the data block retrieved from the memory and the one or more parameters generated by the DMA module. | 04-18-2013 |
20130107957 | INTRA-PREDICTION MODE SELECTION WHILE ENCODING A PICTURE | 05-02-2013 |
20130113543 | MULTIPLICATION DYNAMIC RANGE INCREASE BY ON THE FLY DATA SCALING - An apparatus having a first circuit and a second circuit is disclosed. The first circuit may be configured to (i) receive two input signals. Each input signal generally carries a respective data value. Each data value may have a respective sign bit and a respective at least one guard bit. The first circuit may also be configured to (ii) scale each data value independently such that all of the respective guard bits have a same value as the respective sign bit and (iii) generate a product value in an output signal by adjusting an intermediate value based on the scaling of the data values. The second circuit may be configured to generate the intermediate value by multiplying the two data values as scaled. | 05-09-2013 |
20130117532 | INTERLEAVING ADDRESS MODIFICATION - An apparatus having a plurality of memory blocks and a circuit is disclosed. The circuit may be configured to (i) generate a second address by removing one or more first bits of a first address from one or more first locations defined by a first value, (ii) generate a third address by adding an offset value to the second address and (iii) generate a fourth address by inserting a selected one of a plurality of modifiers into the third address. The selected modifier may be inserted into the third address at the first locations. Each modifier is generally associated with a respective one of a plurality of buffers formed in the memory blocks. The circuit may also be configured to access the respective buffer of the fourth address. | 05-09-2013 |
20130159367 | Implementation of Negation in a Multiplication Operation Without Post-Incrementation - A multiplier circuit for generating a product of at least first and second multiplicands includes encoding circuitry comprising a plurality of encoders. Each of the encoders is operative to receive at least a subset of bits of the first multiplicand and to generate a partial product corresponding to the subset of bits of the first multiplicand. The encoding circuitry is further operative to incorporate a negation of the product as a function of at least a first control signal supplied to the multiplier circuit. The multiplier circuit further includes summation circuitry coupled with the encoding circuitry. The summation circuitry is operative to sum each of the partial products generated by the encoding circuitry to thereby generate the product without performing post-incrementation. | 06-20-2013 |
20130170585 | Biquad Infinite Impulse Response System Transformation - A BIIR system includes a first delay line for receiving at least one input data sample and generating delayed input samples as a function of the input data sample. The BIIR system further includes a second delay line including multiple delay elements connected in series for generating delayed output samples. An input of one of the delay elements receives at least one output data sample of the BIIR system. A summation element in the BIIR system generates the output data sample of the BIIR system as a function of an addition of at least first and second signals and a subtraction of at least a third signal. The third signal includes a first delayed output sample generated by the second delay line multiplied by a first prescribed value. The first delayed output sample and the output data sample are temporally nonadjacent to one another. | 07-04-2013 |
20130208796 | CACHE PREFETCH DURING A HIERARCHICAL MOTION ESTIMATION - An apparatus having a cache and a processor is disclosed. The cache may be configured to (i) buffer a first subset of a reference picture to facilitate a motion estimation of a current block at a first level of a hierarchical motion estimation and (ii) prefetch a second subset of the reference picture to the cache in response to an occurrence of a condition before the motion estimation is completed at the first level. The processor may be configured to calculate a plurality of scores by comparing the current block with the first subset of the reference picture. The second subset generally (i) resides at a second level of the hierarchical motion estimation and (ii) may be determined from the scores calculated prior to the occurrence of the condition. | 08-15-2013 |
20130219131 | LOW ACCESS TIME INDIRECT MEMORY ACCESSES - An apparatus having a memory and a controller is disclosed. The controller may be configured to (i) receive a read request from a processor, the read request comprising a first value and a second value, (ii) where the read request is an indirect memory access, (a) generate a first address in response to the first value, (b) read data stored in the memory at the first address and (c) generate a second address in response to the second value and the data, (iii) where the read request is a direct memory access, generate the second address in response to the second value and (iv) read a requested data stored in the memory at the second address. | 08-22-2013 |
20130262772 | ON-DEMAND ALLOCATION OF CACHE MEMORY FOR USE AS A PRESET BUFFER - A data processing system comprises data processing circuitry, a cache memory, and memory access circuitry. The memory access circuitry is operative to assign a memory address region to be allocated in the cache memory with a predefined initialization value. Subsequently, a portion of the cache memory is allocated to the assigned memory address region only after the data processing circuitry first attempts to perform a memory access on a memory address within the assigned memory address region. The allocated portion of the cache memory is then initialized with the predefined initialization value. | 10-03-2013 |
20130282780 | Method and Apparatus to Perform Floating Point Operations - A method of subtracting floating-point numbers includes determining whether a first sign associated with a first floating-point number is unequal to a second sign associated with a second floating-point number, determining whether a first exponent associated with the first floating-point number is less than a second exponent associated with the second floating-point number, negating a first mantissa associated with the first floating-point number when the first sign is unequal to the second sign and determining that the first exponent is less than the second exponent, and adding the first mantissa to a second mantissa associated with the second floating-point number when the first sign is unequal to the second sign and determining that the first exponent is less than the second exponent. Embodiments of a corresponding computer-readable medium and device are also provided. | 10-24-2013 |
20130286029 | ADJUSTING DIRECT MEMORY ACCESS TRANSFERS USED IN VIDEO DECODING - An apparatus having a first memory and a circuit is disclosed. The first memory may be configured to store a list having a plurality of read requests. The read requests generally (i) correspond to a plurality of blocks of a reference picture and (ii) are used to decode a current picture in a bitstream carrying video. The circuit may be configured to (i) rearrange the read requests in the list based on at least one of (a) a size of a buffer in a second memory and (b) a width of a data bus of the second memory and (ii) copy a portion of the reference picture from the second memory to a third memory using one or more direct memory access transfers in response to the list. | 10-31-2013 |
20130290677 | EFFICIENT EXTRACTION OF EXECUTION SETS FROM FETCH SETS - An apparatus having a buffer and a circuit is disclosed. The buffer may be configured to store a plurality of fetch sets. Each fetch set generally includes a prefix word and a plurality of instruction words. Each prefix word may include a plurality of symbols. Each symbol generally corresponds to a respective one of the instruction words. The circuit may be configured to (i) identify each of the symbols in each of the fetch sets having a predetermined value and (ii) parse the fetch sets into a plurality of execution sets in response to the symbols having the predetermined value. | 10-31-2013 |
20130298129 | CONTROLLING A SEQUENCE OF PARALLEL EXECUTIONS - An apparatus having a first circuit and a plurality of second circuits is disclosed. The first circuit may be configured to dispatch a plurality of sets in a sequence. Each set generally includes a plurality of instructions. The second circuits may be configured to (i) execute the sets during a plurality of execution cycles respectively and (ii) stop the execution in a particular one of the second circuits during one or more of the execution cycles in response to an expiration of a particular counter that corresponds to the particular second circuit. | 11-07-2013 |
20130305017 | COMPILED CONTROL CODE PARALLELIZATION BY HARDWARE TREATMENT OF DATA DEPENDENCY - An apparatus comprising a buffer and a processor. The buffer may be configured to store a plurality of fetch sets. The processor may be configured to perform a change of flow operation based upon at least one of (i) a comparison between addresses of two memory locations involved in each of two memory accessess, (ii) a first predefined prefix code, and (iii) a second predefined prefix code. | 11-14-2013 |
20130318307 | MEMORY MAPPED FETCH-AHEAD CONTROL FOR DATA CACHE ACCESSES - An apparatus including a tag comparison logic and a fetch-ahead generation logic. The tag comparison logic may be configured to present a miss address in response to detecting a cache miss. The fetch-ahead generation logic may be configured to select between a plurality of predefined fetch ahead policies in response to a memory access request and generate one or more fetch addresses based upon the miss address and a selected fetch ahead policy. | 11-28-2013 |