| Patent application number | Description | Published |
| 20090037918 | Thread sequencing for multi-threaded processor with instruction cache - Execution of the first thread of a new program is prioritized ahead of older threads for a previously running program. The new program is invoked during the execution of a thread of the previous program. The first thread of the program is prioritized ahead of the remaining threads of the previous program. In an embodiment of the invention, additional threads of the new program are also prioritized ahead of the older threads. A thread's context may include a table of constant values that can be referenced by each program and are shared by multiple threads. Changing the values in a constant table for a new thread is time intensive. To avoid changes to the constant table (and thereby save time), a higher priority status is conferred to the first thread that follows a change to the constant table. | 02-05-2009 |
| 20090276777 | Multiple Programs for Efficient State Transitions on Multi-Threaded Processors - A system and method to optimize processor performance and minimizing average thread latency by selectively loading a cache when a program state, resources required for execution of a program or the program itself change, is described. An embodiment of the invention supports a “cache priming program” that is selectively executed for a first thread/program/sub-routine of each process. Such a program is optimized for situations when instructions and other program data are not yet resident in cache(s), and/or whenever resources required for program execution or the program itself changes. By pre-loading the cache with two resources required for two instructions for only a first thread, average thread latency is reduced because the resources are already present in the cache. Since, such a mechanism is carried out only for one thread in a program cycle, pitfalls of a conventional general pre-fetch scheme that involves parsing of the program in advance to determine which resources and instructions will be needed at a later time, are avoided. | 11-05-2009 |
| 20110055511 | Interlocked Increment Memory Allocation and Access - A method of allocating a memory to a plurality of concurrent threads is presented. The method includes dynamically determining writer threads each having at least one pending write to the memory; and dynamically allocating respective contiguous blocks in the memory for each of the writer threads. Another method of allocating a memory to a plurality of concurrent threads includes launching the plurality of threads as a plurality of wavefronts, dynamically determining a group of wavefronts each having at least one thread requiring a write to the memory, and dynamically allocating respective contiguous blocks in the memory for each wavefront from the group of wavefronts. A corresponding method of assigning a memory to a plurality of reader threads includes determining a first number corresponding to a number of writer threads having a block allocated in said memory, launching a first number of reader threads, entering a first wavefront of said reader threads from said group of wavefronts to an atomic operation, and assigning a first block in the memory to the first wavefront during the corresponding atomic operation, where the first block is contiguous to a previously allocated block dynamically allocated to another wavefront from said group of wavefronts. Corresponding system embodiments and computer program product embodiments are also presented. | 03-03-2011 |