Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees


Michael C. Shebanow, Saratoga US

Michael C. Shebanow, Saratoga, CA US

Patent application numberDescriptionPublished
20100095117SECURE AND POSITIVE AUTHENTICATION ACROSS A NETWORK - One embodiment takes the form of a method for authenticating an identity of a first party to a second party, without any prior contact between the parties. Further, the first party may authenticate its identity to the second party while eliminating the ability of the second party to steal the first party's identity. A trusted authority may facilitate authenticating the identity of two or more communicating parties. In one embodiment, the authority may ensure the validity of the identification of a number of parties talking over a communications network. The parties communicating over the secure network trust what the authority states concerning the identities of the other parties in the network. Another embodiment may prevent the authority from monitoring which two parties are communicating to each other through the network.04-15-2010
20110072213INSTRUCTIONS FOR MANAGING A PARALLEL CACHE HIERARCHY - A method for managing a parallel cache hierarchy in a processing unit. The method includes receiving an instruction from a scheduler unit, where the instruction comprises a load instruction or a store instruction; determining that the instruction includes a cache operations modifier that identifies a policy for caching data associated with the instruction at one or more levels of the parallel cache hierarchy; and executing the instruction and caching the data associated with the instruction based on the cache operations modifier.03-24-2011
20110072243Unified Collector Structure for Multi-Bank Register File - One embodiment of the present invention sets forth a technique for collecting operands specified by an instruction. As a sequence of instructions is received the operands specified by the instructions are assigned to ports, so that each one of the operands specified by a single instruction is assigned to a different port. Reading of the operands from a multi-bank register file is scheduled by selecting an operand from each one of the different ports to produce an operand read request and ensuring that two or more of the selected operands are not stored in the same bank of the multi-bank register file. The operands specified by the operand read request are read from the multi-bank register file in a single clock cycle. Each instruction is then executed as the operands specified by the instruction are read from the multi-bank register file and collected over one or more clock cycles.03-24-2011
20110078358DEFERRED COMPLETE VIRTUAL ADDRESS COMPUTATION FOR LOCAL MEMORY SPACE REQUESTS - One embodiment of the present invention sets forth a technique for computing virtual addresses for accessing thread data. Components of the complete virtual address for a thread group are used to determine whether or not a cache line corresponding to the complete virtual address is not allocated in the cache. Actual computation of the complete virtual address is deferred until after determining that a cache line corresponding to the complete virtual address is not allocated in the cache.03-31-2011
20110078427TRAP HANDLER ARCHITECTURE FOR A PARALLEL PROCESSING UNIT - A trap handler architecture is incorporated into a parallel processing subsystem such as a GPU. The trap handler architecture minimizes design complexity and verification efforts for concurrently executing threads by imposing a property that all thread groups associated with a streaming multi-processor are either all executing within their respective code segments or are all executing within the trap handler code segment.03-31-2011
20110078689Address Mapping for a Parallel Thread Processor - A method for thread address mapping in a parallel thread processor. The method includes receiving a thread address associated with a first thread in a thread group; computing an effective address based on a location of the thread address within a local window of a thread address space; computing a thread group address in an address space associated with the thread group based on the effective address and a thread identifier associated with a first thread; and computing a virtual address associated with the first thread based on the thread group address and a thread group identifier, where the virtual address is used to access a location in a memory associated with the thread address to load or store data.03-31-2011
20110078692COALESCING MEMORY BARRIER OPERATIONS ACROSS MULTIPLE PARALLEL THREADS - One embodiment of the present invention sets forth a technique for coalescing memory barrier operations across multiple parallel threads. Memory barrier requests from a given parallel thread processing unit are coalesced to reduce the impact to the rest of the system. Additionally, memory barrier requests may specify a level of a set of threads with respect to which the memory transactions are committed. For example, a first type of memory barrier instruction may commit the memory transactions to a level of a set of cooperating threads that share an L1 (level one) cache. A second type of memory barrier instruction may commit the memory transactions to a level of a set of threads sharing a global memory. Finally, a third type of memory barrier instruction may commit the memory transactions to a system level of all threads sharing all system memories. The latency required to execute the memory barrier instruction varies based on the type of memory barrier instruction.03-31-2011
20110141122DISTRIBUTED STREAM OUTPUT IN A PARALLEL PROCESSING UNIT - A technique for performing stream output operations in a parallel processing system is disclosed. A stream synchronization unit is provided that enables the parallel processing unit to track batches of vertices being processed in a graphics processing pipeline. A plurality of stream output units is also provided, where each stream output unit writes vertex attribute data to one or more stream output buffers for a portion of the batches of vertices. A messaging protocol is implemented between the stream synchronization unit and the plurality of stream output units that ensures that each of the stream output units writes vertex attribute data for the particular batch of vertices distributed to that particular stream output unit in the same order in the stream output buffers as the order in which the batch of vertices was received from a device driver by the parallel processing unit.06-16-2011