Patent application number | Description | Published |
20100110083 | Metaprocessor for GPU Control and Synchronization in a Multiprocessor Environment - Included are embodiments of systems and methods for processing metacommands. In at least one exemplary embodiment a Graphics Processing Unit (GPU) includes a metaprocessor configured to process at least one context register, the metaprocessor including context management logic and a metaprocessor control register block coupled to the metaprocessor, the metaprocessor control register block configured to receive metaprocessor configuration data, the metaprocessor control register block further configured to define metacommand execution logic block behavior. Some embodiments include a Bus Interface Unit (BIU) configured to provide the access from a system processor to the metaprocessor and a GPU command stream processor configured to fetch a current context command stream and send commands for execution to a GPU pipeline and metaprocessor. | 05-06-2010 |
20100110089 | Multiple GPU Context Synchronization Using Barrier Type Primitives - Included are systems and methods for Graphics Processing Unit (GPU) synchronization. At least one embodiment of a system includes at least one producer GPU configured to receive data related to at least one context, the at least one producer GPU further configured to process at least a portion of the received data. Some embodiments include at least one consumer GPU configured to receive data from the producer GPU, the consumer GPU further configured to stall execution of the received data until a fence value is received. | 05-06-2010 |
20100115249 | Support of a Plurality of Graphic Processing Units - Included are systems and methods for supporting a plurality of Graphics Processing Units (GPUs). At least one embodiment of a system includes a context status register configured to send data related to a status of at least one context and a context switch configuration register configured to send instructions related to at least one event for the at least one context. At least one embodiment of a system includes a context status management component coupled to the context status register and the context switch configuration register. | 05-06-2010 |
20110208946 | Dual Mode Floating Point Multiply Accumulate Unit - Disclosed are various embodiments of a stream processing unit for single instruction multiple data (SIMD) processing, wherein the stream processing unit executes a stage of a Multiply-Accumulate calculation. In one embodiment, the stream processing unit comprises a plurality of scalar arithmetic logic units (ALUs) configured to receive data having a plurality of data types. The number and type of scalar ALUs corresponds to an SIMD factor. In one embodiment, the scalar ALUs are executed sequentially with a delay being introduced in between execution of each of the scalar ALUs, wherein the delay corresponds to the SIMD factor. | 08-25-2011 |
Patent application number | Description | Published |
20080282034 | Memory Subsystem having a Multipurpose Cache for a Stream Graphics Multiprocessor - A method and a computing system are provided. The computing system may include a system memory configured to store data in a first data format. The computing system may also include a computational core comprising a plurality of execution units (EU). The computational core may be configured to request data from the system memory and to process data in a second data format. Each of the plurality of EU may include an execution control and datapath and a specialized L1 cache pool. The computing system may include a multipurpose L2 cache in communication with each of the plurality of EU and the system memory. The multipurpose L2 cache may be configured to store data in the first data format and the second data format. The computing system may also include an orthogonal data converter in communication with at least one of the plurality of EU and the system memory. | 11-13-2008 |
20090189896 | Graphics Processor having Unified Shader Unit - Graphics processing units (GPUs) are used, for example, to process data related to three-dimensional objects or scenes and to render the three-dimensional data onto a two-dimensional display screen. One embodiment, among others, of a GPU is disclosed herein, wherein the GPU includes a control device configured to receive vertex, geometry and pixel data. The GPU further includes a plurality of execution units connected in parallel, each execution unit configured to perform a plurality of graphics shading functions on the vertex, geometry and pixel data. The control device is further configured to allocate a portion of the vertex, geometry and pixel data to each execution unit in a manner to substantially balance the load among the execution units. | 07-30-2009 |
20090189909 | Graphics Processor having Unified Cache System - Graphics processing units (GPUs) are used, for example, to process data related to three-dimensional objects or scenes and to render the three-dimensional data onto a two-dimensional display screen. One embodiment, among others, of a unified cache system used in a GPU comprises a data storage device and a storage device controller. The data storage device is configured to store graphics data processed by or to be processed by one or more shader units. The storage device controller is placed in communication with the data storage device. The storage device controller is configured to dynamically control a storage allocation of the graphics data within the data storage device. | 07-30-2009 |
20120092353 | Systems and Methods for Video Processing - A multi-shader system in a programmable graphics processing unit (GPU) for processing video data, includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers. | 04-19-2012 |