Patent application number | Description | Published |
20080266286 | GENERATION OF A PARTICLE SYSTEM USING A GEOMETRY SHADER - A geometry shader of a graphics processor is configured to generate at least a portion of a particle system. The geometry shader receives vertex data including a reference set of vertices. The geometry shader also receives control data including information on how to create additional vertices for the particle system using the vertex data. The geometry shader processes the vertex data and control data to generate the additional vertices for the particle system. In some embodiments, the control data also includes information on other attributes of the generated vertices. | 10-30-2008 |
20080266287 | DECOMPRESSION OF VERTEX DATA USING A GEOMETRY SHADER - A geometry shader of a graphics processor decompresses a set of vertex data representing a simplified model to create a more detailed representation. The geometry shader receives vertex data including a number of vertices representative of a simplified model. The geometry shader decompresses the vertex data by computing additional vertices to create the more detailed representation. In some embodiments, the geometry shader also receives rules data including information on how the vertex data is to be decompressed. | 10-30-2008 |
20080266296 | UTILIZATION OF SYMMETRICAL PROPERTIES IN RENDERING - The symmetrical properties of a group of vertices are leveraged to reconstruct the group using vertex data for a subset of the vertices and a set of control data. The subset of vertices is symmetrical to one or more other subsets of vertices in the group, and the control data includes information to reconstruct the one or more other subsets using the vertex data for the first set of vertices and symmetrical characteristics of the group. In some embodiments, reconstruction is performed using a geometry shader in a graphics processor to compute the additional vertices. | 10-30-2008 |
20090177864 | MULTIPROCESSOR COMPUTING SYSTEMS WITH HETEROGENEOUS PROCESSORS - Heterogeneous processors can cooperate for distributed processing tasks in a multiprocessor computing system. Each processor is operable in a “compatible” mode, in which all processors within a family accept the same baseline command set and produce identical results upon executing any command in the baseline command set. The processors also have a “native” mode of operation in which the command set and/or results may differ in at least some respects from the baseline command set and results. Heterogeneous processors with a compatible mode defined by reference to the same baseline can be used cooperatively for distributed processing by configuring each processor to operate in the compatible mode. | 07-09-2009 |
20100079454 | Single Pass Tessellation - A system and method for performing tessellation in a single pass through a graphics processor divides the processing resources within the graphics processor into sets for performing different tessellation operations. Vertex data and tessellation parameters are routed directly from one processing resource to another instead of being stored in memory. Therefore, a surface patch description is provided to the graphics processor and tessellation is completed in a single uninterrupted pass through the graphics processor without storing intermediate data in memory. | 04-01-2010 |
20110072211 | Hardware For Parallel Command List Generation - A method for providing state inheritance across command lists in a multi-threaded processing environment. The method includes receiving an application program that includes a plurality of parallel threads; generating a command list for each thread of the plurality of parallel threads; causing a first command list associated with a first thread of the plurality of parallel threads to be executed by a processing unit; and causing a second command list associated with a second thread of the plurality of parallel threads to be executed by the processing unit, where the second command list inherits from the first command list state associated with the processing unit. | 03-24-2011 |
20110072245 | HARDWARE FOR PARALLEL COMMAND LIST GENERATION - A method for providing state inheritance across command lists in a multi-threaded processing environment. The method includes receiving an application program that includes a plurality of parallel threads; generating a command list for each thread of the plurality of parallel threads; causing a first command list associated with a first thread of the plurality of parallel threads to be executed by a processing unit; and causing a second command list associated with a second thread of the plurality of parallel threads to be executed by the processing unit, where the second command list inherits from the first command list state associated with the processing unit. | 03-24-2011 |
20110080405 | HERMITE GREGORY PATCH FOR WATERTIGHT TESSELLATION - One embodiment of the present invention sets forth technique for watertight tessellation in a displaced subdivision surface. A subdivision surface is represented as a novel parametric quad patch that is continuous with respect to position (C0) and partial derivatives (C1) along boundaries as well as interior regions. The novel parametric quad patch is referred to herein as a Hermite Gregory patch and comprises a Hermite patch augmented to include a pair of twist vector parameters per vertex. Each pair of twist vectors is combined into one twist vector during evaluation, according to weights based on proximity to parametric boundaries. Evaluation yields an approximation mesh comprising a position for each vertex and a corresponding normal vector for the vertex. Displacement is performed based on the approximation mesh and a displacement map to generate a displaced approximation mesh that is reflective of the displaced subdivision surface. | 04-07-2011 |
20110081100 | USING A PIXEL OFFSET FOR EVALUATING A PLANE EQUATION - One embodiment of the present invention sets forth a technique controlling the pixel location at which the plane equation is evaluated. Multiple pixel offsets (dx, dy) may be specified that each define to a sub-pixel sample position. Attributes are then calculated for each sub-pixel sample position that is covered by a geometric primitive. One advantage of the technique is that anti-aliasing quality may be improved since high frequency color components may be selectively supersampled for particular geometric primitives. | 04-07-2011 |
20110085736 | METHOD FOR WATERTIGHT EVALUATION OF AN APPROXIMATE CATMULL-CLARK SURFACE - One embodiment of the present invention sets forth technique for watertight evaluation of Gregory patches for Catmull-Clark subdivision surfaces. Each boundary of each patch within a subdivision surface is configured to be owned by one related patch. In general, a given patch may own specific control points for the patch, while certain other control points for the patch may need to be reconstructed because the control points are owned by an adjacent patch. For a given patch, each control point along to a shared boundary is consistently generated using reconstruction data available to the patch. The reconstruction data is generated from values associated with a patch that owns the shared boundary. Because numerically identical data is used to evaluate each patch at each boundary, the boundaries are watertight. One advantage of the present invention is that watertight evaluation may be achieved using similar computational effort versus conventional non-watertight evaluation techniques. | 04-14-2011 |
20110285719 | APPROXIMATION OF STROKED HIGHER-ORDER CURVED SEGMENTS BY QUADRATIC B ZIER CURVE SEGMENTS - One embodiment of the present invention sets forth a technique for subdividing stroked higher-order curved segments into quadratic Bèzier curve segments. Path stroking may be accelerated when a GPU or other processor is configured to perform the subdivision operations. Cubic Bèzier path segments are subdivided into quadratic Bèzier curve segments and other lower-order segments at key features. The quadratic Bèzier curve segments approximate the cubic Bèzier path segments. A variance metric is computed for each quadratic Bèzier curve segment, and when the variance metric indicates that the quadratic Bèzier curve segment deviates by more than a threshold from the corresponding portion of the cubic Bèzier path segment, the quadratic Bèzier curve segment is further subdivided. The path composed of the quadratic Bèzier curve segments is then stroked by rendering hull geometry that encloses the path. | 11-24-2011 |
20110285722 | APPROXIMATION OF STROKED HIGHER-ORDER CURVED SEGMENTS BY QUADRATIC B ZIER CURVE SEGMENTS - One embodiment of the present invention sets forth a technique for subdividing stroked higher-order curved segments into quadratic Bèzier curve segments. Path stroking may be accelerated when a GPU or other processor is configured to perform the subdivision operations. Cubic Bèzier path segments are subdivided into quadratic Bèzier curve segments and other lower-order segments at key features. The quadratic Bèzier curve segments approximate the cubic Bèzier path segments. A variance metric is computed for each quadratic Bèzier curve segment, and when the variance metric indicates that the quadratic Bèzier curve segment deviates by more than a threshold from the corresponding portion of the cubic Bèzier path segment, the quadratic Bèzier curve segment is further subdivided. The path composed of the quadratic Bèzier curve segments is then stroked by rendering hull geometry that encloses the path. | 11-24-2011 |
20120089792 | EFFICIENT IMPLEMENTATION OF ARRAYS OF STRUCTURES ON SIMT AND SIMD ARCHITECTURES - One embodiment of the present invention sets forth a technique providing an optimized way to allocate and access memory across a plurality of thread/data lanes. Specifically, the device driver receives an instruction targeted to a memory set up as an array of structures of arrays. The device driver computes an address within the memory using information about the number of thread/data lanes and parameters from the instruction itself. The result is a memory allocation and access approach where the device driver properly computes the target address in the memory. Advantageously, processing efficiency is improved where memory in a parallel processing subsystem is internally stored and accessed as an array of structures of arrays, proportional to the SIMT/SIMD group width (the number of threads or lanes per execution group). | 04-12-2012 |
20140118351 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR INPUTTING MODIFIED COVERAGE DATA INTO A PIXEL SHADER - A system, method, and computer program product are provided for inputting modified coverage data into a pixel shader. In use, coverage data modified by a depth/stencil test is input into a pixel shader. Additionally, one or more actions are performed at the pixel shader, utilizing the modified coverage data. | 05-01-2014 |
20140176588 | TECHNIQUE FOR STORING SHARED VERTICES - A graphics processing unit includes a set of geometry processing units each configured to process graphics primitives in parallel with one another. A given geometry processing unit generates one or more graphics primitives or geometry objects and buffers the associated vertex data locally. The geometry processing unit also buffers different sets of indices to those vertices, where each such set represents a different graphics primitive or geometry object. The geometry processing units may then stream the buffered vertices and indices to global buffers in parallel with one another. A stream output synchronization unit coordinates the parallel streaming of vertices and indices by providing each geometry processing unit with a different base address within a global vertex buffer where vertices may be written. The stream output synchronization unit also provides each geometry processing unit with a different base address within a global index buffer where indices may be written. | 06-26-2014 |
20140176589 | TECHNIQUE FOR STORING SHARED VERTICES - A graphics processing unit includes a set of geometry processing units each configured to process graphics primitives in parallel with one another. A given geometry processing unit generates one or more graphics primitives or geometry objects and buffers the associated vertex data locally. The geometry processing unit also buffers different sets of indices to those vertices, where each such set represents a different graphics primitive or geometry object. The geometry processing units may then stream the buffered vertices and indices to global buffers in parallel with one another. A stream output synchronization unit coordinates the parallel streaming of vertices and indices by providing each geometry processing unit with a different base address within a global vertex buffer where vertices may be written. The stream output synchronization unit also provides each geometry processing unit with a different base address within a global index buffer where indices may be written. | 06-26-2014 |
20140253555 | MULTIRESOLUTION CONSISTENT RASTERIZATION - A technique for multiresolution consistent rasterization in which a setup unit calculates universal edge equations for a universal resolution. A rasterizer evaluates coverage data for two different resolutions based on the edge equations. The rasterizer evaluates coverage data for different effective pixel sizes—a large pixel size and a small pixel size. Optionally, the rasterizer may determine a first set of coverage data by performing conservative rasterization to determine coverage data for large pixels. Optionally, the rasterizer may then determine a second set of coverage data by performing standard rasterization for small pixels. Optionally, for the second set of coverage data, the rasterizer may evaluate only the small pixels that are within large pixels in the first set of coverage data that evaluate as covered. | 09-11-2014 |
20140267224 | HANDLING POST-Z COVERAGE DATA IN RASTER OPERATIONS - Techniques are disclosed for storing post-z coverage data in a render target. A color raster operations (CROP) unit receives a coverage mask associated with a portion of a graphics primitive, where the graphics primitive intersects a pixel that includes a multiple samples, and the portion covers at least one sample. The CROP unit stores the coverage mask in a data field in the render target at a location associated with the pixel. One advantage of the disclosed techniques is that the GPU computes color and other pixel information only for visible fragments as determined by post-z coverage data. The GPU does not compute color and other pixel information for obscured fragments, thereby reducing overall power consumption and improving overall render performance. | 09-18-2014 |
20140267232 | CONSISTENT VERTEX SNAPPING FOR VARIABLE RESOLUTION RENDERING - A system, method, and computer program product are provided for adjusting vertex positions. One or more viewport dimensions are received and a snap spacing is determined based on the one or more viewport dimensions. The vertex positions are adjusted to a grid according to the snap spacing. The precision of the vertex adjustment may increase as at least one dimension of the viewport decreases. The precision of the vertex adjustment may decrease as at least one dimension of the viewport increases. | 09-18-2014 |
20140267238 | CONSERVATIVE RASTERIZATION OF PRIMITIVES USING AN ERROR TERM - A system, method, and computer program product are provided for conservative rasterization of primitives using an error term. In use, an edge equation is determined for each edge of a primitive, the edge equation having coefficients defining the edge of the primitive. Each edge of the primitive is shifted to enlarge the primitive by modifying coefficients of the edge equation defining the edge by an error term that is a predetermined amount. Pixels that intersect the primitive are then determined using the enlarged primitive. | 09-18-2014 |
20140267260 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR EXECUTING PROCESSES INVOLVING AT LEAST ONE PRIMITIVE IN A GRAPHICS PROCESSOR, UTILIZING A DATA STRUCTURE - A system, method, and computer program product are provided for executing processes involving at least one primitive in a graphics processor, utilizing a data structure. In operation, a data structure is associated with at least one primitive. Additionally, a plurality of processes involving the at least one primitive are executed in a graphics processor, utilizing the data structure. Moreover, the plurality of processes include at least one of selecting at least one surface or portion thereof to which to render, or selecting at least one of a plurality of viewports. | 09-18-2014 |
20140267264 | GENERATING ANTI-ALIASED VOXEL DATA - One embodiment of the present invention sets forth a technique for performing voxelization. The technique involves identifying a voxel that is intersected by a first graphics primitive that has a front side and a back side and selecting a plurality of sample points within the voxel. The technique further involves determining, for each sample point included in the plurality of sample points, whether the sample point is located on the front side of the first graphics primitive or on the back side of the first graphics primitive. Finally, the technique involves storing, for at least a first sample point included in the plurality of sample points, a first result in a voxel mask reflecting whether the first sample point is located on the front side of the first graphics primitive or on the back side of the first graphics primitive. | 09-18-2014 |
20140267265 | GENERATING ANTI-ALIASED VOXEL DATA - One embodiment of the present invention sets forth a technique for performing voxelization. The technique involves determining that a first graphics primitive intersects a voxel and calculating a first set of coefficients associated with a first plane defined by the intersection of the first graphics primitive and the voxel. The technique further involves determining that a second graphics primitive intersects the voxel and calculating a second set of coefficients associated with a second plane defined by the intersection of the second graphics primitive and the voxel. The technique further involves calculating a third set of coefficients associated with a third surface based on the first set of coefficients and the second set of coefficients. The technique further involves calculating at least one of an amount of the voxel that is located on the back side of the third surface and an occlusion value based on the third set of coefficients. | 09-18-2014 |
20140267266 | GENERATING ANTI-ALIASED VOXEL DATA - One embodiment of the present invention sets forth a technique for performing voxelization. The technique involves determining that a voxel is intersected by a first graphics primitive that has a front side and a back side and selecting one or more reference points within the voxel. The technique further involves, for each reference point, determining a distance from the reference point to the first graphics primitive and storing a first scalar value in an array based on the distance. The sign of the first scalar value reflects whether the reference point is located on the front side of the first graphics primitive or on the back side of the first graphics primitive. | 09-18-2014 |
20140267276 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR GENERATING PRIMITIVE SPECIFIC ATTRIBUTES - A system, method, and computer program product are provided for generating primitive-specific attributes. In operation, it is determined whether a portion of a graphics processor is operating in a predetermined mode. If it is determined that the portion of the graphics processor is operating in the predetermined mode, only one or more primitive-specific attributes are generated in association with a primitive. | 09-18-2014 |
20140267315 | MULTI-SAMPLE SURFACE PROCESSING USING ONE SAMPLE - A system, method, and computer program product are provided for multi-sample processing. The multi-sample pixel data is received and an encoding state associated with the multi-sample pixel data is determined. Data for one sample of a multi-sample pixel and the encoding state are provided to a processing unit. The one sample of the multi-sample pixel is processed by the processing unit to generate processed data for the one sample that represents processed multi-sample pixel data for all samples of the multi-sample pixel or two or more samples of the multi-sample pixel. | 09-18-2014 |
20140267356 | MULTI-SAMPLE SURFACE PROCESSING USING SAMPLE SUBSETS - A system, method, and computer program product are provided for multi-sample processing. The multi-sample pixel data is received and is analyzed to identify subsets of samples of a multi-sample pixel that have equal data, such that data for one sample in a subset represents multi-sample pixel data for all samples in the subset. An encoding state is generated that indicates which samples of the multi-sample pixel are included in each one of the subsets. | 09-18-2014 |
20140267366 | TARGET INDEPENDENT RASTERIZATION WITH MULTIPLE COLOR SAMPLES - A graphics processing pipeline within a parallel processing unit (PPU) is configured to perform path rendering by generating a collection of graphics primitives that represent each path to be rendered. The graphics processing pipeline determines the coverage of each primitive at a number of stencil sample locations within each different pixel. Then, the graphics processing pipeline reduces the number of stencil samples down to a smaller number of color samples, for each pixel. The graphics processing pipeline is configured to modulate a given color sample associated with a given pixel based on the color values of any graphics primitives that cover the stencil samples from which the color sample was reduced. The final color of the pixel is determined by downsampling the color samples associated with the pixel. | 09-18-2014 |
20150049104 | RENDERING TO MULTI-RESOLUTION HIERARCHIES - One embodiment of the present invention includes techniques for processing a multi-resolution hierarchy, where an application configures a ROP unit to render all the levels included in the multi-resolution hierarchy to a single composite render target. The ROP unit renders memory pages to the composite render target in pitch order. In contrast, the texture unit accesses the composite render target with memory pages in pitch order for each level of the hierarchy. The application configures the MMU to ensure that the composite render target is correctly interpreted by the texture unit. Notably, the MMU translates ROP unit virtual addresses and texture unit virtual addresses using different mapping strategies to the same physical address space. One advantage of the disclosed embodiments is that rendering to the multi-resolution hierarchy does not require the CPU to execute the state parameter changes that are associated with rendering the different hierarchical levels using prior-art techniques. | 02-19-2015 |
20150054827 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR PASSING ATTRIBUTE STRUCTURES BETWEEN SHADER STAGES IN A GRAPHICS PIPELINE - A system, method, and computer program product are provided for passing attribute structures between shader stages of a processing pipeline. The method includes the steps of receiving data represented at a first level by a processing pipeline including an upstream shader unit, a downstream shader unit, and a processing unit. The upstream shader unit processes the data to generate a first set of attributes corresponding to the data represented at a second level. The upstream shader unit also stores the first set of the attributes in a first portion of a memory system that can be read by the downstream shader unit and any shader units that are downstream in the processing pipeline relative to the upstream shader unit. In one embodiment, the processing unit is coupled between the upstream shader unit and the downstream shader unit. | 02-26-2015 |
20150084952 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR RENDERING A SCREEN-ALIGNED RECTANGLE PRIMITIVE - A system, method, and computer program product are provided for processing a screen-aligned rectangle within a processing pipeline. The method includes the steps of determining coordinates for a screen-aligned rectangle by projecting a specification line onto a screen-space plane, computing a plane equation associated with the specification line, and rasterizing the screen-aligned rectangle that is within the screen-space plane based on the coordinates and the plane equation. The specification line is within a three-dimensional (3D) space. The plane equation is associated with a rendering parameter for the screen-aligned rectangle. The plane equation may be evaluated by a pixel shader in conjunction with processing the screen-aligned rectangle. | 03-26-2015 |
20150084974 | TECHNIQUES FOR INTERLEAVING SURFACES - One embodiment sets forth a method for allocating memory to surfaces. A software application specifies surface data, including interleaving state data. Based on the interleaving state data, a surface access unit bloats addressees derived from discrete coordinates associated with the surface, creating a bloated virtual address space with a predictable pattern of addresses that do not correspond to data. Advantageously, by creating predictable regions of addresses that do not correspond to data, the software application program may configure the surface to share physical memory space with one or more other surfaces. In particular, the software application may map the virtual address space together with one or more virtual address spaces corresponding to complementary data patterns to the same physical base address. And, by overlapping the virtual address spaces onto the same pages in physical address space, the physical memory may be more densely packed than by using prior-art allocation techniques. | 03-26-2015 |