Patent application number | Description | Published |
20090096788 | METHOD AND APPARATUS FOR INCREASING EFFICIENCY OF TRANSMISSION AND/OR STORAGE OF RAYS FOR PARALLELIZED RAY INTERSECTION TESTING - For ray tracing, methods, apparatus, and computer readable media provide efficient transmission and/or storage of rays between ray emitters, and an intersection testing resource. Ray emitters, during emission of a plurality of rays, identify a shared attribute of each ray of the plurality, and represent that attribute as shared ray data. The shared ray data, and other ray data sufficient to determine both an origin and a direction for each ray of the plurality, are transmitted. Functionality in the intersection testing resource receives the shared ray data and the other ray data, and interprets the shared ray data and the other ray data to determine an origin and direction for each ray of the plurality, and provides those rays for intersection testing. Rays can be stored in the shared attribute format in the intersection testing resource and data elements representing the rays can be constructed later. Programmable receiving functionality of the intersection testing resource can accommodate many ray types and other situations. | 04-16-2009 |
20090096789 | METHOD, APPARATUS, AND COMPUTER READABLE MEDIUM FOR LIGHT ENERGY ACCOUNTING IN RAY TRACING - For ray tracing systems, described methods, media, apparatuses provide for accounting of light energy that will be collected at pixels of a 2-D representation without recursive closure of a tree of ray/primitive intersections, and also provide for adaptivity in ray tracing based on importance indicators of each ray, such as a weight, which may be carried in data structures representative of the rays. Examples of such adaptivity may include determining a number of children to issue for shading an identified intersecting primitive, culling rays, and adding rays to achieve more accurate sampling, if desired. All such adaptivity may be triggered with goal-based indicators, such as a threshold value representative of rendering progress to a time-based goal, such as a frame rate. | 04-16-2009 |
20090128562 | SYSTEMS AND METHODS FOR RENDERING WITH RAY TRACING - For ray tracing scenes composed of primitives, systems and methods accelerate ray/primitive intersection identification by testing rays against elements of geometry acceleration data (GAD) in a parallelized intersection testing resource. Groups of rays can be described as shared attribute information and individual ray data for efficient ray data transfer between a host processor and the testing resource. The host processor also hosts shading and/or management processes controlling the testing resource and adapting the ray tracing, as necessary or desirable, to meet criteria, while reducing degradation of rendering quality. The GAD elements can be arranged in a graph, and rays can be collected into collections based on whether a ray intersects a given element. When a collection is deemed ready for further testing, it is tested for intersection with GAD elements connected, in the graph, to the given element. The graph can be hierarchical such that rays of a given collection are tested against children of the GAD element associated with the given collection. | 05-21-2009 |
20090244058 | APPARATUS AND METHOD FOR RAY TRACING WITH BLOCK FLOATING POINT DATA - Systems and methods include high throughput and/or parallelized ray/geometric shape intersection testing using intersection testing resources accepting and operating with block floating point data. Block floating point data sacrifices precision of scene location in ways that maintain precision where more beneficial, and allow reduced precision where beneficial. In particular, rays, acceleration structures, and primitives can be represented in a variety of block floating point formats, such that storage requirements for storing such data can be reduced. Hardware accelerated intersection testing can be provided with reduced sized math units, with reduced routing requirements. A driver for hardware accelerators can maintain full-precision versions of rays and primitives to allow reduced communication requirements for high throughput intersection testing in loosely coupled systems. Embodiments also can include using BFP formatted data in programmable test cells or more general purpose processing elements. | 10-01-2009 |
20090262132 | ARCHITECTURES FOR PARALLELIZED INTERSECTION TESTING AND SHADING FOR RAY-TRACING RENDERING - Ray tracing scenes is accomplished using a plurality of intersection testing resources coupled with a plurality of shading resources, communicative in the aggregate through links/queues. A queue from testing to shading comprises respective ray/primitive intersection indications, comprising a ray identifier. A queue from shading to testing comprises identifiers of new rays to be tested, wherein data defining the rays is separately stored in memories distributed among the intersection testing resources. Ray definition data can be retained in distributed memories until rays complete intersection testing, and be selected for testing multiple times based on ray identifier. A structure of acceleration shapes can be used. Packets of ray identifiers and shape data can be passed among the intersection testing resources, and each resource can test rays identified in the packet, and for which definition data is present in its memory. Test results for acceleration shapes are used to collect rays against acceleration shapes, and closest detection ray/primitive intersections are indicated by sending ray identifiers to shading resources. | 10-22-2009 |
20090284523 | METHOD, APPARATUS, AND COMPUTER READABLE MEDIUM FOR ACCELERATING INTERSECTION TESTING IN RAY-TRACING RENDERING - For ray tracing scenes composed of primitives, systems and methods accelerate intersection testing by testing rays against elements of geometry acceleration data (GAD) arranged in a graph of nodes, where pairs of nodes are connected by edges, and each element bounds a varying granularity selection of the primitives. Upon detection of intersections between rays and elements, references to the rays are added to respective collections associated with the elements. Further processing of those rays is deferred until rays of a given collection are determined ready, and then rays from such a ready collection are tested for intersection with elements of GAD connected by edges to the element associated with the ready collection. When a primitive is bounded by no higher granularity GAD element, it is tested for intersection, and indications of intersection are output. Some examples encourage production of many secondary rays and collect such rays for parallelized testing, regardless of traversal order, camera ray association, or a time when each ray was spawned. | 11-19-2009 |
20090289939 | SYSTEMS AND METHODS FOR CONCURRENT RAY TRACING - For ray tracing scenes composed of primitives, systems and methods can traverse rays through an acceleration structure. The traversal can be implemented by concurrently testing a plurality of nodes of the acceleration structure for intersection with a sequence of one or more rays. Such testing can occur in a plurality of test cells. Leaf nodes of the acceleration structure can bound primitives, and a sequence primitives can be tested concurrently for intersection in the test cells against a plurality of rays that have intersected a given leaf node. Intersection testing of a particular leaf node can be deferred until a sufficient quantity of rays have been collected for that node. | 11-26-2009 |
20090322752 | RAY TRACING SYSTEM ARCHITECTURES AND METHODS - Aspects comprise systems implementing ray tracing functionality according to example architectures. In one example, rays are collected into collections against elements of an acceleration structure, which in some cases are associated with objects composing a scene being ray traced. Indications of detected ray intersections also can be collected in an output buffer, and in some examples, the output buffer can comprise a plurality of portions, each associated with a scene object, or a common portion of code to be executed during shading. Buffer contents can be accessed in a block read. An intersection shading resource can load data to be used in shading the intersections for the identified rays, and locally storing that data for use in shading those intersections. | 12-31-2009 |
20100073369 | SYSTEMS AND METHODS FOR A RAY TRACING SHADER API - Aspects include API interfaces for interfacing shaders with other components and/or code modules that provide ray tracing functionality. For example, API calls may allow direct contribution of light energy to a buffer for an identified pixel, and allow emission of new rays for intersection testing alone or in bundles. The API also can provide a mechanism for associating arbitrary data with ray definition data defining a ray to be tested through a shader using the emit ray call. The arbitrary data is provided to a shader associated with an object that is identified subsequently as having been intersected by the ray. The data can include code, or a pointer to code, that can be used by or run after the shader. The data also can be propagated through a series of shaders, and associated with rays instantiated in each shader. | 03-25-2010 |
20100073370 | SYSTEMS AND METHODS FOR A RAY TRACING SHADER API - Aspects include API interfaces for interfacing shaders with other components and/or code modules that provide ray tracing functionality. For example, API calls may allow direct contribution of light energy to a buffer for an identified pixel, and allow emission of new rays for intersection testing alone or in bundles. The API also can provide a mechanism for associating arbitrary data with ray definition data defining a ray to be tested through a shader using the emit ray call. The arbitrary data is provided to a shader associated with an object that is identified subsequently as having been intersected by the ray. The data can include code, or a pointer to code, that can be used by or run after the shader. The data also can be propagated through a series of shaders, and associated with rays instantiated in each shader. Recursive shaders can be recompiled as non-recursive shaders interfacing with API semantics according to the description. | 03-25-2010 |
20100097372 | SYNTHETIC ACCELERATION SHAPES FOR USE IN RAY TRACING - A synthetic acceleration shape bound primitives composing a 3-D scene, and is defined using a group of fundamental shapes arranged to bound the primitives, and for which intersection results for group members yield an ultimate intersection testing result for the synthetic shape, using a logical operator. For example, two or more spheres are used to bound an object so that each of the spheres is larger than a minimum necessary to bound the object, and a volume defined by an intersection between the shapes defines a smaller volume in which the object is bounded. A ray is found to potentially intersect the object only if it intersects both spheres. In another example, an element may be defined by a volumetric union of component elements. Indicators can determine how groups of shapes should be interpreted. Synthetic shapes can be treated as a single element in a graph or hierarchical arrangement of acceleration elements. | 04-22-2010 |
20100231589 | RAY TRACING USING RAY-SPECIFIC CLIPPING - Systems, methods, and computer readable media embodying such methods provide for allowing specification of per-ray clipping information that defines a sub-portion of a 3-D scene in which the ray should be traced. The clipping information can be specified as a clip distance from a ray origin, as an end value of a parametric ray definition, or alternatively the clipping information can be built into a definition of the ray to be traced. The clipping information can be used to check whether portions of an acceleration structure need to be traversed, as well as whether primitives should be tested for intersection. Other aspects include specifying a default object that can be returned as intersected when no primitive was intersected within the sub-portion defined for testing. Further aspects include allowing provision of flags interpretable by an intersection testing resource that control what the intersection testing resource does, and/or what information it reports after conclusion of testing of a ray. | 09-16-2010 |
20100328310 | SYSTEMS AND METHODS OF DEFINING RAYS FOR RAY TRACING RENDERING - Some aspects pertain to ray data storage for ray tracing rendering. Attribute data for a first ray can be stored. To define a second ray, data defining such can comprise a reference to the first ray (in one example) and attribute source information indicative of shared attributes between the first and second rays. The attribute source information can be shared among many rays, and can be selected based on ray type. Definition data for unshared attributes can be explicit with the second ray. A plurality of rays can reference one ray for shared attribute data. Referencing rays can be counted and decremented as referencing rays complete. Shared attributes can be indicated with masks. Interface modules can service ray data read and write requests made by shaders, and shaders can explicitly reference attributes of rays, without using such interfacing modules. Data structures can be used as attribute sources without being associated with particular rays, and can be defined and selected as attribute data sources based on ray type. | 12-30-2010 |
20100332523 | SYSTEMS AND METHODS FOR PHOTON MAP QUERYING - In one aspect, photon queries are answered using systems and methods of traversal of collections of photon queries through an acceleration structure, to identify photons meeting a specification of a given query. Such systems and methods can be extended to satisfying similarity queries in an n-dimensional parameter space. Queries can be associated with code (or pointers to code) that are run to achieve closure of that query. Queries can cause further queries to be emitted. Arbitrary data can be passed from one query to another; for example, parameters defined internally to the code modules themselves (e.g., the parameters do not need to have a definition or meaning to the systems or within the methods). | 12-30-2010 |
20110050698 | ARCHITECTURES FOR PARALLELIZED INTERSECTION TESTING AND SHADING FOR RAY-TRACING RENDERING - Ray tracing scenes is accomplished using a plurality of intersection testing resources coupled with a plurality of shading resources, communicative in the aggregate through links/queues. A queue from testing to shading comprises respective ray/primitive intersection indications, comprising a ray identifier. A queue from shading to testing comprises identifiers of new rays to be tested, wherein data defining the rays is separately stored in memories distributed among the intersection testing resources. Ray definition data can be retained in distributed memories until rays complete intersection testing, and be selected for testing multiple times based on ray identifier. A structure of acceleration shapes can be used. Packets of ray identifiers and shape data can be passed among the intersection testing resources, and each resource can test rays identified in the packet, and for which definition data is present in its memory. Test results for acceleration shapes are used to collect rays against acceleration shapes, and closest detection ray/primitive intersections are indicated by sending ray identifiers to shading resources. | 03-03-2011 |
20110181613 | METHOD, APPARATUS, AND COMPUTER READABLE MEDIUM FOR LIGHT ENERGY ACCOUNTING IN RAY TRACING - For ray tracing systems, described methods, media, apparatuses provide for accounting of light energy that will be collected at pixels of a 2-D representation without recursive closure of a tree of ray/primitive intersections, and also provide for adaptivity in ray tracing based on importance indicators of each ray, such as a weight, which may be carried in data structures representative of the rays. Examples of such adaptivity may include determining a number of children to issue for shading an identified intersecting primitive, culling rays, and adding rays to achieve more accurate sampling, if desired. All such adaptivity may be triggered with goal-based indicators, such as a threshold value representative of rendering progress to a time-based goal, such as a frame rate. | 07-28-2011 |
20120001912 | RAY TRACING SYSTEM ARCHITECTURES AND METHODS - Aspects comprise systems implementing 3-D graphics processing functionality in a multiprocessing system. Control flow structures are used in scheduling instances of computation in the multiporcessing system, where different points in the control flow structure serve as points where deferral of some instances of computation can be performed in favor of scheduling other instances of computation. In some examples, the control flow structure identifies particular tasks, such as intersection testing of a particular portion of an acceleration structure, and a particular element of shading code. In some examples, the aspects are used in 3-D graphics processing systems that can perform ray tracing based rendering. | 01-05-2012 |
20120133654 | VARIABLE-SIZED CONCURRENT GROUPING FOR MULTIPROCESSING - Aspects include, for example, a method for interpreting information in a computer program, or profiling such a program to estimate a group size for instances of that program (program module, or portion thereof). Such a method can be used in a system that supports collecting outputs of executing instances, where those outputs can specify new program instances. Scheduling of new instances (or allocation of resources for executing such instances) can be deferred. A trigger to begin scheduling (or allocation) for a collection of instances uses a target group size for that program. Thus, different programs can have different group sizes, which can be set explicitly, or based on profiling. The profiling can occur during one or more of pre-execution and during execution. The group size estimate can be an input into an algorithm that also accounts for system state during execution. | 05-31-2012 |
20120139926 | MEMORY ALLOCATION IN DISTRIBUTED MEMORIES FOR MULTIPROCESSING - In some aspects, finer grained parallelism is achieved by segmenting programmatic workloads into smaller discretized portions, where a first element can be indicative both of a configuration or program to be executed, and a first data set to be used in such execution, while a second element can be indicative of a second data element or group. The discretized portions can cause program execute on distributed processors. Approaches to selecting processors, and allocating local memory associated with those processors are disclosed. In one example, discretized portions that share a program have an anti-affinity to cause dispersion, for initial execution assignment. Flags, such as programmer and compiler generated flags can be used in determining such allocations. Workloads can be grouped according to compatibility of memory usage requirements. | 06-07-2012 |
20120249553 | ARCHITECTURES FOR CONCURRENT GRAPHICS PROCESSING OPERATIONS - Ray tracing, and more generally, graphics operations taking place in a 3-D scene, involve a plurality of constituent graphics operations. Scheduling of graphics operations for concurrent execution on a computer may increase throughput. In aspects herein, constituent graphics operations are scheduled in groups, having members selected according to disclosed aspects. Processing for specific graphics operations in a group can be deferred if all the operations in the group cannot be further tested concurrently. Graphics operations that have been deferred are recombined into two or more different groups and ultimately complete processing, through a required number of iterations of such process. In one application, the performance of the graphics operations perform a search in which respective 1:1 matches between different types of geometric shapes involved in the 3-D scene are identified. For example, closest intersections between rays and scene geometry can be identified by processing scheduled according to disclosed aspects. | 10-04-2012 |
20120324458 | SCHEDULING HETEROGENOUS COMPUTATION ON MULTITHREADED PROCESSORS - Aspects include computation systems that can identify computation instances that are not capable of being reentrant, or are not reentrant capable on a target architecture, or are non-reentrant as a result of having a memory conflict in a particular execution situation. A system can have a plurality of computation units, each with an independently schedulable SIMD vector. Computation instances can be defined by a program module, and a data element(s) that may be stored in a local cache for a particular computation unit. Each local cache does not maintain coherency controls for such data elements. During scheduling, a scheduler can maintain a list of running (or runnable) instances, and attempt to schedule new computation instances by determining whether any new computation instance conflicts with a running instance and responsively defer scheduling. Memory conflict checks can be conditioned on a flag or other indication of the potential for non-reentrancy. | 12-20-2012 |
20130050213 | SYSTEMS AND METHODS FOR RENDERING WITH RAY TRACING - For ray tracing scenes composed of primitives, systems and methods—accelerate ray/primitive intersection identification by testing rays against elements of geometry acceleration data (GAD) in a parallelized intersection testing resource. Groups of rays can be described as shared attribute information and individual ray data for ray data transfer. A host hosts shading and/or management processes can control the testing resource and adapting the ray tracing. The GAD elements can be arranged in a graph, and rays collected into collections based on whether a ray intersects a given element. When a collection is deemed ready for further testing, it is tested for intersection with GAD elements connected, in the graph, to the given element. The graph can be hierarchical such that rays of a given collection are tested against children of the GAD element associated with the given collection. | 02-28-2013 |
20130069960 | MULTISTAGE COLLECTOR FOR OUTPUTS IN MULTIPROCESSOR SYSTEMS - Aspects include a multistage collector to receive outputs from plural processing elements. Processing elements may comprise (each or collectively) a plurality of clusters, with one or more ALUs that may perform SIMD operations on a data vector and produce outputs according to the instruction stream being used to configure the ALU(s). The multistage collector includes substituent components each with at least one input queue, a memory, a packing unit, and an output queue; these components can be sized to process groups of input elements of a given size, and can have multiple input queues and a single output queue. Some components couple to receive outputs from the ALUs and others receive outputs from other components. Ultimately, the multistage collector can output groupings of input elements. Each grouping of elements (e.g., at input queues, or stored in the memories of component) can be formed based on matching of index elements. | 03-21-2013 |
20130113800 | SYSTEMS AND METHODS FOR 3-D SCENE ACCELERATION STRUCTURE CREATION AND UPDATING - Systems and methods for producing an acceleration structure provide for subdividing a 3-D scene into a plurality of volumetric portions, which have different sizes, each being addressable using a multipart address indicating a location and a relative size of each volumetric portion. A stream of primitives is processed by characterizing each according to one or more criteria, selecting a relative size of volumetric portions for use in bounding the primitive, and finding a set of volumetric portions of that relative size which bound the primitive. A primitive ID is stored in each location of a cache associated with each volumetric portion of the set of volumetric portions. A cache location is selected for eviction, responsive to each cache eviction decision made during the processing. An element of an acceleration structure according to the contents of the evicted cache location is generated, responsive to the evicted cache location. | 05-09-2013 |
20130113801 | Profiling Ray Tracing Renderers - A profiler for a ray tracing renderer interfaces with the renderer to collect rendering information, such as ray definition information, a pixel origin, objects hit, shader invocation, and related rays. In an interface, an artist views a simplified 3-D scene model and a rendered 2-D image. A pixel in the 2-D image is selectable; the profiler responds by populating the simplified 3-D scene with rays that contributed to that pixel. Rays can be displayed in the simplified 3-D scene to visually convey information about characteristics of each ray, such as whether the ray intersected an object, portions of the scene where it is occluded, and a direction. Statistics can be produced by the profiler that convey information such as relative computational complexity to render particular pixels. The profiler can step through multiple passes (e.g., multiple frames and passes of a multipass rendering), and the UI can allow pausing and stepping. | 05-09-2013 |
20130147803 | SYSTEMS AND METHODS FOR CONCURRENT RAY TRACING - For ray tracing scenes composed of primitives, systems and methods can traverse rays through an acceleration structure. The traversal can be implemented by concurrently testing a plurality of nodes of the acceleration structure for intersection with a sequence of one or more rays. Such testing can occur in a plurality of test cells. Leaf nodes of the acceleration structure can bound primitives, and a sequence primitives can be tested concurrently for intersection in the test cells against a plurality of rays that have intersected a given leaf node. Intersection testing of a particular leaf node can be deferred until a sufficient quantity of rays have been collected for that node. | 06-13-2013 |
20130222402 | Graphics Processor with Non-Blocking Concurrent Architecture - In some aspects, systems and methods provide for forming groupings of a plurality of independently-specified computation workloads, such as graphics processing workloads, and in a specific example, ray tracing workloads. The workloads include a scheduling key, which is one basis on which the groupings can be formed. Workloads grouped together can all execute from the same source of instructions, one or more different private data elements. Such workloads can recursively instantiate other workloads that reference the same private data elements. In some examples, the scheduling key can be used to identify a data element to be used by all the workloads of a grouping. Memory conflicts to private data elements are handled through scheduling of non-conflicted workloads or specific instructions an deferring conflicted workloads instead of locking memory locations. | 08-29-2013 |
20140071123 | COMPACTING RESULTS VECTORS BETWEEN STAGES OF GRAPHICS PROCESSING - Ray tracing, and more generally, graphics operations taking place in a 3-D scene, involve a plurality of constituent graphics operations. Responsibility for executing these operations can be distributed among different sets of computation units. The sets of computation units each can execute a set of instructions on a parallelized set of input data elements and produce results. These results can be that the data elements can be categorized into different subsets, where each subset requires different processing as a next step. The data elements of these different subsets can be coalesced so that they are contiguous in a results set. The results set can be used to schedule additional computation, and if there are empty locations of a scheduling vector (after accounting for the members of a given subset), then those empty locations can be filled with other data elements that require the same further processing as that subset. | 03-13-2014 |
20140078145 | SYSTEMS AND METHODS FOR PROGRAM INTERFACES IN MULTIPASS RENDERING - Aspects include API interfaces for interfacing shaders with other components and/or code modules that provide ray tracing functionality. For example, API calls may allow direct contribution of light energy to a buffer for an identified pixel, and allow emission of new rays for intersection testing alone or in bundles. The API also can provide a mechanism for associating arbitrary data with ray definition data defining a ray to be tested through a shader using the emit ray call. The arbitrary data is provided to a shader associated with an object that is identified subsequently as having been intersected by the ray. The data can include code, or a pointer to code, that can be used by or run after the shader. The data also can be propagated through a series of shaders, and associated with rays instantiated in each shader. Recursive shaders can be recompiled as non-recursive shaders interfacing with API semantics according to the description. | 03-20-2014 |
20140232720 | RAY TRACING SYSTEM ARCHITECTURES AND METHODS - Aspects comprise systems implementing 3-D graphics processing functionality in a multiprocessing system. Control flow structures are used in scheduling instances of computation in the multiporcessing system, where different points in the control flow structure serve as points where deferral of some instances of computation can be performed in favor of scheduling other instances of computation. In some examples, the control flow structure identifies particular tasks, such as intersection testing of a particular portion of an acceleration structure, and a particular element of shading code. In some examples, the aspects are used in 3-D graphics processing systems that can perform ray tracing based rendering. | 08-21-2014 |