Patent application number | Description | Published |
20140327683 | Graphics Processor with Non-Blocking Concurrent Architecture - In some aspects, systems and methods provide for forming groupings of a plurality of independently-specified computation workloads, such as graphics processing workloads, and in a specific example, ray tracing workloads. The workloads include a scheduling key, which is one basis on which the groupings can be formed. Workloads grouped together can all execute from the same source of instructions, on one or more different private data elements. Such workloads can recursively instantiate other workloads that reference the same private data elements. In some examples, the scheduling key can be used to identify a data element to be used by all the workloads of a grouping. Memory conflicts to private data elements are handled through scheduling of non-conflicted workloads or specific instructions and/or deferring conflicted workloads instead of locking memory locations. | 11-06-2014 |
20150089156 | Atomic Memory Update Unit & Methods - In an aspect, an update unit can evaluate condition(s) in an update request and update one or more memory locations based on the condition evaluation. The update unit can operate atomically to determine whether to effect the update and to make the update. Updates can include one or more of incrementing and swapping values. An update request may specify one of a pre-determined set of update types. Some update types may be conditional and others unconditional. The update unit can be coupled to receive update requests from a plurality of computation units. The computation units may not have privileges to directly generate write requests to be effected on at least some of the locations in memory. The computation units can be fixed function circuitry operating on inputs received from programmable computation elements. The update unit may include a buffer to hold received update requests. | 03-26-2015 |
20150339798 | Graphics Processor with Non-Blocking Concurrent Architecture - In some aspects, systems and methods provide for forming groupings of a plurality of independently-specified computation workloads, such as graphics processing workloads, and in a specific example, ray tracing workloads. The workloads include a scheduling key, which is one basis on which the groupings can be formed. Workloads grouped together can all execute from the same source of instructions, on one or more different private data elements. Such workloads can recursively instantiate other workloads that reference the same private data elements. In some examples, the scheduling key can be used to identify a data element to be used by all the workloads of a grouping. Memory conflicts to private data elements are handled through scheduling of non-conflicted workloads or specific instructions and/or deferring conflicted workloads instead of locking memory locations. | 11-26-2015 |
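
The update unit described in application 20150089156 above lends itself to a small illustration. The following Python sketch is a toy model only, not the patented design: the request kinds (`increment`, `swap`, `swap_if_less`), the class names, and the list-backed memory are all assumptions made for this example. It shows the one property the abstract stresses, namely that the condition check and the write it guards happen as a single step because buffered requests are applied one at a time.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class UpdateRequest:
    """One buffered request. The kind names are hypothetical."""
    address: int
    kind: str      # "increment", "swap", or "swap_if_less"
    value: int = 0


class UpdateUnit:
    """Toy model of an update unit that owns all writes to `memory`."""

    def __init__(self, memory):
        self.memory = memory
        self.pending = deque()   # buffer holding received update requests

    def submit(self, request):
        # Computation units only enqueue requests; they never write directly.
        self.pending.append(request)

    def drain(self):
        """Apply requests in arrival order; return (applied, old_value) pairs."""
        results = []
        while self.pending:
            req = self.pending.popleft()
            old = self.memory[req.address]
            if req.kind == "increment":        # unconditional update
                self.memory[req.address] = old + req.value
                applied = True
            elif req.kind == "swap":           # unconditional update
                self.memory[req.address] = req.value
                applied = True
            elif req.kind == "swap_if_less":   # conditional update
                applied = req.value < old
                if applied:
                    self.memory[req.address] = req.value
            else:
                raise ValueError(f"unknown update kind: {req.kind}")
            results.append((applied, old))
        return results


memory = [10, 0]
unit = UpdateUnit(memory)
unit.submit(UpdateRequest(0, "swap_if_less", 7))   # 7 < 10, so it is applied
unit.submit(UpdateRequest(0, "swap_if_less", 9))   # 9 < 7 is false, so skipped
unit.submit(UpdateRequest(1, "increment", 3))
results = unit.drain()
```

In a ray tracing setting, a conditional kind such as the assumed `swap_if_less` could keep the closest hit distance found so far without the requesting units taking a lock.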
Patent application number | Description | Published |
20110069067 | SYSTEMS AND METHODS FOR SELF-INTERSECTION AVOIDANCE IN RAY TRACING - Aspects include systems, methods, and media for implementing methods relating to detection of invalid intersections during ray tracing. Invalid intersections can arise from imprecision in computer-based number representation, causing ray origins to be located inappropriately. In some aspects, a ray can be associated with information relating to an expected angle between the ray's direction and a normal for a to-be-identified primitive intersected by that ray. If the angle between the ray's direction and the normal of an intersected primitive is within expectations, then that information can be used in predicting whether the intersection is valid. Such expectation information can be presented as a single bit determined by a shader performing a dot product of the ray and a normal of a primitive intersected by a parent ray, or can be obtained as a by-product of ray/primitive intersection testing. Such information also can be based on whether the shader is emitting a ray with reflection-type or refraction-type behavior. | 03-24-2011 |
20110267347 | SYSTEMS AND METHODS FOR PRIMITIVE INTERSECTION IN RAY TRACING - Aspects include systems, methods, and media for implementing methods relating to increasing consistency of results during intersection testing. In an example, vertexes define edges of primitives composing a scene (e.g., triangles defining a mesh for a surface of an object in a 3-D scene). An edge can be shared between two primitives. Intersection testing algorithms can use tests involving edges to determine whether or not the ray intersects a primitive defined by those edges. In one approach, a precedence among the vertexes defining a particular edge is enforced for such intersection testing. The precedence causes an intersection tester to always test a given edge in the same orientation, regardless of which primitive defined (at least in part) by that edge is being intersection tested. | 11-03-2011 |
20110299642 | NOISE SHAPED INTERPOLATOR AND DECIMATOR APPARATUS AND METHOD - Improved interpolator and decimator apparatus and methods, including the addition of an elastic storage element in the signal path. In one exemplary embodiment, the elastic element comprises a FIFO which advantageously allows short term variation in sample clocks to be absorbed, and also provides a feedback mechanism for controlling a delta-sigma modulated modulo-N counter based sample clock generator. The elastic element combined with a delta-sigma modulator and counter creates a noise-shaped frequency lock loop without additional components, resulting in a much simplified interpolator and decimator. | 12-08-2011 |
20120133654 | VARIABLE-SIZED CONCURRENT GROUPING FOR MULTIPROCESSING - Aspects include, for example, a method for interpreting information in a computer program, or profiling such a program to estimate a group size for instances of that program (program module, or portion thereof). Such a method can be used in a system that supports collecting outputs of executing instances, where those outputs can specify new program instances. Scheduling of new instances (or allocation of resources for executing such instances) can be deferred. A trigger to begin scheduling (or allocation) for a collection of instances uses a target group size for that program. Thus, different programs can have different group sizes, which can be set explicitly, or based on profiling. The profiling can occur during one or more of pre-execution and during execution. The group size estimate can be an input into an algorithm that also accounts for system state during execution. | 05-31-2012 |
20130069960 | MULTISTAGE COLLECTOR FOR OUTPUTS IN MULTIPROCESSOR SYSTEMS - Aspects include a multistage collector to receive outputs from plural processing elements. Processing elements may comprise (each or collectively) a plurality of clusters, with one or more ALUs that may perform SIMD operations on a data vector and produce outputs according to the instruction stream being used to configure the ALU(s). The multistage collector includes substituent components each with at least one input queue, a memory, a packing unit, and an output queue; these components can be sized to process groups of input elements of a given size, and can have multiple input queues and a single output queue. Some components couple to receive outputs from the ALUs and others receive outputs from other components. Ultimately, the multistage collector can output groupings of input elements. Each grouping of elements (e.g., at input queues, or stored in the memories of the components) can be formed based on matching of index elements. | 03-21-2013 |
20130222402 | Graphics Processor with Non-Blocking Concurrent Architecture - In some aspects, systems and methods provide for forming groupings of a plurality of independently-specified computation workloads, such as graphics processing workloads, and in a specific example, ray tracing workloads. The workloads include a scheduling key, which is one basis on which the groupings can be formed. Workloads grouped together can all execute from the same source of instructions, on one or more different private data elements. Such workloads can recursively instantiate other workloads that reference the same private data elements. In some examples, the scheduling key can be used to identify a data element to be used by all the workloads of a grouping. Memory conflicts to private data elements are handled through scheduling of non-conflicted workloads or specific instructions and/or deferring conflicted workloads instead of locking memory locations. | 08-29-2013 |
20140269760 | System And Method of Arbitrating Access to Interconnect - Aspects relate to arbitrating access to an interconnect among multiple ports. For example, input ports receive requests for access to identified destination ports and buffer these in one or more FIFOs. A picker associated with respective FIFO(s) begins an empty arbitration packet that includes a location for each output port and fills one or more locations in the packet, such as based on a prioritization scheme. Each packet is passed in a ring to another picker, which performs a fill that does not conflict with previously filled locations in that packet. Each picker has an opportunity to place requests in each of the packets. Results of the arbitration are dispatched to reorder buffers associated with respective output ports and used to schedule the interconnect. Each arbitration cycle thus produces a set of control information for an interconnect to be used in subsequent data transfer steps. | 09-18-2014 |
20140286467 | NOISE SHAPED INTERPOLATOR AND DECIMATOR APPARATUS AND METHOD - Interpolator and decimator apparatuses and methods are improved by the addition of an elastic storage element in the signal path. In one exemplary embodiment, the elastic element comprises a FIFO which advantageously allows short term variation in sample clocks to be absorbed, and also provides a feedback mechanism for controlling a delta-sigma modulated modulo-N counter based sample clock generator. The elastic element combined with a delta-sigma modulator and counter creates a noise-shaped frequency lock loop without additional components, resulting in a much simplified interpolator and decimator. | 09-25-2014 |
20150341159 | NOISE SHAPED INTERPOLATOR AND DECIMATOR APPARATUS AND METHOD - An interpolator or decimator includes an elastic storage element in the signal path between first and second clock domains. The elastic element may, for example, be a FIFO which advantageously allows short term variation in sample clocks to be absorbed. A feedback mechanism controls a delta-sigma modulated modulo-N counter based sample clock generator. The elastic element combined with a delta-sigma modulator and counter creates a noise-shaped frequency lock loop without additional components, resulting in a much simplified interpolator and decimator. | 11-26-2015 |
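
The vertex precedence described in application 20110267347 above can be illustrated with a deliberately simplified case. The sketch below is a 2-D point-in-triangle analogy, not the 3-D ray/primitive test the application covers, and the function names and the lexicographic ordering rule are assumptions chosen for this example. The point is that a shared edge is always evaluated in the same orientation, so the two triangles sharing it see values that are exact negations of each other rather than two independently rounded results.

```python
def edge_value(a, b, p):
    """Signed-area term of point p against edge a -> b.

    The arithmetic always runs from the lower vertex to the higher one
    (tuple comparison); the sign is flipped afterwards if the caller
    asked for the opposite direction.
    """
    flipped = b < a
    if flipped:
        a, b = b, a
    value = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return -value if flipped else value


def inside(triangle, p):
    """True if p lies in a counter-clockwise triangle, edges included."""
    v0, v1, v2 = triangle
    return (edge_value(v0, v1, p) >= 0
            and edge_value(v1, v2, p) >= 0
            and edge_value(v2, v0, p) >= 0)


# Two counter-clockwise triangles sharing the edge between a and b.
a, b = (0.1, 0.2), (0.7, 0.3)
left = (a, b, (0.3, 0.9))
right = (b, a, (0.5, -0.4))
p = (0.4, 0.25)

forward = edge_value(a, b, p)    # orientation used by `left`
backward = edge_value(b, a, p)   # orientation used by `right`
```

Because `forward` and `backward` come from the same floating-point computation, a point near the shared edge cannot be rejected by both triangles through rounding disagreement on that edge.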