Optimizing the Graphics Pipeline with Compute, GDC 2016 - Graham Wihlidal
With further advancement in the current console cycle, new tricks are being learned to squeeze the maximum performance out of the hardware. This talk will present how the compute power of the console and PC GPUs can be used to improve the triangle throughput beyond the limits of the fixed function hardware. The discussed method shows a way to perform efficient "just-in-time" optimization of geometry, and opens the way for per-primitive filtering kernels and procedural geometry processing.
Takeaway:
Attendees will learn how to preprocess geometry on-the-fly per frame to improve rendering performance and efficiency.
Intended Audience:
This presentation is targeting seasoned graphics developers. Experience with DirectX 12 and GCN is recommended, but not required.
An introduction to Realistic Ocean Rendering through FFT - Fabio Suriano - Co... - Codemotion
This document provides an overview of ocean rendering techniques using Fast Fourier Transforms (FFT). It discusses how games and movies like Assassin's Creed, Crysis, and Titanic simulate ocean water through an FFT-based spectrum representation. The FFT is used to compute the discrete Fourier transform and its inverse to transform between frequency and time domains. Grid sizes of 128-512 are common. Ocean shading considers reflection and refraction using Fresnel equations. Level of detail is achieved through a top-down projected grid for consistent near-far resolution without artifacts.
This document discusses advancements in tiled-based compute rendering. It describes current proven tiled rendering techniques used in games. It then discusses opportunities for improvement like using parallel reduction to calculate depth bounds more efficiently than atomics, improved light culling techniques like modified Half-Z, and clustered rendering which divides the screen into tiles and slices to reduce lighting workloads. The document concludes clustered shading has potential savings on culling and offers benefits over traditional 2D tiling.
This document discusses two approaches to deblurring digital images: blind deconvolution and Lucy Richardson deconvolution. Blind deconvolution aims to restore an image and estimate the point spread function without prior knowledge, using an iterative process. Lucy Richardson deconvolution is effective when the point spread function is known but noise properties are uncertain, as it reduces noise amplification. Both techniques are limited by having only a single blurred image as input. Results are shown applying each algorithm to example blurred images.
The document provides an overview of graphics programming on the Xbox 360, including details about the system and GPU architecture, graphics APIs like Direct3D, shader development, and tools for graphics debugging and optimization like PIX. Key points include that the Xbox 360 GPU is designed by ATI and includes 10MB of EDRAM, supports shader model 3.0, and has dedicated hardware for features like tessellation, procedural geometry, and anti-aliasing. Direct3D is optimized for the Xbox 360 hardware and exposes new features. PIX is a powerful tool for performance analysis and debugging graphics applications on the Xbox 360.
Bill explains some of the ways that the Vertex Shader can be used to improve performance by taking a fast path through the Vertex Shader rather than generating vertices with other parts of the pipeline in this AMD technology presentation from the 2014 Game Developers Conference in San Francisco March 17-21. Check out more technical presentations at http://developer.amd.com/resources/documentation-articles/conference-presentations/
A description of the next-gen rendering technique called Triangle Visibility Buffer. It handles up to 10x - 20x more geometry than Deferred rendering, at much higher resolution. It generally aligns better with memory access patterns in modern GPUs than Deferred Lighting approaches such as Clustered Deferred Lighting.
Graphics Gems from CryENGINE 3 (Siggraph 2013) - Tiago Sousa
This lecture covers rendering topics related to Crytek's latest engine iteration, the technology which powers titles such as Ryse, Warface, and Crysis 3. Among covered topics, Sousa presented SMAA 1TX: an update featuring a robust and simple temporal antialiasing component; performant and physically-plausible camera related post-processing techniques such as motion blur and depth of field were also covered.
This document provides an overview of physics engine usage for game physics programming. It introduces key concepts in physics such as kinematics, calculus, and numeric solutions. It also summarizes the Box2D and PhysX physics engines, including collision detection methods, constraints, joints and example code reviews. The goal is to provide background information for implementing physics in games without focusing on specific implementation details.
This document discusses various optimizations for the z-buffer algorithm used in 3D graphics rendering. It covers hardware optimizations like early-z testing and double-speed z-only rendering. It also discusses software techniques like front-to-back sorting, early-z rendering passes, and deferred shading. Other topics include z-buffer compression, fast clears, z-culling, and potential future optimizations like programmable culling units. A variety of resources are provided for further reading.
This document provides recommendations for optimizing DirectX 11 performance. It separates the graphics pipeline process into offline and runtime stages. For the offline stage, it recommends creating resources like buffers, textures and shaders on multiple threads. For the runtime stage, it suggests culling unused objects, minimizing state changes, and pushing commands to the driver quickly. It also provides tips for updating dynamic resources efficiently and grouping related constants together. The goal is to keep the CPU and GPU pipelines fully utilized for maximum performance.
This document discusses very deep convolutional networks for large-scale image recognition. It describes network configurations that use 3x3 convolutional filters with max pooling layers and fully connected layers. The networks have 11 or 19 weight layers and use 1x1 convolutional filters to introduce nonlinearity. Classification experiments on ImageNet data with over 1 million training images achieve top-1 and top-5 error rates.
This talk provides additional details around the hybrid real-time rendering pipeline we developed at SEED for Project PICA PICA.
At Digital Dragons 2018, we presented how leveraging Microsoft's DirectX Raytracing enables intuitive implementations of advanced lighting effects, including soft shadows, reflections, refractions, and global illumination. We also dove into the unique challenges posed by each of those domains, discussed the tradeoffs, and evaluated where raytracing fits in the spectrum of solutions.
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014) - Takahiro Harada
The document discusses porting a Monte Carlo ray tracing application to OpenCL to take advantage of GPU acceleration. Some of the key challenges discussed include data structure changes needed for GPUs, writing OpenCL kernels that map well to the GPU architecture, and avoiding SIMD divergence to maintain high hardware utilization. The talk will cover strategies for addressing these challenges to get good performance from an OpenCL implementation.
The document discusses approaches for reducing driver overhead in OpenGL applications. It introduces several OpenGL APIs that can be used to achieve this, including persistent mapped buffers for dynamic geometry, multi-draw indirect for batching draw calls, and packing 2D textures into arrays. Speakers then provide details on implementing these techniques and the performance improvements they provide, such as reducing overhead by 5-10x and allowing an order of magnitude more unique objects per frame. Bindless textures and sparse textures are also covered as advanced methods for further optimizing texture handling and memory usage.
Example of iterative deepening search & bidirectional search - Abhijeet Agarwal
These are some examples of iterative deepening search and bidirectional search, with definitions and some theory related to both searches. If you have any questions, please ask in the comments or by mail; I will be happy to help.
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas - AMD Developer Central
The document discusses faster particle rendering using DirectCompute. It describes using the GPU for particle simulation by taking advantage of its parallel processing capabilities. It discusses using compute shaders to simulate particle behavior, handle collisions via the depth buffer, sort particles using bitonic sort, and render particles in tiles via DirectCompute to avoid overdraw from large particles. Tiled rendering involves culling particles, building per-tile particle indices, and sorting particles within each tile before shading them in parallel threads to composite onto the scene.
The Hough transform plays a vital role in curve fitting and line detection. This presentation focuses on the linear Hough transform and its implementation in MATLAB.
This presentation introduces a new NVIDIA extension called Command-list.
The purpose of this presentation is to explain the basic concepts of how to use it and to show its benefits.
The sample I used for the talk is here: https://github.com/nvpro-samples/gl_commandlist_bk3d_models
The driver to try it with should be PreRelease 347.09:
http://www.nvidia.com/download/driverResults.aspx/80913/en-us
LPV (Light Propagation Volume) is a technique that approximates global illumination in real-time. It involves several steps: 1) Downsampling RSM textures and injecting point lights and blocking potential into LPV volume textures, 2) Propagating light through volume textures in an iterative manner, 3) Rendering indirect lighting using the final volume textures. The key aspects are approximating lights using spherical harmonics for efficient storage in volume textures and accumulating light using additive blending to simulate light propagation.
Temporal Anti-Aliasing (TAA) is a technique that reduces aliasing by combining the current frame image with past frame images. It has the advantage of achieving anti-aliasing with low cost compared to other AA methods, but can produce blurred images when combining frames. TAA works by jittering the scene position each frame, combining the rendered image with past frames stored in a history buffer, and copying the result to the frame buffer. While effective for static scenes, it can cause ghosting artifacts in dynamic scenes where past frame images no longer align due to camera or object motion.
5. Spherical Harmonics 5
    return float4( irradiance, 1.f );
}
????????Irradiance Map????????.
?????Irradiance Map????????????????.
float3 ImageBasedLight( float3 normal )
{
    return IrradianceMap.Sample( LinearSampler, normal ).rgb;
}
// ...
float4 lightColor = float4( ImageBasedLight( normal ), 1.f ) * MoveLinearSpace( Diffuse );
An Irradiance Map stored as a 32 x 32 cube map takes 24KB ( 32 * 32 * 6 * 4Byte ), whereas the spherical harmonics representation needs only 108Byte ( 3 * 9 * 4Byte ).
Associated Legendre Polynomials
?????????????????????????????????????. ????????????
????????????????.
??????????????? ?
?????????? ?
? ?
?????????????????.
??? ?
?-1 ~ 1????????????????????????.
? ????????????????????????????????.
????? ?
? ?
??????????(Bands)???????? ?
?Band Index???0??????????
??????. ??? ?
?0 ~ ???????????????.
??????? Irradiance Map
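Notation: $P_l^m(x)$, with argument $x$ in the range $-1 \sim 1$, band index $l \ge 0$, and $0 \le m \le l$. The polynomials of the first three bands are:

$$P_0^0$$
$$P_1^0,\; P_1^1$$
$$P_2^0,\; P_2^1,\; P_2^2$$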
???????????????????????????????????????????. ????3??
????????????.
?????????????????????????????????????????. ??? ?
?????
?????????????????????.
????????????????.
?
???1?????????????????3?????????????????????????.
???3??????????????????????????.
double P(int l, int m, double x)
{
    // evaluate an Associated Legendre Polynomial P(l,m,x) at x
    double pmm = 1.0;
    if (m > 0)
    {
        // rule 1: build P(m,m) directly
        double somx2 = sqrt((1.0 - x) * (1.0 + x));
        double fact = 1.0;
        for (int i = 1; i <= m; i++)
        {
            pmm *= (-fact) * somx2;
            fact += 2.0;
        }
    }
    if (l == m) return pmm;
    // rule 2: step up one band to P(m+1,m)
    double pmmp1 = x * (2.0 * m + 1.0) * pmm;
    if (l == m + 1) return pmmp1;
    // rule 3: iterate the recurrence up to band l
    double pll = 0.0;
    for (int ll = m + 2; ll <= l; ++ll)
    {
        pll = ((2.0 * ll - 1.0) * x * pmmp1 - (ll + m - 1.0) * pmm) / (ll - m);
        pmm = pmmp1;
        pmmp1 = pll;
    }
    return pll;
}
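As a sanity check on the recurrence, the listing can be compared against the closed forms of the first three bands. The sketch below is a minimal, self-contained C++ version; the helper name `maxClosedFormError` is hypothetical and not part of the original code. The closed forms include the $(-1)^m$ factor that the `(-fact)` term introduces.

```cpp
#include <cassert>
#include <cmath>

// Associated Legendre polynomial P_l^m(x), same recurrence as the listing above.
double P(int l, int m, double x)
{
    double pmm = 1.0;
    if (m > 0)
    {
        double somx2 = std::sqrt((1.0 - x) * (1.0 + x));
        double fact = 1.0;
        for (int i = 1; i <= m; ++i)
        {
            pmm *= (-fact) * somx2;
            fact += 2.0;
        }
    }
    if (l == m) return pmm;
    double pmmp1 = x * (2.0 * m + 1.0) * pmm;
    if (l == m + 1) return pmmp1;
    double pll = 0.0;
    for (int ll = m + 2; ll <= l; ++ll)
    {
        pll = ((2.0 * ll - 1.0) * x * pmmp1 - (ll + m - 1.0) * pmm) / (ll - m);
        pmm = pmmp1;
        pmmp1 = pll;
    }
    return pll;
}

// Hypothetical helper: largest absolute difference between P(l, m, x)
// and the closed forms for l <= 2, m >= 0.
double maxClosedFormError(double x)
{
    const double s = std::sqrt(1.0 - x * x);
    const int ls[6] = { 0, 1, 1, 2, 2, 2 };
    const int ms[6] = { 0, 0, 1, 0, 1, 2 };
    const double expected[6] = {
        1.0,                       // P_0^0
        x,                         // P_1^0
        -s,                        // P_1^1
        0.5 * (3.0 * x * x - 1.0), // P_2^0
        -3.0 * x * s,              // P_2^1
        3.0 * (1.0 - x * x)        // P_2^2
    };
    double worst = 0.0;
    for (int i = 0; i < 6; ++i)
    {
        const double diff = std::fabs(P(ls[i], ms[i], x) - expected[i]);
        if (diff > worst) worst = diff;
    }
    return worst;
}
```

Calling `maxClosedFormError` with any x in the range -1 ~ 1 should return a value at floating point rounding level.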
??????
Spherical Harmonics???????????????????????????.
???????????????????????????????????????????????????
?. ??????????????????????????.
????????????????.
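1. $$P_m^m(x) = (-1)^m (2m-1)!!\,(1-x^2)^{m/2}$$

where $!!$ is the double factorial:

$$n!! = \begin{cases} n \cdot (n-2) \cdots 5 \cdot 3 \cdot 1, & n > 0 \text{ odd} \\ n \cdot (n-2) \cdots 6 \cdot 4 \cdot 2, & n > 0 \text{ even} \end{cases}$$

2. $$P_{m+1}^m(x) = x(2m+1)P_m^m(x)$$

3. $$(l-m)P_l^m(x) = x(2l-1)P_{l-1}^m(x) - (l+m-1)P_{l-2}^m(x)$$

Starting term: $P_0^0(x)$

$$Y_l^m(\theta,\varphi) := A\,P_l^m(\cos\theta)\,e^{im\varphi}$$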
??? ?
?????????????????? ?
??????????????.
?????????? ?
? ?
?????????????????????. ?
????0???????????
?? ?
? ?
? ?
????????.
?????????????????.
??????????????????????????????????????????????????
????.
??????????????????.
double K(int l, int m)
{
    // renormalisation constant for SH function
    // (factorial() and PI are assumed to be defined elsewhere)
    double temp = ((2.0 * l + 1.0) * factorial(l - m)) / (4.0 * PI * factorial(l + m));
    return sqrt(temp);
}
double SH(int l, int m, double theta, double phi)
{
    // return a point sample of a Spherical Harmonic basis function
    // l is the band, range [0..N]
    // m in the range [-l..l]
    // theta in the range [0..Pi]
    // phi in the range [0..2*Pi]
    const double sqrt2 = sqrt(2.0);
    if (m == 0) return K(l, 0) * P(l, m, cos(theta));
    else if (m > 0) return sqrt2 * K(l, m) * cos(m * phi) * P(l, m, cos(theta));
    else return sqrt2 * K(l, -m) * sin(-m * phi) * P(l, -m, cos(theta));
}

$$y_l^m(\theta,\varphi) = \begin{cases} \sqrt{2}\,K_l^m \cos(m\varphi)\,P_l^m(\cos\theta), & m > 0 \\ \sqrt{2}\,K_l^m \sin(-m\varphi)\,P_l^{-m}(\cos\theta), & m < 0 \\ K_l^0\,P_l^0(\cos\theta), & m = 0 \end{cases}$$

where $P$ is the associated Legendre polynomial and $K$ is the normalization constant:

$$K_l^m = \sqrt{\frac{(2l+1)}{4\pi}\,\frac{(l-|m|)!}{(l+|m|)!}}$$

Here $l$ is the band index and $m$ ranges from $-l$ to $l$, which gives the following basis functions for the first three bands:

$$y_0^0$$
$$y_1^{-1},\; y_1^0,\; y_1^1$$
$$y_2^{-2},\; y_2^{-1},\; y_2^0,\; y_2^1,\; y_2^2$$

Source: https://users.soe.ucsc.edu/~pang/160/s13/projects/bgabin/Final/report/Spherical Harmonic Lighting Comparison.htm
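The `SH` function can be cross-checked against the polynomial forms of the basis functions for l ≤ 2 (constants such as 0.282095 and 0.488603, as used in `ShFunctionL2`). The sketch below is one way to do that, under some assumptions: `factorial` and `PI` are not shown in the original, so simple stand-ins are defined here, and `maxTableError` is a hypothetical helper. One detail worth noting: the recurrence in `P` carries the factor $(-1)^m$, so for odd m the value returned by `SH` has the opposite sign to the polynomial form (for example, it yields -0.488603x where the table lists 0.488603x). The check multiplies the table entry by $(-1)^{|m|}$ to account for this.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

const double PI = 3.14159265358979323846;

// Stand-in for the factorial helper the listing assumes.
double factorial(int n)
{
    double r = 1.0;
    for (int i = 2; i <= n; ++i) r *= i;
    return r;
}

double P(int l, int m, double x)
{
    double pmm = 1.0;
    if (m > 0)
    {
        double somx2 = std::sqrt((1.0 - x) * (1.0 + x));
        double fact = 1.0;
        for (int i = 1; i <= m; ++i) { pmm *= (-fact) * somx2; fact += 2.0; }
    }
    if (l == m) return pmm;
    double pmmp1 = x * (2.0 * m + 1.0) * pmm;
    if (l == m + 1) return pmmp1;
    double pll = 0.0;
    for (int ll = m + 2; ll <= l; ++ll)
    {
        pll = ((2.0 * ll - 1.0) * x * pmmp1 - (ll + m - 1.0) * pmm) / (ll - m);
        pmm = pmmp1;
        pmmp1 = pll;
    }
    return pll;
}

double K(int l, int m)
{
    return std::sqrt(((2.0 * l + 1.0) * factorial(l - m)) / (4.0 * PI * factorial(l + m)));
}

double SH(int l, int m, double theta, double phi)
{
    const double sqrt2 = std::sqrt(2.0);
    if (m == 0) return K(l, 0) * P(l, m, std::cos(theta));
    else if (m > 0) return sqrt2 * K(l, m) * std::cos(m * phi) * P(l, m, std::cos(theta));
    else return sqrt2 * K(l, -m) * std::sin(-m * phi) * P(l, -m, std::cos(theta));
}

// Hypothetical helper: largest deviation between SH() and the polynomial
// table for l <= 2, after applying the (-1)^|m| sign to the table entry.
double maxTableError(double theta, double phi)
{
    const double x = std::sin(theta) * std::cos(phi);
    const double y = std::sin(theta) * std::sin(phi);
    const double z = std::cos(theta);
    const double table[9] = {
        0.282095,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z, 0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y)
    };
    double worst = 0.0;
    for (int l = 0; l <= 2; ++l)
    {
        for (int m = -l; m <= l; ++m)
        {
            const double sign = (std::abs(m) % 2 == 1) ? -1.0 : 1.0;
            const double diff = std::fabs(SH(l, m, theta, phi) - sign * table[l * (l + 1) + m]);
            if (diff > worst) worst = diff;
        }
    }
    return worst;
}
```

The flat index `l * (l + 1) + m` matches the ordering of the nine coefficients used in the shader code. Because every term in the expansion uses the same basis function for projection and reconstruction, either sign convention works as long as it is applied consistently.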
???????????????????????????????????. ???????????????
????????????????????????????????. ?????????????????
??????????????.
???????????????????????????????. ??? ?
????????????.
Projection
???????????????????. ??????????????????Irradiance Map?????
????????????????? ???????(Basis Function)???(Projection)???????????
??.
????????????????????????????????. ?????????????????
???????????????????.
?????????????????????????. ????????????????????????
?????????????????.
????????????????????.
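$$(x, y, z) = (\sin\theta\cos\varphi,\; \sin\theta\sin\varphi,\; \cos\theta)$$

Basis functions up to $l = 2$ in polynomial form:

$$y_0^0(\mathbf{n}) = 0.282095$$
$$y_1^{-1}(\mathbf{n}) = 0.488603\,y$$
$$y_1^0(\mathbf{n}) = 0.488603\,z$$
$$y_1^1(\mathbf{n}) = 0.488603\,x$$
$$y_2^{-2}(\mathbf{n}) = 1.092548\,xy$$
$$y_2^{-1}(\mathbf{n}) = 1.092548\,yz$$
$$y_2^0(\mathbf{n}) = 0.315392\,(3z^2 - 1)$$
$$y_2^1(\mathbf{n}) = 1.092548\,xz$$
$$y_2^2(\mathbf{n}) = 0.546274\,(x^2 - y^2)$$

Expansion of a function in basis functions $b_i$, and the projection that yields each coefficient:

$$f(x) = c_1 b_1(x) + c_2 b_2(x) + c_3 b_3(x) + c_4 b_4(x) + \dots$$
$$c_i = \int f(x)\,b_i(x)\,dx$$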
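To make the projection step concrete, the following C++ sketch projects a known function on the sphere onto the nine basis functions and recovers its coefficients. It is only an illustration under stated assumptions: the integral is approximated with a midpoint sum over θ and φ (the sampling scheme is my choice, not taken from the text), and the names `shBasisL2`, `projectOntoSH`, `reconstruct` and `testFunction` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Polynomial form of the nine basis functions for l <= 2.
std::array<double, 9> shBasisL2(double x, double y, double z)
{
    return { 0.282095,
             0.488603 * y, 0.488603 * z, 0.488603 * x,
             1.092548 * x * y, 1.092548 * y * z, 0.315392 * (3.0 * z * z - 1.0),
             1.092548 * x * z, 0.546274 * (x * x - y * y) };
}

// c_i = integral of f * b_i over the sphere, approximated by a midpoint sum.
template <typename F>
std::array<double, 9> projectOntoSH(F f, int nTheta, int nPhi)
{
    std::array<double, 9> c{};
    const double dTheta = kPi / nTheta;
    const double dPhi = 2.0 * kPi / nPhi;
    for (int i = 0; i < nTheta; ++i)
    {
        const double theta = (i + 0.5) * dTheta;
        for (int j = 0; j < nPhi; ++j)
        {
            const double phi = (j + 0.5) * dPhi;
            const double x = std::sin(theta) * std::cos(phi);
            const double y = std::sin(theta) * std::sin(phi);
            const double z = std::cos(theta);
            const std::array<double, 9> b = shBasisL2(x, y, z);
            // sin(theta) is the area weight of the spherical parameterization
            const double w = f(x, y, z) * std::sin(theta) * dTheta * dPhi;
            for (int k = 0; k < 9; ++k) c[k] += w * b[k];
        }
    }
    return c;
}

// f(n) is approximated by the sum of c_i * b_i(n).
double reconstruct(const std::array<double, 9>& c, double x, double y, double z)
{
    const std::array<double, 9> b = shBasisL2(x, y, z);
    double value = 0.0;
    for (int k = 0; k < 9; ++k) value += c[k] * b[k];
    return value;
}

// Test input built from two basis functions, so the expected
// coefficients are known in advance: c[2] = 2 and c[8] = 0.5.
double testFunction(double x, double y, double z)
{
    const std::array<double, 9> b = shBasisL2(x, y, z);
    return 2.0 * b[2] + 0.5 * b[8];
}
```

Because the basis functions are orthonormal, projecting `testFunction` should return coefficients close to 2 and 0.5 at indices 2 and 8 and close to zero elsewhere, and `reconstruct` should reproduce the input.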
??????????????????????????????.
???????????????????Irradiance Map??????????.
??????????Irradiance Map
??????????Irradiance Map ??????????.
Source: Spherical Harmonic Lighting: The Gritty Details
$$\frac{1}{\pi}\int_{\varphi=0}^{2\pi}\int_{\theta=0}^{\frac{\pi}{2}} L_i(x,\omega_i)\cos(\theta)\sin(\theta)\,d\theta\,d\varphi$$
?????????2???????.
1. ?¡ú ??????????????????????????.
2. ?¡ú ???????????. ??????Radiance???????????????0??
??????Radiance??????.
???????????????????????????.
??? ?
????(Zenith Angle)?????????? ?
???0????. ???????? ? ?
??
?Irradiance?????????. Irradiance?????????? ?
????Irradiance?????????
??.
?? ?
? ?
? ?
??????????????.
??? ?
?????????? ?
?????????????????????????????. ?
???????????4??4.A??????????.
?????????? ?
???????????.
?
????????????????????.
???????????????????????????????????????.
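$$\int_{\varphi=0}^{2\pi}\int_{\theta=0}^{\pi} L_i(x,\omega_i)\sin(\theta)\,d\theta\,d\varphi$$

Clamped cosine term: $\max(\cos(\theta), 0)$

$$L_{lm} = \int_{\varphi=0}^{2\pi}\int_{\theta=0}^{\pi} L(\theta,\varphi)\,y_l^m(\theta,\varphi)\sin(\theta)\,d\theta\,d\varphi$$

$$A_l = \int_{\theta=0}^{\pi} \max(\cos(\theta), 0)\,y_l^0(\theta, 0)\,d\theta$$

Here $L_{lm}$ are the coefficients of the radiance, $A_l$ the coefficients of the clamped $\cos(\theta)$ term (only $m = 0$ contributes), and $E_{lm}$ the coefficients of the irradiance:

$$E(\theta,\varphi) = \sum_{l,m} E_{lm}\,y_l^m(\theta,\varphi)$$

$$E_{lm} = \sqrt{\frac{4\pi}{2l+1}}\,A_l L_{lm}$$

Folding the factor $\sqrt{\frac{4\pi}{2l+1}}$ into the $\cos(\theta)$ coefficients gives $\hat{A}_l$:

$$\hat{A}_l = \sqrt{\frac{4\pi}{2l+1}}\,A_l$$

$$\hat{A}_0 = 3.1415$$
$$\hat{A}_1 = 2.0943$$
$$\hat{A}_2 = 0.7853$$
$$\hat{A}_3 = 0$$
$$\hat{A}_4 = -0.1309$$
$$\hat{A}_5 = 0$$
$$\hat{A}_6 = 0.0490$$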
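The relation between the irradiance coefficients and the radiance coefficients can be sketched in a few lines of C++. This is an illustration, not the author's shader: it handles a single channel instead of RGB, assumes the $L_{lm}$ definition without a $1/\pi$ factor, uses the first three $\hat{A}_l$ values exactly as listed, and the names `shBasisL2`, `irradiance` and `uniformRadianceCoeffs` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Polynomial form of the nine basis functions for l <= 2.
std::array<double, 9> shBasisL2(double x, double y, double z)
{
    return { 0.282095,
             0.488603 * y, 0.488603 * z, 0.488603 * x,
             1.092548 * x * y, 1.092548 * y * z, 0.315392 * (3.0 * z * z - 1.0),
             1.092548 * x * z, 0.546274 * (x * x - y * y) };
}

// E(n) = sum over l, m of A^_l * L_lm * y_lm(n), truncated at l = 2.
double irradiance(const std::array<double, 9>& L, double x, double y, double z)
{
    static const double aHat[3] = { 3.1415, 2.0943, 0.7853 };       // values listed above
    static const int band[9] = { 0, 1, 1, 1, 2, 2, 2, 2, 2 };       // l of each coefficient
    const std::array<double, 9> basis = shBasisL2(x, y, z);
    double e = 0.0;
    for (int i = 0; i < 9; ++i) e += aHat[band[i]] * L[i] * basis[i];
    return e;
}

// Radiance coefficients of an environment with the same radiance in every
// direction: only L_00 is non-zero, and it equals radiance * y_00 * 4*pi.
std::array<double, 9> uniformRadianceCoeffs(double radiance)
{
    std::array<double, 9> L{};
    L[0] = radiance * 0.282095 * 4.0 * kPi;
    return L;
}
```

For a uniform environment of radiance 1, `irradiance` should return a value close to π for any normal, which is the analytic irradiance of a constant sky over the hemisphere. If the radiance coefficients are instead computed with a $1/\pi$ factor, the same input gives a value close to 1.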
??? ?
??????????????????????27????(RGB ??3?* SH ??9?)???
Irradiance Map?????????. ???????????????? ?
???????????????
??.
void ShFunctionL2( float3 v, out float Y[9] )
{
    // L0
    Y[0] = 0.282095f; // Y_00
    // L1
    Y[1] = 0.488603f * v.y; // Y_1-1
    Y[2] = 0.488603f * v.z; // Y_10
    Y[3] = 0.488603f * v.x; // Y_11
    // L2
    Y[4] = 1.092548f * v.x * v.y; // Y_2-2
    Y[5] = 1.092548f * v.y * v.z; // Y_2-1
    Y[6] = 0.315392f * ( 3.f * v.z * v.z - 1.f ); // Y_20
    Y[7] = 1.092548f * v.x * v.z; // Y_21
    Y[8] = 0.546274f * ( v.x * v.x - v.y * v.y ); // Y_22
}
??? ?
????????????????????. ?????????????????????.
#include "Common/Constants.fxh"
#include "SH/SphericalHarmonics.fxh"
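Source: On the relationship between radiance and irradiance: determining the illumination from images of a convex Lambertian object

For $l \le 2$, with basis functions $y_l^m(\mathbf{n})$, the coefficients $L_{lm}$ are:

$$\begin{aligned} L_{lm} &= \frac{1}{\pi}\int_{\varphi=0}^{2\pi}\int_{\theta=0}^{\pi} L(\theta,\varphi)\,y_l^m(\theta,\varphi)\sin(\theta)\,d\theta\,d\varphi \\ &= \frac{1}{\pi}\,\frac{2\pi}{n_1}\,\frac{\pi}{n_2}\sum_{\varphi=0}^{n_1}\sum_{\theta=0}^{n_2} L(\theta,\varphi)\,y_l^m(\theta,\varphi)\sin(\theta) \\ &= \frac{2\pi}{n_1 n_2}\sum_{\varphi=0}^{n_1}\sum_{\theta=0}^{n_2} L(\theta,\varphi)\,y_l^m(\theta,\varphi)\sin(\theta) \end{aligned}$$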
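The discretized double sum for $L_{lm}$ can be mirrored on the CPU to see what the compute shader accumulates. The sketch below is a single-threaded, single-channel approximation under stated assumptions: the radiance comes from a callback instead of a cube map, the loops use integer indices rather than accumulating floats, and the final multiplication by 2π / N is my reading of the `dOmega` line, since the rest of the shader is not shown. The names `projectRadiance` and `whiteSky` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Polynomial form of the nine basis functions for l <= 2.
std::array<double, 9> shBasisL2(double x, double y, double z)
{
    return { 0.282095,
             0.488603 * y, 0.488603 * z, 0.488603 * x,
             1.092548 * x * y, 1.092548 * y * z, 0.315392 * (3.0 * z * z - 1.0),
             1.092548 * x * z, 0.546274 * (x * x - y * y) };
}

// L_lm ~= 2*pi / (n1 * n2) * sum of L * y_lm * sin(theta) over all samples.
template <typename RadianceFn>
std::array<double, 9> projectRadiance(RadianceFn radiance, double sampleDelta)
{
    std::array<double, 9> sum{};
    long numSample = 0;
    for (int ip = 0; ip * sampleDelta < 2.0 * kPi; ++ip)
    {
        const double phi = ip * sampleDelta;
        for (int it = 0; it * sampleDelta < kPi; ++it)
        {
            const double theta = it * sampleDelta;
            const double x = std::sin(theta) * std::cos(phi);
            const double y = std::sin(theta) * std::sin(phi);
            const double z = std::cos(theta);
            const std::array<double, 9> basis = shBasisL2(x, y, z);
            const double L = radiance(x, y, z);
            for (int k = 0; k < 9; ++k) sum[k] += L * basis[k] * std::sin(theta);
            ++numSample;
        }
    }
    // numSample plays the role of n1 * n2 (TotalSample in the shader).
    const double scale = 2.0 * kPi / static_cast<double>(numSample);
    for (double& c : sum) c *= scale;
    return sum;
}

// Stand-in for the cube map lookup: radiance 1 in every direction.
double whiteSky(double, double, double) { return 1.0; }
```

With `whiteSky` and a step of 0.025 (the shader's `SampleDelta`), the first coefficient should come out near 0.282095 * 4, about 1.128, and the remaining eight near zero. The small residual comes from the step size not dividing π evenly.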
TextureCube CubeMap : register( t0 );
SamplerState LinearSampler : register( s0 );
RWStructuredBuffer<float3> Coeffs : register( u0 );

static const int ThreadGroupX = 16;
static const int ThreadGroupY = 16;
static const float3 Black = (float3)0;
static const float SampleDelta = 0.025f;
static const float DeltaPhi = SampleDelta * ThreadGroupX;
static const float DeltaTheta = SampleDelta * ThreadGroupY;

groupshared float3 SharedCoeffs[ThreadGroupX * ThreadGroupY][9];
groupshared int TotalSample;

[numthreads(ThreadGroupX, ThreadGroupY, 1)]
void main( uint3 GTid: SV_GroupThreadID, uint GI : SV_GroupIndex)
{
    if ( GI == 0 )
    {
        TotalSample = 0;
    }
    GroupMemoryBarrierWithGroupSync();

    float3 coeffs[9] = { Black, Black, Black, Black, Black, Black, Black, Black, Black };
    int numSample = 0;
    for ( float phi = GTid.x * SampleDelta; phi < 2.f * PI; phi += DeltaPhi )
    {
        for ( float theta = GTid.y * SampleDelta; theta < PI; theta += DeltaTheta )
        {
            float3 sampleDir = normalize( float3( sin( theta ) * cos( phi ), sin( theta ) * sin( phi ), cos( theta ) ) );
            float3 radiance = CubeMap.SampleLevel( LinearSampler, sampleDir, 0 ).rgb;
            float y[9];
            ShFunctionL2( sampleDir, y );
            [unroll]
            for ( int i = 0; i < 9; ++i )
            {
                coeffs[i] += radiance * y[i] * sin( theta );
            }
            ++numSample;
        }
    }

    int sharedIndex = GTid.y * ThreadGroupX + GTid.x;
    [unroll]
    for ( int i = 0; i < 9; ++i )
    {
        SharedCoeffs[sharedIndex][i] = coeffs[i];
        coeffs[i] = Black;
    }
    InterlockedAdd( TotalSample, numSample );
    GroupMemoryBarrierWithGroupSync();

    if ( GI == 0 )
    {
        for ( int i = 0; i < ThreadGroupX * ThreadGroupY; ++i )
        {
            [unroll]
            for ( int j = 0; j < 9; ++j )
            {
                coeffs[j] += SharedCoeffs[i][j];
            }
        }
        float dOmega = 2.f * PI / float( TotalSample );
        [unroll]
        for ( int i = 0; i < 9; ++i )
        {
14. Spherical Harmonics 14
???
?????????????.
???????????????????.
GitHub - xtozero/SSR at irradiance_map: https://github.com/xtozero/ssr/tree/irradiance_map
Reference
1. Diffuse irradiance
2. Spherical Harmonic Lighting: The Gritty Details
3. An Efficient Representation for Irradiance Environment Maps
4. On the relationship between radiance and irradiance: determining the illumination from images of a convex Lambertian object
5. Diffuse IrradianceMap?Spherical harmonics??????