Summed-Area Tables And Their Application to Dynamic Glossy Environment Reflections

Thorsten Scheuermann
3D Application Research Group
Overview
> Presenting work started by Justin Hensley, Ph.D. student at UNC and a 2004 ATI Fellowship recipient
> Summed-area tables
  > Use for blurring
  > Efficient creation
> Rendering dynamic reflections with per-pixel glossiness using dual-paraboloid maps and summed-area tables
GDC 2005: Dynamic Glossy Environment Reflections

Summed-Area Tables (SATs)
> Each element $s_{mn}$ of a summed-area table $S$ contains the sum of all elements of the original table/texture $T$ that lie above and to the left of it, including the element itself [Crow84]

$$ s_{mn} = \sum_{i=1}^{m} \sum_{j=1}^{n} t_{ij} $$

        Original               Summed-area table
      1  2  3  4               1   2   3   4
   1  2  3  2  1            1  2   5   7   8
   2  3  0  1  2            2  5   8  11  14
   3  1  3  1  0            3  6  12  16  19
   4  1  4  2  2            4  7  17  23  28

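The definition above can be checked on the CPU. The following is a minimal Python sketch, not the GPU method from this talk: `build_sat` is a hypothetical helper name, and the 4x4 input holds the values of the "Original" table on this slide.

```python
def build_sat(t):
    """Return the summed-area table of a 2D list (sums include the element itself)."""
    height, width = len(t), len(t[0])
    s = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += t[y][x]
            # everything above is already accumulated in the previous SAT row
            s[y][x] = row_sum + (s[y - 1][x] if y > 0 else 0)
    return s


original = [
    [2, 3, 2, 1],
    [3, 0, 1, 2],
    [1, 3, 1, 0],
    [1, 4, 2, 2],
]
sat = build_sat(original)
for row in sat:
    print(row)
```

The printed rows should match the "Summed-area table" grid of the slide; the bottom-right entry is the sum of the whole original table.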
Using a Summed-Area Table
> Summed-area tables allow filtering the original texture with an arbitrary box filter in constant time

[Figure: a box of width w and height h on the summed-area table, with corner samples UL (+), UR (-), LL (-) and LR (+)]

$$ \text{average} = \frac{LR - UR - LL + UL}{w \cdot h} $$

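As a worked example of the four-corner formula, here is a small Python sketch using integer texel indices. The function names are hypothetical, the table values are the 4x4 summed-area table from the previous slide, and reads above or to the left of the table return 0.

```python
def sat_lookup(sat, x, y):
    """Read the SAT; indices left of or above the table return 0."""
    if x < 0 or y < 0:
        return 0
    return sat[y][x]


def box_average(sat, x0, y0, x1, y1):
    """Average of the original texels in the inclusive box [x0..x1] x [y0..y1]."""
    lr = sat_lookup(sat, x1, y1)          # lower right corner of the box
    ur = sat_lookup(sat, x1, y0 - 1)      # just above the box, right column
    ll = sat_lookup(sat, x0 - 1, y1)      # just left of the box, bottom row
    ul = sat_lookup(sat, x0 - 1, y0 - 1)  # diagonal neighbor above and left
    w = x1 - x0 + 1
    h = y1 - y0 + 1
    return (lr - ur - ll + ul) / (w * h)


sat = [
    [2, 5, 7, 8],
    [5, 8, 11, 14],
    [6, 12, 16, 19],
    [7, 17, 23, 28],
]
center = box_average(sat, 1, 1, 2, 2)   # 2x2 box in the middle
whole = box_average(sat, 0, 0, 3, 3)    # entire texture
print(center, whole)
```

For the middle 2x2 box the lookups are 16 - 7 - 6 + 2 = 5, so the average is 5 / 4. The cost is four reads regardless of the box size.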
Filtering Example Code
float4 tex2D_SAT_blur(sampler tSAT, float2 uv, float2 size)
{
    float4 result = tex2D(tSAT, uv + 0.5 * size);            // LR
    result -= tex2D(tSAT, uv + float2(0.5, -0.5) * size);    // UR
    result -= tex2D(tSAT, uv + float2(-0.5, 0.5) * size);    // LL
    result += tex2D(tSAT, uv - 0.5 * size);                  // UL
    result /= size.x * size.y;                               // divide by box area
    return result;
}

Efficient Summed-Area Table Creation
> Summed-area table construction can be decomposed into a horizontal and a vertical phase
> Each phase consists of log2(texture size) passes
> Each pass adds two elements from the previous pass
> Ping-pong between two rendertargets for each pass
> Horizontal phase:

$$ P_i(x, y) = P_{i-1}(x, y) + P_{i-1}(x - 2^{\text{passindex}}, y) $$

Horizontal Phase Example
Pass index | Off texture | Texture row
(input)    | 0 0 0 0     | 1  2     3     4     5     6     7     8
0          | 0 0 0 0     | 1  1..2  2..3  3..4  4..5  5..6  6..7  7..8
1          | 0 0 0 0     | 1  1..2  1..3  1..4  2..5  3..6  4..7  5..8
2          | 0 0 0 0     | 1  1..2  1..3  1..4  1..5  1..6  1..7  1..8
(a..b denotes the sum of input texels a through b)
Sampling off the texture returns 0 so that the sum doesn't get affected.

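The horizontal phase with two samples per pass can be simulated on one texture row. This is a CPU sketch in Python under the assumption that the row width is a power of two; `horizontal_phase` is a hypothetical name, and swapping `src` and `dst` stands in for the ping-pong between two rendertargets.

```python
def horizontal_phase(row):
    """Prefix-sum one row using log2(width) passes of two samples each."""
    width = len(row)
    passes = width.bit_length() - 1   # log2(width) for power-of-two widths
    src = list(row)
    for pass_index in range(passes):
        offset = 2 ** pass_index
        dst = []
        for x in range(width):
            # reading off the left edge of the texture returns 0
            left = src[x - offset] if x - offset >= 0 else 0
            dst.append(src[x] + left)
        src = dst                      # ping-pong: output becomes next input
    return src


result = horizontal_phase([1, 2, 3, 4, 5, 6, 7, 8])
print(result)
```

After the three passes each element holds the sum of the input texels 1 through its own position, which is the last row of the example table. The vertical phase does the same along columns.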
Saving Render Passes
> Adding two samples per pass requires 2*log2(256) = 16 passes for a 256x256 texture
> To reduce the number of passes we can add more samples per pass
> Pass count reduces to $2 \cdot \log_{\#\text{samples}}(\text{texture size})$, rounded up per phase
> Using 16 samples per pass we only need 4 passes for converting a 256x256 texture to a summed-area table
  > Converting 512x512 needs 6 passes, but two of those passes only need to add two texture samples

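The pass counts quoted on this slide follow from a short calculation. The Python sketch below uses hypothetical helper names and integer arithmetic only; it assumes a square texture and the same sample count in both phases.

```python
def passes_per_phase(size, samples_per_pass):
    """Smallest p such that samples_per_pass ** p >= size."""
    passes = 0
    reach = 1   # number of texels already summed into each element
    while reach < size:
        reach *= samples_per_pass
        passes += 1
    return passes


def total_passes(size, samples_per_pass):
    """Horizontal plus vertical phase."""
    return 2 * passes_per_phase(size, samples_per_pass)


def samples_in_last_pass(size, samples_per_pass):
    """Samples the final pass of a phase actually needs to add."""
    p = passes_per_phase(size, samples_per_pass)
    reach_before = samples_per_pass ** (p - 1)
    return -(-size // reach_before)   # ceiling division


print(total_passes(256, 2), total_passes(256, 16), total_passes(512, 16))
print(samples_in_last_pass(512, 16))
```

For 512x512 with 16 samples per pass, two passes per phase already cover 256 texels, so the third pass of each phase only has to add two samples.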
SAT Creation Vertex Shader
float fPassIndex;
float2 vPixelSize;
float4x4 mVP;

struct VsOutput
{
    float4 pos   : POSITION0;
    float4 uv[8] : TEXCOORD0; // 16 UVs stuffed in 8 float4's
};

VsOutput main(float4 inPos : POSITION, float2 inUV : TEXCOORD0)
{
    VsOutput o;
    // transform vertex (assuming app has set up screen-aligned
    // quad drawing)
    o.pos = mul(inPos, mVP);
    […]

SAT Creation Vertex Shader (continued)
    […]
    // distance between neighboring samples in this pass (horizontal phase)
    float passOffset = pow(16.0, fPassIndex) * vPixelSize.x;
    // output first two texcoords
    o.uv[0].xy = inUV;
    o.uv[0].wz = o.uv[0].xy - float2(passOffset, 0);
    // compute remaining 14 texcoords for neighboring pixels
    for (int i = 1; i < 8; i++) {
        o.uv[i].xy = o.uv[0].xy - float2((2.0 * i) * passOffset, 0);
        o.uv[i].wz = o.uv[0].xy - float2((2.0 * i + 1.0) * passOffset, 0);
    }
    return o;
}

SAT Creation Pixel Shader
float4 SATPass(sampler tSrc, float4 uv[8])
{
    float4 t[8];
    // add 16 texture samples with pyramidal scheme
    // to maintain precision
    for (int i = 0; i < 8; i++) {
        t[i] = tex2D(tSrc, uv[i].xy) + tex2D(tSrc, uv[i].wz);
    }
    t[0] += t[1]; t[2] += t[3];
    t[4] += t[5]; t[6] += t[7];
    t[0] += t[2]; t[4] += t[6];
    return t[0] + t[4];
}

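To see how the 16-sample passes combine, the horizontal phase can be replayed on the CPU. This Python sketch is an illustration only: `sat_pass` and `pairwise_sum` are hypothetical names, the pairwise reduction mimics the pyramidal addition order of the pixel shader, and off-texture reads contribute 0.

```python
from itertools import accumulate


def pairwise_sum(values):
    """Add values in a pyramid (pairs, then pairs of pairs, ...)."""
    values = list(values)
    while len(values) > 1:
        if len(values) % 2:
            values.append(0)
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    return values[0]


def sat_pass(src, pass_index, samples_per_pass=16):
    """One horizontal pass: each texel adds samples spaced samples_per_pass**pass_index apart."""
    offset = samples_per_pass ** pass_index
    dst = []
    for x in range(len(src)):
        samples = []
        for k in range(samples_per_pass):
            xs = x - k * offset
            samples.append(src[xs] if xs >= 0 else 0)  # off texture reads 0
        dst.append(pairwise_sum(samples))
    return dst


row = [(7 * x) % 5 for x in range(256)]   # arbitrary test data
after_pass0 = sat_pass(row, 0)
after_pass1 = sat_pass(after_pass0, 1)
expected = list(accumulate(row))
print(after_pass1 == expected)
```

Two passes are enough for a 256-texel row because the first pass gathers 16 neighbors and the second gathers 16 of those partial sums, 16 texels apart.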
Precision Requirements
> For proper reconstruction a summed-area table needs log2(texture width) + log2(texture height) + bit depth of source texture bits of precision
> Use float32 rendertargets to compute and store summed-area tables
> Precision errors are unbiased and average out as you use larger box filter kernels

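The bit-count rule can be evaluated for concrete texture sizes. The helper below is a hypothetical Python sketch; the comparison value of 24 bits is the significand width of an IEEE 754 single-precision float (23 stored bits plus the implicit leading bit).

```python
FLOAT32_SIGNIFICAND_BITS = 24  # IEEE 754 single precision


def sat_bits_required(width, height, source_bits):
    """log2(width) + log2(height) + source bit depth, with logs rounded up."""
    log2_width = (width - 1).bit_length()    # ceil(log2(width)) for width >= 1
    log2_height = (height - 1).bit_length()
    return log2_width + log2_height + source_bits


bits_256 = sat_bits_required(256, 256, 8)
bits_512 = sat_bits_required(512, 512, 8)
print(bits_256, bits_256 <= FLOAT32_SIGNIFICAND_BITS)
print(bits_512, bits_512 <= FLOAT32_SIGNIFICAND_BITS)
```

Under this estimate a 256x256 texture with 8 bits per channel just fits the float32 significand, while 512x512 needs two more bits in the worst case.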
Summed-Area Table Example
Boundary Conditions
> To make sure sampling off the texture does not mess up the results we need to set up the correct texture clamping behavior
> Two possibilities:
  > Clamp to border color with a color of (0, 0, 0, 0)
  > Render a black border around the texture to be converted into SAT and set Clamp to Edge mode

SAT Precision Improvement Trick
> Bias input texture by -0.5 before generating summed-area table
> When reconstructing samples from SAT, undo bias by adding 0.5 to final result
  > This helps because now we take advantage of the sign bit
> Even better: Use average color of input as bias (instead of 0.5)
> Improving precision of summed-area tables is particularly useful when using hardware with limited pixel pipeline precision
[Figure: two renderings, "No precision improvement" and "With precision improvement"]

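The effect of the bias can be illustrated with a small CPU experiment. This Python sketch uses hypothetical helper names and an arbitrary 8x8 checkerboard of values near 0.5; it only shows that the biased table stores much smaller magnitudes and that adding the bias back recovers the same box average.

```python
def build_sat(t):
    """Summed-area table of a 2D list of floats."""
    height, width = len(t), len(t[0])
    s = [[0.0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0.0
        for x in range(width):
            row_sum += t[y][x]
            s[y][x] = row_sum + (s[y - 1][x] if y > 0 else 0.0)
    return s


def box_average(sat, x0, y0, x1, y1):
    """Box average from four SAT reads; off-table reads return 0."""
    def at(x, y):
        return sat[y][x] if x >= 0 and y >= 0 else 0.0
    total = at(x1, y1) - at(x1, y0 - 1) - at(x0 - 1, y1) + at(x0 - 1, y0 - 1)
    return total / ((x1 - x0 + 1) * (y1 - y0 + 1))


image = [[0.4 + 0.2 * ((x + y) % 2) for x in range(8)] for y in range(8)]
bias = sum(sum(row) for row in image) / 64.0       # average color as bias

plain_sat = build_sat(image)
biased_sat = build_sat([[v - bias for v in row] for row in image])

largest_plain = max(abs(v) for row in plain_sat for v in row)
largest_biased = max(abs(v) for row in biased_sat for v in row)

plain_avg = box_average(plain_sat, 2, 2, 5, 5)
biased_avg = box_average(biased_sat, 2, 2, 5, 5) + bias   # undo the bias
print(largest_plain, largest_biased, plain_avg, biased_avg)
```

Because the biased sums stay close to zero, fewer bits are spent on magnitude, which is the motivation given on the slide.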
Dynamic Glossy Reflections Game Plan
> Render dynamic cubemap
> Convert to dual-paraboloid map
> Convert dual-paraboloid map faces to summed-area tables
> Apply SAT DP-map to glossy object
> Sounds like a lot of work, but is actually quite fast on modern hardware
  > Will show real-time demo in a minute

The Return of the Dual-Paraboloid Map!
> Dual-paraboloid map = two textures that store an environment as reflected in parabolic mirrors
[Figure: dual-paraboloid map faces, showing "Color channels" and "Alpha channel"]

DP Sampling Shader
float3 texDP(sampler tFront, sampler tBack, float3 dir)
{
    // convert 3D lookup vector into 2D texture coordinates
    float2 frontUV = float2(0.5, -0.5) * dir.xy / (1.0 - dir.z) + 0.5;
    float2 backUV  = float2(0.5, -0.5) * dir.xy / (1.0 + dir.z) + 0.5;
    // sample DP map faces and blend together
    float4 cFront = tex2D(tFront, frontUV);
    float4 cBack  = tex2D(tBack, backUV);
    return cFront.rgb + cBack.rgb;
}

Cubemap to DP Map Conversion
> Convert uv position on DP map face to 3D vector using these equations (from [Blythe99]):

Front face:
$$ R = \begin{pmatrix} \dfrac{2u}{u^2 + v^2 + 1} \\[1ex] \dfrac{2v}{u^2 + v^2 + 1} \\[1ex] \dfrac{-1 + u^2 + v^2}{u^2 + v^2 + 1} \end{pmatrix} $$

Back face:
$$ R = \begin{pmatrix} \dfrac{-2u}{u^2 + v^2 + 1} \\[1ex] \dfrac{-2v}{u^2 + v^2 + 1} \\[1ex] \dfrac{1 - u^2 - v^2}{u^2 + v^2 + 1} \end{pmatrix} $$

> Perform cubemap lookup and store result in DP face
> Precompute lookup textures or do math on the fly
[Figure: "Front lookup texture" and "Back lookup texture"]

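The two face equations can be evaluated directly, for instance when precomputing the lookup textures. This is a minimal Python sketch with hypothetical function names; it assumes u and v are face coordinates centered on the face, which is how the equations are written, and not the [0, 1] texture coordinates used by the sampling shader.

```python
import math


def front_direction(u, v):
    """3D lookup vector for a position (u, v) on the front face."""
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))


def back_direction(u, v):
    """3D lookup vector for a position (u, v) on the back face."""
    s = u * u + v * v
    return (-2.0 * u / (s + 1.0), -2.0 * v / (s + 1.0), (1.0 - s) / (s + 1.0))


def length(vec):
    return math.sqrt(sum(c * c for c in vec))


fx, fy, fz = front_direction(0.3, -0.4)
bx, by, bz = back_direction(0.3, -0.4)
print(length((fx, fy, fz)), length((bx, by, bz)))
print(fx / (1.0 - fz), fy / (1.0 - fz))   # recovers (u, v) on the front face
```

Both results are unit vectors, so they can be used for the cubemap lookup without normalization. Dividing x and y of the front vector by (1 - z) returns the original (u, v), which matches the division used in the DP sampling shader.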
Why Bother With DP Mapping?
> Summed-area table concept does not map to cubemaps which are in spherical domain
  > Summed-area tables only work on rectangular 2D images
> Filtering with a box filter in dual-paraboloid space causes less distortion

Putting it All Together
SAT DP Sampling Shader
> Very similar to DP sampling, but uses SAT texture lookup function:
float3 texDP_SAT(sampler tFront, sampler tBack, float3 dir, float2 filterSize)
{
    // convert 3D lookup vector into 2D texture coordinates
    float2 frontUV = float2(0.5, -0.5) * dir.xy / (1.0 - dir.z) + 0.5;
    float2 backUV  = float2(0.5, -0.5) * dir.xy / (1.0 + dir.z) + 0.5;
    // sample DP map faces and blend together
    float4 cFront = tex2D_SAT_blur(tFront, frontUV, filterSize);
    float4 cBack  = tex2D_SAT_blur(tBack, backUV, filterSize);
    float4 cRefl = cFront + cBack;
    // normalize result
    return cRefl.rgb / cRefl.a;
}

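The final division by alpha can be read as a coverage normalization. The Python sketch below rests on an assumption that the slides do not state explicitly: each DP face stores alpha 1 for texels inside the paraboloid image and 0 outside, with color zeroed outside. The function name and sample values are hypothetical.

```python
def combine_dp_samples(front, back):
    """Add two box-filtered (r, g, b, a) samples and divide color by coverage."""
    r, g, b, a = (f + k for f, k in zip(front, back))
    if a == 0.0:
        return (0.0, 0.0, 0.0)   # box covered no valid texels on either face
    return (r / a, g / a, b / a)


# Box sums: three valid front texels of color (1, 0.5, 0)
# and one valid back texel of color (0, 0.5, 1).
front_sum = (3.0, 1.5, 0.0, 3.0)
back_sum = (0.0, 0.5, 1.0, 1.0)
color = combine_dp_samples(front_sum, back_sum)
print(color)
```

Under that assumption the result is the average over the valid texels only, and any common scale factor applied to both box samples cancels in the division.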
Demo
Other Possibilities
> Average several box-filtered environment map samples to approximate smoother blur filter kernels
> Approximate Phong BRDF by combining very blurred sample in normal direction and less blurry sample in reflection vector direction for specular lobe

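Why averaging box filters gives a smoother kernel can be seen in one dimension. This Python sketch computes the effective weights when several centered box filters of odd widths are averaged; the function name and the chosen widths are illustrative assumptions, not values from the talk.

```python
from fractions import Fraction


def effective_kernel(widths):
    """Weights of the filter obtained by averaging centered boxes of odd widths."""
    radius = max(widths) // 2
    kernel = [Fraction(0)] * (2 * radius + 1)
    for w in widths:
        half = w // 2
        for d in range(-half, half + 1):
            # each box has weight 1/w per texel and contributes 1/len(widths)
            kernel[radius + d] += Fraction(1, w * len(widths))
    return kernel


kernel = effective_kernel([1, 3, 5])
print([str(k) for k in kernel])
```

The weights fall off in steps from the center outwards, a stepped approximation of a tent-shaped kernel, while each additional box still costs only four SAT reads.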
Direct DP Face Rendering
> Alternative to rendering cubemap, then converting to DP map:
> Transform environment using parabolic projection function and render directly into DP faces
> Unfortunately parabolic projection is non-linear and maps lines to curves
  > Linear rasterization of graphics hardware causes artifacts
  > Might be OK if your geometry is tessellated highly enough
> See [Coombe04] for details

Disadvantages
> Precision requirements for summed-area tables
> float32 textures used for summed-area tables do not currently support bilinear filtering
  > Not so much of an issue when you blur enough
  > Can perform bilinear filtering manually in the shader
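Manual bilinear filtering amounts to four point samples and three linear interpolations. The following Python sketch shows the arithmetic on the CPU; `fetch` and `bilinear` are hypothetical names, coordinates are in texel units with integers at texel centers, reads left of or above the table return 0, and reads past the far edges are clamped.

```python
import math


def fetch(sat, x, y):
    """Point sample of the SAT with the boundary rules described above."""
    if x < 0 or y < 0:
        return 0.0
    x = min(x, len(sat[0]) - 1)
    y = min(y, len(sat) - 1)
    return float(sat[y][x])


def bilinear(sat, x, y):
    """Bilinearly interpolate the SAT at a fractional texel position."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = fetch(sat, x0, y0) * (1.0 - fx) + fetch(sat, x0 + 1, y0) * fx
    bottom = fetch(sat, x0, y0 + 1) * (1.0 - fx) + fetch(sat, x0 + 1, y0 + 1) * fx
    return top * (1.0 - fy) + bottom * fy


sat = [
    [2, 5, 7, 8],
    [5, 8, 11, 14],
    [6, 12, 16, 19],
    [7, 17, 23, 28],
]
print(bilinear(sat, 0.5, 0.5), bilinear(sat, 1, 1))
```

In a shader the same computation would replace each of the four corner lookups of the box filter, so the cost rises from four to sixteen point samples per filtered result.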

Conclusion
> Using summed-area tables for blurring
> Efficient summed-area table generation scheme
> Converting cubemap into dual-paraboloid maps
> Using summed-area tables and dual-paraboloid mapping together to achieve dynamic glossy environment reflections

Acknowledgements
> Justin Hensley
> Thanks to Eli Turner for the demo artwork

References
> [Crow84] Crow, F. C., Summed-area tables for texture mapping. Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques, ACM Press, pp. 207-212, 1984.
> [Coombe04] Coombe, G., Harris, M. and Lastra, A., Radiosity on Graphics Hardware. Graphics Interface, 2004.
> [Blythe99] Blythe, D., Advanced Graphics Programming Techniques Using OpenGL. SIGGRAPH 1999 course notes. http://www.opengl.org/resources/tutorials/sig99/advanced99/notes/node184.html