Graphics Pipeline (GLSL)



Date: 17.10.2017




  • Graphics Pipeline (GLSL)

  • GPGPU (GLSL)

    • Briefly
  • GPU Computing (CUDA, OpenCL)

  • Choose your own adventure

    • Student Presentation
    • Final Project
  • Goal: Prepare you for your presentation and project




  • A historical perspective on the graphics pipeline

    • Dimensions of innovation.
    • Where we are today
    • Fixed-function vs programmable pipelines
  • A closer look at the fixed-function pipeline

  • We can program the fixed-function pipeline!

    • Some examples
  • What constitutes data and memory, and how access affects program design.








  • High fragment load / low vertex load











Simultaneous rendering to multiple buffers








  • Not exactly a quantum leap, but…

  • Simultaneous rendering to multiple buffers

  • True conditionals and loops

  • Higher-precision throughput in the pipeline (64 bits end-to-end, compared to 32 bits earlier)

  • PCIe bus

  • More memory/program length/texture accesses

  • Texture access by vertex shader




  • Complete quantum leap

  • Ground-up rewrite of GPU

  • Support for DirectX 10, and all it implies (more on this later)

  • Geometry Shader

  • Support for General GPU programming

  • Shared Memory (NVIDIA only)
































  • Not covered today:

    • SM 5 / D3D 11 / GL 4
    • Tessellation shaders
      • *cough* student presentation *cough*
    • Later this semester: NVIDIA Fermi
      • Dual warp scheduler
      • Configurable L1 / shared memory
      • Double precision



  • Released 01/04/2011

  • http://support.amd.com/us/kbarticles/Pages/AMDSystemMonitor.aspx








  • Vertices mapped from object space to world space

  • M = model transformation (scene)

  • V = view transformation (camera)
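
As a minimal sketch of the modelview stage, the snippet below applies M and then V to one vertex using plain 4x4 matrices. The translation values, the camera placement, and the helper names (`mat_mul`, `mat_vec`, `translation`) are illustrative assumptions, not part of the original slides.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    """Apply a 4x4 matrix to a homogeneous column vector (x, y, z, w)."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    """Build a 4x4 translation matrix."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

# Hypothetical scene: the model transformation M places the object
# 2 units along +x in world space.
M = translation(2.0, 0.0, 0.0)

# Hypothetical camera at world position (0, 0, 5) looking down -z.
# The view transformation V is the inverse of the camera placement.
V = translation(0.0, 0.0, -5.0)

# The combined modelview matrix applies M first, then V.
MV = mat_mul(V, M)

vertex_object = [1.0, 1.0, 0.0, 1.0]
vertex_world = mat_vec(M, vertex_object)   # object space -> world space
vertex_eye = mat_vec(MV, vertex_object)    # object space -> camera space
```

Here `vertex_world` is the vertex after M alone, and `vertex_eye` is the same vertex after both M and V.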




  • Lighting information is combined with normals and other parameters at each vertex in order to create new colors.



  • More matrix transformations operate on a vertex to transform it into viewport space.

  • Note that a vertex may be eliminated from the input stream (if it is clipped).

  • The viewport is two-dimensional; however, the vertex z-value is retained for depth testing.
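
The following sketch illustrates these three points for a single vertex that is already in clip space. It assumes OpenGL-style conventions (clip volume -w <= x, y, z <= w, default depth range [0, 1]). The function name `to_viewport` is hypothetical. Real hardware clips whole primitives and may generate new vertices; this per-vertex rejection is a deliberate simplification.

```python
def to_viewport(clip, width, height):
    """Map a clip-space vertex (x, y, z, w) to window coordinates.

    Returns None when the vertex lies outside the clip volume,
    otherwise (window x, window y, depth in [0, 1]).
    """
    x, y, z, w = clip
    # Simplified clipping: drop the vertex if it is outside the view volume.
    if w <= 0 or not all(-w <= c <= w for c in (x, y, z)):
        return None
    # Perspective divide gives normalized device coordinates in [-1, 1].
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    # Viewport transformation: x and y become 2D window positions...
    win_x = (ndc_x + 1.0) * 0.5 * width
    win_y = (ndc_y + 1.0) * 0.5 * height
    # ...while z is kept (remapped to [0, 1]) for the later depth test.
    depth = (ndc_z + 1.0) * 0.5
    return (win_x, win_y, depth)

center = to_viewport((0.0, 0.0, 0.0, 1.0), 640, 480)
clipped = to_viewport((2.0, 0.0, 0.0, 1.0), 640, 480)
```

The second call returns `None` because x exceeds w, which models a vertex being removed from the stream.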



  • All primitives are now converted to fragments.

  • Data type change: vertices in, fragments out.




  • The rasterizer produces a stream of fragments.

  • Each fragment undergoes a series of tests with increasing complexity.



  • Stencil test: S(x, y) is the stencil buffer value for the fragment with coordinates (x, y).

  • If f(S(x, y)) holds, let the fragment pass; otherwise kill it. Update S(x, y) conditionally, depending on f(S(x, y)) and g(D(x, y)).

  • Depth test: D(x, y) is the depth buffer value.

  • If g(D(x, y)) holds, let the fragment pass; otherwise kill it. Update D(x, y) conditionally.
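
A small CPU simulation may make the two tests concrete. It is a sketch under stated assumptions: buffers are nested lists, f defaults to "always pass", g compares the incoming depth against D(x, y) with "less than", and the stencil update is an increment applied only when both tests pass. The name `process_fragment` is hypothetical; real pipelines offer separate update rules for each pass/fail combination.

```python
import operator

STENCIL_MAX = 255  # 8-bit stencil buffer

def process_fragment(frag, stencil, depth, color,
                     f=lambda s: True, g=operator.lt):
    """Run one fragment (x, y, z, rgb) through the stencil and depth tests.

    Returns True if the fragment survived and was written.
    """
    x, y, z, rgb = frag
    if not f(stencil[y][x]):       # stencil test on S(x, y)
        return False
    if not g(z, depth[y][x]):      # depth test against D(x, y)
        return False
    # Both tests passed: update D, count in S, write the color.
    depth[y][x] = z
    stencil[y][x] = min(stencil[y][x] + 1, STENCIL_MAX)
    color[y][x] = rgb
    return True

# A 2x1 "screen": depth cleared to the far value 1.0, stencil to 0.
stencil = [[0, 0]]
depth = [[1.0, 1.0]]
color = [[None, None]]

fragments = [
    (0, 0, 0.5, "red"),
    (0, 0, 0.7, "green"),   # behind the red fragment, so it is killed
    (0, 0, 0.2, "blue"),    # in front, so it replaces red
    (1, 0, 0.9, "red"),
]
survived = [process_fragment(fr, stencil, depth, color) for fr in fragments]
```

After the loop, the stencil buffer holds the number of fragments that were written at each pixel, which is the "count" behaviour in miniature.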



  • Stencil and depth tests are more general conditionals. Why?

  • These are the only tests that can change the state of internal storage (stencil buffer, depth buffer).

  • One of the update operations for the stencil buffer is a “count” operation. Remember this!

  • Unfortunately, stencil and depth buffers have lower precision (8 and 24 bits, respectively).



  • Blending: pixels are accumulated into final framebuffer storage.

  • new-val = old-val op pixel-value

  • If op is +, we can sum (say) the red components of all pixels that pass all tests.

  • Problem: in generation IV and earlier, blending can only be done in 8-bit channels (the channels sent to the video card), so precision is limited.
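
The precision problem can be illustrated with a tiny simulation of additive blending in one 8-bit channel. The helper name `blend_add_8bit` and the sample values are assumptions made for illustration; the point is only that the accumulated value saturates at 255 instead of reaching the true sum.

```python
CHANNEL_MAX = 255  # largest value an 8-bit channel can hold

def blend_add_8bit(old_val, pixel_value):
    """new-val = old-val + pixel-value, clamped to the 8-bit range."""
    return min(old_val + pixel_value, CHANNEL_MAX)

red_components = [100, 90, 80]   # hypothetical incoming red values

accumulated = 0                  # framebuffer channel cleared to 0
for value in red_components:
    accumulated = blend_add_8bit(accumulated, value)

true_sum = sum(red_components)
lost = true_sum - accumulated    # amount discarded by saturation
```

With these inputs the channel saturates on the third fragment, so the framebuffer no longer holds the exact sum.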




  • Color Buffers

    • Front-left
    • Front-right
    • Back-left
    • Back-right
  • Depth Buffer (z-buffer)

  • Stencil Buffer

  • Accumulation Buffer



  • Scissor Test: if the fragment lies inside the scissor rectangle, keep it; otherwise discard it.

  • Alpha Test: compare the fragment’s alpha value against a reference value.

  • Stencil Test: compare the stencil buffer value at the fragment’s location against a reference value.

  • Depth Test: compare the fragment’s depth to the depth value already present in the depth buffer, using one of:

    • Never
    • Always
    • Less
    • Less-Equal
    • Equal
    • Greater-Equal
    • Greater
    • Not-Equal
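
The sequence of tests can be sketched as a chain of early exits. In this sketch the comparison names follow the list above, with an Equal entry included because OpenGL's `glDepthFunc` also accepts `GL_EQUAL`. The function name `fragment_survives`, the fixed "Greater" comparison for the alpha test, and the omission of the stencil test are simplifying assumptions.

```python
import operator

# Comparison functions: each takes (incoming value, stored/reference value).
COMPARE = {
    "Never": lambda a, b: False,
    "Always": lambda a, b: True,
    "Less": operator.lt,
    "Less-Equal": operator.le,
    "Equal": operator.eq,
    "Greater-Equal": operator.ge,
    "Greater": operator.gt,
    "Not-Equal": operator.ne,
}

def fragment_survives(frag, scissor, alpha_ref, stored_depth,
                      depth_func="Less"):
    """Apply scissor, alpha, and depth tests to a fragment (x, y, z, alpha)."""
    x, y, z, alpha = frag
    left, bottom, width, height = scissor
    # Scissor test: keep only fragments inside the rectangle.
    if not (left <= x < left + width and bottom <= y < bottom + height):
        return False
    # Alpha test (assumed here: pass when alpha > reference value).
    if not COMPARE["Greater"](alpha, alpha_ref):
        return False
    # Depth test with the selected comparison function.
    return COMPARE[depth_func](z, stored_depth)

scissor_rect = (0, 0, 4, 4)
inside_and_near = fragment_survives((1, 1, 0.3, 0.9), scissor_rect, 0.5, 0.6)
outside_rect = fragment_survives((5, 1, 0.3, 0.9), scissor_rect, 0.5, 0.6)
```

Changing `depth_func` to "Greater" inverts which fragments win, which is how the same hardware can compute a maximum instead of a minimum.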


  • What is the output of a “computation”?

  • Display on screen.

  • Render to a buffer and retrieve the values (readback).

  • Readbacks are VERY slow!







  • You are given n sites (p1, p2, p3, …, pn) in the plane (think of each site as having a color).

  • Any point p in the plane is closest to some site pj. Color p with color j.

  • Compute this colored map on the plane; in other words, compute the nearest-neighbour diagram of the sites.







  • In order to compute the lower envelope, we need to determine, at each pixel, the fragment having the smallest depth value.

  • This can be done with a simple depth test.

    • Allow a fragment to pass only if its depth is smaller than the current depth buffer value, and update the buffer accordingly.
  • The fragment that survives has the correct color.
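
The depth-test idea can be simulated on the CPU. The sketch below assumes the cone construction these slides refer to, in which each site is rendered as a cone whose depth at a pixel equals the distance from that pixel to the site. The function name, the grid size, and the site coordinates are illustrative assumptions.

```python
import math

def voronoi_by_depth_test(sites, width, height):
    """Color each pixel with the index of its nearest site.

    Simulates rendering one distance cone per site with a "Less" depth test.
    """
    depth = [[math.inf] * width for _ in range(height)]   # cleared to far
    color = [[None] * width for _ in range(height)]
    for site_id, (sx, sy) in enumerate(sites):
        # "Rasterize" the cone of this site: one fragment per pixel.
        for y in range(height):
            for x in range(width):
                z = math.hypot(x - sx, y - sy)   # cone depth = distance
                if z < depth[y][x]:              # depth test: Less
                    depth[y][x] = z              # update depth buffer
                    color[y][x] = site_id        # surviving fragment's color
    return color, depth

sites = [(1, 1), (6, 1), (3, 5)]   # hypothetical site positions
color, depth = voronoi_by_depth_test(sites, 8, 8)
```

After all cones are drawn, `color` holds the nearest-neighbour diagram and `depth` holds the lower envelope of the distance functions.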



  • The 1-median of a set of sites is a point q* that minimizes the sum of distances from all sites to itself.

  • q* = arg min_q Σ_p d(p, q), where the sum ranges over all sites p.



  • Can we compute, for each pixel q, the value F(q) = Σ_p d(p, q), summed over all sites p?

  • We can use the cone trick from before and, instead of computing the minimum depth value, compute the sum of all depth values using blending.

  • What’s the catch?



  • Using texture interpolation helps here.

  • Instead of drawing a single cone, we draw a shaded cone, with an appropriately constructed texture map.

  • Then a fragment with depth z has a color component equal to z.

  • Now we can blend the colors.

  • OpenGL has an aggregation operator that returns the overall minimum.

  • Warning: we are ignoring issues of precision.
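
The blending approach can be sketched in the same style: each site contributes a shaded cone whose color at a pixel equals its distance, additive blending accumulates F(q), and a final minimum over all pixels picks q*. Like the slide, the sketch ignores precision by accumulating in floating point rather than in 8-bit channels. The function name and the site coordinates are illustrative assumptions.

```python
import math

def one_median_by_blending(sites, width, height):
    """Return the pixel minimizing the sum of distances to all sites."""
    accum = [[0.0] * width for _ in range(height)]   # color buffer cleared to 0
    for sx, sy in sites:
        # Draw the shaded cone for this site with additive blending:
        # new-val = old-val + (color encoding the distance).
        for y in range(height):
            for x in range(width):
                accum[y][x] += math.hypot(x - sx, y - sy)
    # Aggregation step: find the pixel holding the overall minimum.
    best_value, best_pixel = min(
        (accum[y][x], (x, y)) for y in range(height) for x in range(width)
    )
    return best_pixel, best_value, accum

# Hypothetical collinear sites, so the answer is easy to check by hand:
# the sum of distances is smallest at the middle site.
sites = [(1, 2), (3, 2), (7, 2)]
q_star, f_min, accum = one_median_by_blending(sites, 9, 5)
```

On real fixed-function hardware each distance would be quantized to the channel precision before blending, so the result would only approximate this one.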






  • Stream data (data associated with vertices and fragments)

    • Color/position/texture coordinates.
    • Functionally similar to member variables in a C++ object.
    • Can be used for limited message passing: I modify an object state and send it to you.



  • Memory “connectivity” in the graphics use of a GPU is tricky.

  • In a traditional C program, all global variables can be written by all routines.

  • In the fixed-function pipeline, certain data is private.

    • A fragment cannot change a depth or stencil value of a location different from its own.
    • The framebuffer can be copied to a texture; a depth buffer cannot be copied in this way, and neither can a stencil buffer.
    • Only a stencil buffer can count (efficiently)
  • In the fixed-function pipeline, depth and stencil buffers can be used in a multi-pass computation only via readbacks.

  • A texture cannot be written directly.

  • In programmable GPUs, the memory connectivity becomes more open, but there are still constraints.

  • Understanding access constraints and memory “connectivity” is a key step in programming the GPU.



  • The most important question to ask when programming the GPU is:

  • What can I do in one pass?

  • Limitations on memory connectivity mean that a step in a computation may often have to be deferred to a new pass.

  • For example, when computing the second smallest element, we could not store the current minimum in read/write memory.

  • Thus, the “communication” of this value has to happen across a pass.
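
The second-smallest example can be sketched as two passes over the same fragments. The function `min_pass` and the sample depths are hypothetical, and how the first-pass result reaches the second pass (a readback or a copy into read-only storage) is left abstract. Each pass only ever reads values fixed before it started, which is the constraint the slide describes.

```python
import math

def min_pass(fragment_depths, reject_at_or_below=-math.inf):
    """One rendering pass with a "Less" depth test at a single pixel.

    reject_at_or_below is a read-only input fixed before the pass starts;
    fragments at or below it are discarded before the depth test.
    """
    depth = math.inf                      # depth buffer cleared to far
    for z in fragment_depths:
        if z <= reject_at_or_below:       # discard using the earlier result
            continue
        if z < depth:                     # depth test: Less
            depth = z
    return depth

values = [0.7, 0.2, 0.9, 0.4]             # hypothetical fragment depths

# Pass 1: the ordinary depth test yields the minimum.
smallest = min_pass(values)

# Pass 2: the minimum is communicated across the pass boundary and used
# to reject fragments, so the depth test now finds the next value up.
second_smallest = min_pass(values, reject_at_or_below=smallest)
```

Note that every copy of the minimum is rejected in the second pass, so this yields the second-smallest distinct value.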




