OGLPlus tutorial:Deferred Renderer

This time we will make something more challenging with OGLPlus as our OpenGL API. Well, not really challenging, but still not as simple as drawing a rectangle: we will build a simple deferred renderer.

First, for the newcomers, a brief explanation of what a deferred renderer actually is. Imagine you decide to render a scene with, say, 100 light sources using the typical approach (called forward rendering). Many years ago, when there was only the fixed pipeline, you would end up with 8 lights at most; that was the limit the OpenGL API exposed. Then came the programmable pipeline with shaders, and it became possible to render as many lights as your shader model's instruction limit supported, and of course as many as your hardware was capable of processing at an acceptable frame rate. With SM4 and SM5 you can really get away with a huge number of lights. I can't say exactly how many, but there is certainly enough room for 100. Even with this advance, we are still stuck with the second problem: performance. With forward rendering you process the lights in a loop per object draw call. This way, if you have 100 primitives to draw with 100 lights in the scene, you loop over the lights 100 times for each rendered object, probably per fragment in the fragment shader if you care about quality. That is an enormous overhead, and after just 20-30 lights you will start noticing the performance drop.

A deferred renderer solves this issue by breaking the render loop into 2 major passes. 1) The geometry is drawn into a custom frame buffer (sometimes called a G-Buffer). In that pass, geometry info like positions, normals, texture coordinates and tangents is stored into texture render targets, as this data will be used in the second pass. 2) The second pass, executed in screen space, uses the textures from the previous one as inputs. In this pass the geometry info of the whole scene is extracted from the textures and used by the lighting algorithms to shade the pixels. This way we compute the lights for the whole scene just once per render loop, saving many precious GPU cycles.

There are some drawbacks to deferred rendering, such as MSAA, transparency and more. Most of these are solvable with more sophisticated algorithms (light pre-pass rendering is one of those). Here, just for the sake of a proof of concept, I used the Deferred Renderer demo from the OpenGL SuperBible 6th edition as my reference. I picked it because it showcases a fresh and compact approach which is possible with the GL 4.2 API. For example, traditionally a G-Buffer would use at least 3 color attachments to meet the needs of all the geometry data. But GLSL 4.20 introduced numeric packing/unpacking functions which allow us to "squeeze" several numbers into one. This trick can save us additional texture attachments. Later you'll see how we pack position, color, normals and UVs.
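To make the cost difference concrete, here is the rough shape of the two approaches in pseudocode (just a sketch of the idea, not code from this demo):

	// Forward rendering: the lighting loop runs per object, per fragment.
	// Shading work is roughly O(objects * lights).
	for (each object in the scene) {
		for (each fragment of the object) {
			for (each light in the scene)      // 100 lights => 100 iterations per fragment
				color += Shade(fragment, light);
		}
	}

	// Deferred rendering: rasterize the geometry once into the G-Buffer,
	// then light each screen pixel once. Shading work is roughly
	// O(objects) + O(pixels * lights), so the objects and lights no
	// longer multiply against each other.
	for (each object in the scene)
		RasterizeIntoGBuffer(object);          // pass 1: store position/normal/albedo
	for (each pixel on the screen) {
		surface = ReadGBuffer(pixel);          // pass 2: reconstruct the geometry info
		for (each light in the scene)
			color += Shade(surface, light);
	}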

I am not going to explain it line by line; you can read it in greater detail in the book. I will comment on the critical parts only.

Application setup:

We begin with the application setup. There are a couple of dependencies you should take care of: OpenGL context creation and image loading. For the first one, see my previous posts where I explain how to configure GLFW (I used it in this demo as well). Now for the second, image loading. OGLPlus contains methods for PNG image loading, but it expects the user to have libpng linked. So you should have libpng and zlib on your machine, configured the same way you would configure the context creation library. To build this demo you must configure libpng, as I use it for texture loading. And of course you will need GLEW as well.


#include <iostream>
#include <GL/glew.h>
#include "GL/glfw.h"
#include <oglplus/all.hpp>
#include "DeferredRenderer.h"

using namespace oglplus;

int main(int argc, char* argv[])
{

	/// init window
	if(!glfwInit()){
		return -1; // GLFW failed to initialize
	}
	glfwOpenWindowHint(GLFW_OPENGL_VERSION_MAJOR, 4);
	glfwOpenWindowHint(GLFW_OPENGL_VERSION_MINOR, 2);
	glfwOpenWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_COMPAT_PROFILE);
	glfwOpenWindowHint(GLFW_FSAA_SAMPLES,0);
	if(!glfwOpenWindow(764,468,8,8,8,8,24,8,GLFW_WINDOW)){
		glfwTerminate();
		return -1; // failed to open a GL 4.2 window
	}

	glfwSetWindowTitle("Deferred Renderer" );
	glfwSetWindowPos(900, 300);
	glfwSwapInterval(1);

	/// init context
	if(glewInit() != GLEW_OK)
	{
		return -1; // GLEW failed to initialize
	}
	glGetError(); // clear any spurious error raised by glewInit

	DeferredRenderer *deferredTest = new DeferredRenderer(764,468);

	Context gl;

	try{

		while(true){

			deferredTest->Render();

			if(glfwGetKey(GLFW_KEY_ESC) || !glfwGetWindowParam(GLFW_OPENED)){

				break;

			}
			glfwSwapBuffers();

		}

	}catch(oglplus::Error& err)
	{
		std::cerr <<
			"Error (in " << err.GLSymbol() << ", " <<
			err.ClassName() << ": '" <<
			err.ObjectDescription() << "'): " <<
			err.what() <<
			" [" << err.File() << ":" << err.Line() << "] ";
		std::cerr << std::endl;
		err.Cleanup();
	}

	delete deferredTest;

	glfwTerminate();
	return 0;
}

This is the main entry point of the application. All it does is create the OpenGL context, initialize GLEW and spawn the DeferredRenderer object, which is then called in the render loop.

Deferred Renderer:

The DeferredRenderer class makes use of two utilities I wrote for the sake of convenience:

ShadersInline.h – contains all the shaders as strings.
OGLPlane.h – a class wrapping plane geometry with a simple interface for rendering and transforming it. We will use it for drawing our scene geometry as well as for the full screen quad. A hypothetical sketch of its interface follows below.
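The actual OGLPlane.h ships with the downloadable sources; as a rough idea of what the demo relies on, here is a hypothetical sketch of its interface (the method names are taken from the calls used below, everything else is guessed, so treat it as illustration only):

	// Hypothetical sketch of OGLPlane's interface; see the real header in the sources.
	#include <oglplus/all.hpp>

	namespace oglplus{
		class OGLPlane{
		public:
			// width/height in world units; the program is required because
			// the OGLPlus Uniform constructors need a program reference
			OGLPlane(float width, float height, Program& prog);
			void Init();                               // create the VAO/VBOs
			void Rotate(float x, float y, float z);    // set rotation in degrees
			void Translate(float x, float y, float z); // set translation
			void Draw();                               // bind the VAO and draw
		};
	}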

DeferredRenderer.h


#pragma once
#ifndef SAS_DEFERRED_RENDERER_H
#define SAS_DEFERRED_RENDERER_H

#include <oglplus/gl.hpp>
#include <oglplus/all.hpp>
#include <oglplus/bound/texture.hpp>
#include <oglplus/bound/framebuffer.hpp>

#include "OGLPlane.h"
namespace oglplus{
	class DeferredRenderer{

	public:

		DeferredRenderer(int width , int height);

		void Render();
		~DeferredRenderer(void);

	private:

		inline float RandomFloat(float min, float max)
		{
			float r = (float)rand() / (float)RAND_MAX;
			return min + r * (max - min);
		}

		Context gl;
		AutoBind<Texture> _gfboTex0;
		AutoBind<Texture> _gfboTex1;
		AutoBind<Texture> _gfboTexDepth;
		AutoBind<Texture> _floorTex;

		AutoBind<Framebuffer> _gfbo2;

		////////Shapes   ////////////

		//Floor rendering plane :

	   OGLPlane *_floorPlane;
	   OGLPlane *_screenQuad;

		/////////  Buffers   ////////////
	   Buffer _lightUBO;

		////////// Shaders  ////////////////
		VertexShader   _geomPassVertShader;
		FragmentShader _geomPassFragShader;

		VertexShader   _resolvePassVertShader;
		FragmentShader _resolvePassFragShader;

		Program _geomProg,_resolveProg;

		////////////   Math  /////////////////

		LazyUniform<Mat4f> _projection_matrixUniform, _camera_matrixUniform;

		GLint _viewportW;
		GLint _viewportH;

#pragma pack (push, 1)
		struct light_t
		{
			Vec3f         position;
			unsigned int        : 32;       // pad0
			Vec3f         color;
			unsigned int        : 32;       // pad1
		};
#pragma pack (pop)

	};
}

#endif

The DeferredRenderer interface is pretty simple. We declare the framebuffer and its attachments, the vertex/fragment shaders and their respective programs. We also declare pointers to two OGLPlane objects which will be instantiated dynamically in the class body. At the bottom, the struct light_t is used to pass data for multiple lights into a uniform buffer object (UBO). Note the padding: it is needed because the buffer uses the std140 layout in GLSL, which enforces alignment rules for different data types; a vec3 inside an array element is aligned to 16 bytes, hence the 4 padding bytes after each Vec3f (see OpenGL SuperBible 6 for the details). A quick compile-time check is sketched below.
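As a quick sanity check of that layout (a sketch, assuming oglplus::Vec3f is three tightly packed floats), you can assert the struct size at compile time right after the struct definition; under std140 each array element is padded to a 16-byte boundary, so light_t must come out at exactly 32 bytes:

	// std140: a vec3 consumes 12 bytes but aligns to 16, so each light is
	// 16 (position + pad0) + 16 (color + pad1) = 32 bytes in total (C++11).
	static_assert(sizeof(light_t) == 32,
		"light_t must match the std140 layout of the GLSL light block");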

DeferredRenderer.cpp

Now let's go step by step through DeferredRenderer.cpp. I will try to explain all the major parts.

First comes the constructor with its initializer list:


using namespace oglplus;
DeferredRenderer::DeferredRenderer(int width , int height)
	:_viewportW(width),_viewportH(height),

	_gfboTex0(Texture::Target::_2D,0),
	_gfboTex1(Texture::Target::_2D,0),
	_floorTex(Texture::Target::_2D,0),
	_gfboTexDepth(Texture::Target::_2D,0),
	_gfbo2(Framebuffer::Target::Draw),
	_projection_matrixUniform(_geomProg,"ProjectionMatrix"),
	_camera_matrixUniform(_geomProg,"CameraMatrix")

{ ...

The constructor accepts the viewport width and height as parameters and initializes the textures, the G-Buffer and the matrix uniforms in its initializer list.

Next we set up our G-Buffer:


	_gfbo2.Bind();
	// Tex 0:
	//_gfboTex0.Image2D(0,PixelDataInternalFormat::RGBA32UI , _viewportW ,_viewportH , 0 , PixelDataFormat::RGBAInteger,PixelDataType::UnsignedInt,nullptr);
	_gfboTex0.Storage2D(1,PixelDataInternalFormat::RGBA32UI , _viewportW ,_viewportH  );
	GLuint tid = Expose(_gfboTex0).Name();
	assert(tid);
	_gfboTex0.MinFilter(TextureMinFilter::Nearest);
	_gfboTex0.MagFilter(TextureMagFilter::Nearest);
	_gfboTex0.WrapS(TextureWrap::Repeat);
	_gfboTex0.WrapT(TextureWrap::Repeat);

	//Tex 1:
	//	_gfboTex1.Image2D(0,PixelDataInternalFormat::RGBA32F , _viewportW ,_viewportH , 0 , PixelDataFormat::RGBA,PixelDataType::Float,nullptr);
	_gfboTex1.Storage2D(1,PixelDataInternalFormat::RGBA32F , _viewportW ,_viewportH );

	assert( Expose(_gfboTex1).Name());
	_gfboTex1.MinFilter(TextureMinFilter::Nearest);
	_gfboTex1.MagFilter(TextureMagFilter::Nearest);
	_gfboTex1.WrapS(TextureWrap::Repeat);
	_gfboTex1.WrapT(TextureWrap::Repeat);

	//Depth:

	//	_gfboTexDepth.Image2D(0,PixelDataInternalFormat::DepthComponent32F , _viewportW ,_viewportH , 0 , PixelDataFormat::DepthComponent,PixelDataType::Float,nullptr);
	_gfboTexDepth.Storage2D(1,PixelDataInternalFormat::DepthComponent32F ,  _viewportW ,_viewportH  );
	assert( Expose(_gfboTexDepth).Name());
	_gfboTexDepth.MinFilter(TextureMinFilter::Nearest);
	_gfboTexDepth.MagFilter(TextureMagFilter::Nearest);

	/// Init GBUFFER:

	_gfbo2.AttachTexture(FramebufferAttachment::Color,_gfboTex0,0);
	_gfbo2.AttachTexture(FramebufferAttachment::Color1,_gfboTex1,0);
	_gfbo2.AttachTexture(FramebufferAttachment::Depth,_gfboTexDepth,0);

	assert(_gfbo2.IsComplete());
	_gfbo2.Unbind(Framebuffer::Target::Draw);

The G-Buffer framebuffer has 3 texture attachments: 2 color attachments and one for the depth buffer. Pay attention to the internal formats. The first attachment uses RGBA32UI, which means each component is a 32-bit (4-byte) unsigned integer. This format is what allows the packing: we will pack color (albedo), normals and a material_id (not used in this demo, as I render only a single mesh) into a single RGBA32UI texel by compressing two 16-bit numbers into one 32-bit integer for each property, except material_id which occupies a whole channel.
See the gbuffer_pass_frag shader in ShadersInline.h for how it is done; a simplified sketch follows below. Another detail is the usage of Storage2D. I intentionally left the old-school Image2D calls commented out to show the two ways of initializing a texture. Storage2D wraps glTexStorage2D, which is much shorter than glTexImage2D, but be warned that glTexStorage makes the texture's format and dimensions immutable. That means if you plan to resize the texture at runtime, don't use glTexStorage.
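To illustrate the packing trick, here is a simplified fragment shader in the style of ShadersInline.h (a hedged sketch of the idea only; the real gbuffer_pass_frag in the sources differs). It uses the GLSL 4.20 built-in packHalf2x16, which squeezes two 16-bit half floats into one 32-bit uint:

	// Sketch only, not the demo's actual shader string:
	static const char* gbuffer_pack_sketch =
		"#version 420 core                                            \n"
		"layout(location = 0) out uvec4 out_data0; // RGBA32UI target \n"
		"layout(location = 1) out vec4  out_data1; // RGBA32F target  \n"
		"in vec3 ws_normal;                                           \n"
		"in vec3 ws_position;                                         \n"
		"in vec2 uv;                                                  \n"
		"uniform sampler2D diffuseTex;                                \n"
		"void main(){                                                 \n"
		"    vec3 albedo = texture(diffuseTex, uv).rgb;               \n"
		"    // two 16-bit halves packed into each 32-bit channel:    \n"
		"    out_data0.x = packHalf2x16(albedo.rg);                   \n"
		"    out_data0.y = packHalf2x16(vec2(albedo.b, ws_normal.x)); \n"
		"    out_data0.z = packHalf2x16(ws_normal.yz);                \n"
		"    out_data0.w = 0u; // material_id keeps a whole channel   \n"
		"    out_data1 = vec4(ws_position, 0.0);                      \n"
		"}                                                            \n";

On the resolve side, unpackHalf2x16 reverses the operation to reconstruct the values.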

Next we set up the lights UBO. It will contain the info for each light in the scene, wrapped in the light_t struct:


  
	_lightUBO.Bind(Buffer::Target::Uniform);
	// allocate storage for NUM_LIGHTS light_t structs (no initial data)
	Buffer::Data(Buffer::Target::Uniform, NUM_LIGHTS, (light_t*)0, BufferUsage::DynamicDraw);

	// map for writing; note the size is in bytes, hence sizeof(light_t)
	BufferRawMap buffMap(Buffer::Target::Uniform, 0, NUM_LIGHTS * sizeof(light_t),
		BufferMapAccess::Write|BufferMapAccess::InvalidateBuffer);


	light_t * lights = reinterpret_cast<light_t *>(buffMap.RawData());
	assert(lights);
	for (int i = 0; i < NUM_LIGHTS; i++)
	{
		float i_f = ((float)i - 7.5f) * 0.1f + 0.3f;
		float rX = RandomFloat(-250.0f,250.0f);  
		float rY = RandomFloat(-250.0f,250.0f); 
		
		lights[i].position =Vec3f(rX,rY,-750.0f);
		
		lights[i].color =
			Vec3f(cosf(i_f * 14.0f) * 0.5f + 0.8f,
			sinf(i_f * 17.0f) * 0.5f + 0.8f,
			sinf(i_f * 13.0f) * cosf(i_f * 19.0f) * 0.5f + 0.8f);


	}
	buffMap.Unmap();
	_lightUBO.Unbind(Buffer::Target::Uniform);



Essentially, a UBO allows us to pass an array into GLSL. The same can be achieved with one-dimensional textures; I have never benchmarked one against the other, but I find a UBO easier to use and update. Here we allocate the buffer first, then map it to a pointer in order to access its contents. In the for loop we fill a light_t struct for each light with its position and color. All this data will be accessed in the second pass during the Phong shading computation.
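On the GLSL side the resolve shader sees this buffer as a std140 uniform block. Here is a sketch of what such a declaration looks like (the real one lives in light_pass_frag in ShadersInline.h and may differ; the array size is a hypothetical stand-in for NUM_LIGHTS, which must be a literal or macro in the shader):

	// Sketch of the matching GLSL declaration; std140 pads each vec3 to
	// 16 bytes, mirroring the explicit padding in the C++ light_t struct.
	static const char* light_block_sketch =
		"#version 420 core                                 \n"
		"struct light_t {                                  \n"
		"    vec3 position;                                \n"
		"    vec3 color;                                   \n"
		"};                                                \n"
		"layout(std140, binding = 0) uniform light_block { \n"
		"    light_t lights[64]; // hypothetical NUM_LIGHTS\n"
		"};                                                \n";

Now we are done with the G-Buffer and the lights UBO. The next step is shader loading: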

    


	_geomPassVertShader.Source(gbuffer_pass_vert);
	_geomPassFragShader.Source(gbuffer_pass_frag);
	_resolvePassVertShader.Source(light_pass_vert);
	_resolvePassFragShader.Source(light_pass_frag);

	try{
		_geomPassVertShader.Compile();
		_geomPassFragShader.Compile();

		_geomProg.AttachShader(_geomPassVertShader);
		_geomProg.AttachShader(_geomPassFragShader);
		_geomProg.Link();

		_resolvePassVertShader.Compile();
		_resolvePassFragShader.Compile();
		_resolveProg.AttachShader(_resolvePassVertShader);
		_resolveProg.AttachShader(_resolvePassFragShader);
		_resolveProg.Link();

	}catch(Error &er){
		throw er;
	}


All the shader strings are located in ShadersInline.h; I won't paste them here as it's too much code. Basically we have 4 shaders, 2 for each program. The first program (called _geomProg) executes the first pass and the second, as you have already guessed, executes the second one. Next comes geometry. We create one big plane (500×500) to render as a primitive in world space, and another unit plane which serves as the full-screen quad in the second pass.

 
    _floorPlane = new OGLPlane(500,500, _geomProg);
    _floorPlane->Init();
	_screenQuad = new OGLPlane(1,1,_resolveProg);
	_screenQuad->Init();

Both planes are supplied with their respective programs (the programs with which they will be drawn). I don't like this kind of coupling, but the OGLPlus constructors for Uniform objects require a program, as sketched below.
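This is what forces the coupling: an OGLPlus Uniform resolves its location at construction time, so it needs the program up front. A minimal illustration (not code from the demo):

	Program prog;                                    // compiled and linked earlier
	Uniform<Mat4f> modelMatrix(prog, "ModelMatrix"); // the location is queried here
	modelMatrix.Set(ModelMatrixf::Translation(0.f, 0.f, -850.f));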

The last thing left is to load a texture for our plane primitive:

	auto pngImage = images::PNGImage("demo1.png");

	assert(pngImage.Height() > 0 && pngImage.Width() > 0); // the texture dims must be valid

	_floorTex.Storage2D(10, PixelDataInternalFormat::RGB8, pngImage.Width(), pngImage.Height());
	_floorTex.SubImage2D(0, 0, 0, pngImage.Width(), pngImage.Height(), PixelDataFormat::RGB, PixelDataType::UnsignedByte, pngImage.RawData());
	_floorTex.MinFilter(TextureMinFilter::LinearMipmapLinear);
	_floorTex.MagFilter(TextureMagFilter::Linear);
	_floorTex.WrapS(TextureWrap::ClampToEdge);
	_floorTex.WrapT(TextureWrap::ClampToEdge);
	_floorTex.GenerateMipmap();
	_floorTex.Active(0);

You can load any other texture, but beware of the number of channels. Here I used RGB, as the texture has no alpha channel. If you load a 32-bit PNG, don't forget to change the texture internal format to RGBA8 and the PixelDataFormat to RGBA, as sketched below.
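For reference, the RGBA variant would look like this (the same calls as above with only the formats swapped; "demo1_rgba.png" is a hypothetical file name):

	auto pngImage = images::PNGImage("demo1_rgba.png");
	_floorTex.Storage2D(10, PixelDataInternalFormat::RGBA8, pngImage.Width(), pngImage.Height());
	_floorTex.SubImage2D(0, 0, 0, pngImage.Width(), pngImage.Height(),
		PixelDataFormat::RGBA, PixelDataType::UnsignedByte, pngImage.RawData());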

Now let’s put it all together in a render loop:


float rot = 0.0f ;
void DeferredRenderer::Render(){

	static const GLuint uint_zeros[4] = { 0, 0, 0,0 };
	static const GLfloat float_zeros[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
	static const GLfloat float_ones[4] = { 1.0f, 1.0f,1.0f, 1.0f };

	  Context::ColorBuffer draw_buffs[2] = {
		FramebufferColorAttachment::_0,
		FramebufferColorAttachment::_1
	};

	//////  draw geom in world space:

	Bind(_gfbo2,	Framebuffer::Target::Draw);
	gl.DrawBuffers(draw_buffs);
	gl.Viewport(_viewportW,_viewportH);
	gl.ClearColorBuffer(0,uint_zeros);
	gl.ClearColorBuffer(1,float_zeros);
	gl.ClearDepthBuffer(float_ones[0]);

	_geomProg.Use();
	_projection_matrixUniform.Set(CamMatrixf::PerspectiveY(Degrees(45.0f),(float) _viewportW /(float) _viewportH,0.1f,10000));
	_camera_matrixUniform.Set(CamMatrixf::LookingAt(Vec3f(0.0f),Vec3f(0.0f,0.0f,-1.0f),Vec3f(0.0f,1.0f,0.0f)));

	_floorPlane->Rotate(90.0f,rot+=1.5f,0.0f);
	_floorPlane->Translate(0.0f,0.0f,-850.0f);

	Texture::Active(0);
	Bind(_floorTex,Texture::Target::_2D);

	gl.Enable(Capability::DepthTest);
	gl.DepthFunc(CompareFunction::LEqual);

	_floorPlane->Draw();

	///--------------------   resolve to screen quad -------------------------------------///

	_resolveProg.Use();

	//Bind default FrameBuffer
	Framebuffer::BindDefault(Framebuffer::Target::Draw);
	gl.Viewport(_viewportW,_viewportH);
	gl.DrawBuffer(ColorBuffer::Back);

	//Bind first GBUFFER attachment to a sampler:
	Texture::Active(0);
	Bind(_gfboTex0,Texture::Target::_2D);
	//Bind second GBUFFER attachment to a sampler:
	Texture::Active(1);
	Bind(_gfboTex1,Texture::Target::_2D);

	gl.Disable(Capability::DepthTest);

	_lightUBO.BindBase(Buffer::IndexedTarget::Uniform, 0);
	BufferRawMap buffMap(Buffer::Target::Uniform, 0, NUM_LIGHTS * sizeof(light_t),
		BufferMapAccess::Write|BufferMapAccess::InvalidateBuffer);


	light_t * lights = reinterpret_cast<light_t *>(buffMap.RawData());
	for (int i = 0; i < NUM_LIGHTS; i++)
	{
		float i_f = ((float)i - 7.5f) * 0.1f + 0.3f;
		float rX = RandomFloat(-250.0f,250.0f);  
		float rY = RandomFloat(-250.0f,250.0f); 

		lights[i].position =Vec3f(rX,rY,-800.0f);

		lights[i].color =
			Vec3f(cosf(i_f * 14.0f) * 0.5f + 0.8f,
			sinf(i_f * 17.0f) * 0.5f + 0.8f,
			sinf(i_f * 13.0f) * cosf(i_f * 19.0f) * 0.5f + 0.8f);


	}
	buffMap.Unmap();
    _screenQuad->Draw();

	_lightUBO.UnbindBase(Buffer::IndexedTarget::Uniform, 0);

	Texture::Unbind(Texture::Target::_2D);

}

The render loop is divided into 2 stages. First we render the scene geometry normally into the G-Buffer; during this stage the G-Buffer attachments are filled with data about the rendered geometry. In the second pass the scene geometry is replaced with a simple unit-plane screen quad. The quad covers the whole screen, so we can do a so-called post-processing pass. In this pass the data from the previous stage helps to "reconstruct" the geometry properties in screen space and compute the scene lighting correctly. During the second pass we also update our lights UBO. You don't have to do that if the lights in your scene are static (in that case you can just use light maps and forget about this tutorial ;) ). A free tip: don't forget to unmap the UBO after updating it, as access to it from the shader is blocked as long as it stays mapped on the CPU side.

That's all, folks. Here you can see the result and download the sources. And don't forget to visit OGLPlus.org and grab the API.

The code.

Result:

