Skybuck Flying
Guest
Posts: n/a
Hello,

A GPGPU example, plus the OpenGL and Cg libraries it needs, is now available for Delphi 2007!

I created the necessary OpenGL and Cg (Delphi) packages yesterday and finished porting a C GPGPU OpenGL/Cg example to Delphi today. I would like to share it with you so that maybe some more Delphi people will look into GPGPU programming, especially with OpenGL 3.0 for (older) DX9 graphics cards. Maybe somebody will create something interesting some day!

The C example is a bit wacky (so the Delphi example is also a bit wacky), but it's a nice demonstration of the GPU's processing power and a nice introduction to it. (The Delphi example gets rid of GLUT. I don't like GLUT and it's not needed, which saves me from having to use an extra library.)

I created a WinRAR archive that holds the packages' source code and the example source code (Delphi only), with the BPLs and EXEs too. I simply moved the packages and the source code from my development folders to a "distribution" folder, but I think it will work if you just install the packages into Delphi first and then open the example. (Let me know if it doesn't work for you or if you need help with that, then I will look into it, but it should work!)

I put the WinRAR file (GPGPUonDelphi2007.rar) on my "skydrive":

Link to skydrive/rar file: http://cid-aedd0ea32d61bc86.skydrive...Delphi2007.rar
Link to skydrive delphi->downloads folder: http://cid-aedd0ea32d61bc86.skydrive...lphi/Downloads
Link to skydrive: http://cid-aedd0ea32d61bc86.skydrive...se.aspx/Delphi
Link to C source code: http://www.mathematik.uni-dortmund.d...u/saxpy_cg.cpp
Link to tutorial/explanations: http://www.mathematik.uni-dortmund.d.../tutorial.html

I hope you enjoy the example, have fun with it!

Bye,
  Skybuck.
Skybuck Flying
Guest
Posts: n/a
I shall also post the example source code here in case the skydrive ever goes down. (It uses the "create rendering context" helper functions from a dglOpenGL library. If you can't find these libraries in the future you might have to create your own rendering context functions, or look for examples on the net. So far so good!)

It's not too large so it might work:

// *** Begin of Example ***

{
  GPGPU Basic Math Tutorial

  www.mathematik.uni-dortmund.de/~goeddeke/gpgpu

  Please drop me a note if you encounter any bugs, or if you have
  suggestions on how to improve this tutorial:
}

{
  GPGPU Basic Math Tutorial

  version 0.01 created on 3 september 2009 by Skybuck Flying

  + C code converted to Delphi 2007 code.
}

{
  version 0.02 created on 3 september 2009 by Skybuck Flying

  + Extra code added to make it working like:
    + InitializeOpenGL
    + CleanUpOpenGL
    + GetConsoleWindowHandle function.

  It's working now and it's nice ! =D

  Parameter examples:

  // testing format
  ProgramName.exe 0 rect_arb_rgba_32 1 100 10

  // big problem size to show off speed of gpu.
  ProgramName.exe 1 rect_arb_rgba_32 1 1000000 10

  // displaying vectors for cpu vs gpu precision comparison.
  ProgramName.exe 1 rect_arb_rgba_32 2 100 10

  Conclusions:

  For small problem size cpu faster. (AMD X2 3800+ Dual Core)
  For large problem size gpu faster.
  (XFX NVIDIA GTX 7900 512 MB ram)
}

program GPGPUBasicMathTutorial;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  Windows,
  unit_opengl_api_version_002,
  unit_cg_api_version_002,
  unit_cg_gl_api_version_002;

{
// not needed for Delphi and replaced with the uses clause above

// includes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>
#include <GL/glew.h>
#include <GL/glut.h>
#include <Cg/cgGL.h>
}

// error codes
const
  ERROR_CG : integer = -1;
  ERROR_GLEW : integer = -2;
  ERROR_TEXTURE : integer = -3;
  ERROR_BINDFBO : integer = -4;
  ERROR_FBOTEXTURE : integer = -5;
  ERROR_PARAMS : integer = -6;

// prototypes
function GetConsoleWindowHandle : HWND; forward;
procedure cgErrorCallback(); cdecl; forward;
function checkFramebufferStatus() : boolean; forward;
procedure checkGLErrors(const ParaLabel : Pchar); forward;
procedure compareResults(); forward;
procedure createTextures(); forward;
procedure createAllTextureParameters(); forward;
procedure initCG(); forward;
procedure initFBO(); forward;
procedure InitializeOpenGL(); forward;
procedure CleanUpOpenGL(); forward;
procedure performComputation(); forward;
procedure printVector(const p : array of single; const N : integer); forward;
procedure setupTexture(const texID : GLuint); forward;
procedure swap(); forward;
procedure transferFromTexture(data : Psingle); forward;
procedure transferToTexture(data : Psingle; texID : GLuint); forward;

type
  // struct for variable parts of GL calls (texture format, float format etc)
  struct_textureParameters = record
    name : Pchar;
    texTarget : GLenum;
    texInternalFormat : GLenum;
    texFormat : GLenum;
    shader_source : Pchar;
  end;

var
  // mode: 0=test (POT), 1=bench (set from command line)
  mode : integer;

  // problem size, texture size, number of iterations (set from command line)
  ProblemSize : integer;
  texSize : integer;
  numIterations : integer;

  // flags to fine-tune application and to ease debugging
  mode_showResults : boolean = true;
  mode_compareResults : boolean = true;

  // texture identifiers
  yTexID : array[0..1] of GLuint;
  xTexID : GLuint;
//  aTexID : GLuint; // never used.

  // ping pong management vars
  writeTex : integer = 0;
  readTex : integer = 1;
  attachmentpoints : array[0..1] of GLenum =
    ( GL_COLOR_ATTACHMENT0_EXT, GL_COLOR_ATTACHMENT1_EXT );

  // Cg vars
  cgContext : PCGcontext;
  fragmentProfile : CGprofile;
  fragmentProgram : CGprogram;
  yParam, xParam, alphaParam : CGparameter;

  // FBO identifier
  fb : GLuint;

  // timing vars
  vFrequency : int64;
  vStartTick : int64;
  vStopTick : int64;

  // handle to offscreen "window", only used to properly shut down the app
//  glutWindowHandle : GLuint; // never used.

  rect_arb_rgba_32, // texture rectangles, texture_float_ARB, RGBA, 32 bits
  rect_arb_rgba_16, // texture rectangles, texture_float_ARB, RGBA, 16 bits
  rect_arb_r_32,    // texture rectangles, texture_float_ARB, R, 32 bits
  rect_ati_rgba_32, // texture rectangles, ATI_texture_float, RGBA, 32 bits
  rect_ati_rgba_16, // texture rectangles, ATI_texture_float, RGBA, 16 bits
  rect_ati_r_32,    // texture rectangles, ATI_texture_float, R, 32 bits
  rect_nv_rgba_32,  // texture rectangles, NV_float_buffer, RGBA, 32 bits
  rect_nv_rgba_16,  // texture rectangles, NV_float_buffer, RGBA, 16 bits
  rect_nv_r_32,     // texture rectangles, NV_float_buffer, R, 32 bits
  twod_arb_rgba_32, // texture 2ds, texture_float_ARB, RGBA, 32 bits
  twod_arb_rgba_16, // texture 2ds, texture_float_ARB, RGBA, 16 bits
  twod_arb_r_32,    // texture 2ds, texture_float_ARB, R, 32 bits
  twod_ati_rgba_32, // texture 2ds, ATI_texture_float, RGBA, 32 bits
  twod_ati_rgba_16, // texture 2ds, ATI_texture_float, RGBA, 16 bits
  twod_ati_r_32,    // texture 2ds, ATI_texture_float, R, 32 bits
  twod_nv_rgba_32,  // texture 2ds, NV_float_buffer, RGBA, 32 bits
  twod_nv_rgba_16,  // texture 2ds, NV_float_buffer, RGBA, 16 bits
  twod_nv_r_32      // texture 2ds, NV_float_buffer, R, 32 bits
    : struct_textureParameters;

  // struct actually being used (set from command line)
  textureParameters : struct_textureParameters;

  // actual data
  dataX : array of single;
  dataY : array of single;
  alpha : single;

  // variables for opengl initialization and cleanup.
  vConsoleWindowHandle : HWND;
  vDevice_ContextHandle : HDC;
  vGL_RenderingContextHandle : HGLRC;

// helper function
function GetConsoleWindowHandle : HWND;
const
  vMaximumTitleSize = 1024;
var
  vOldWindowTitle : AnsiString;
  vNewWindowTitle : AnsiString;
begin
  // allocate space for old window title.
  SetLength( vOldWindowTitle, vMaximumTitleSize );

  // fetch current window title.
  GetConsoleTitle( PAnsiChar(vOldWindowTitle), vMaximumTitleSize);

  // format a "unique" NewWindowTitle.
  vNewWindowTitle := IntToStr(GetTickCount()) + IntToStr(GetCurrentProcessId());

  // change current window title.
  SetConsoleTitle(PAnsiChar(vNewWindowTitle));

  repeat
    // might cause some high cpu usage here... but on fast pc's it seems to
    // not do that... don't know about w95 so test it there sometime or so,
    // otherwise add a sleep(40) here or so,
    // but for now I let it be... prevents title flicker !

    // Look for NewWindowTitle.
    result := FindWindow( nil, PAnsiChar(vNewWindowTitle));
  until result <> 0;

  // Restore original window title.
  SetConsoleTitle(PAnsiChar(vOldWindowTitle));

  // free space for old window title
  vOldWindowTitle := '';
end;

// Callback for Cg errors
procedure cgErrorCallback(); cdecl;
var
  lastError : CGerror;
begin
  lastError := cgGetError();

  if (lastError <> 0) then
  begin
    writeln(cgGetErrorString(lastError));
    writeln(cgGetLastListing(cgContext));
    writeln('press enter to exit');
    readln;
    halt(ERROR_CG);
  end;
end;

// Sets up a floating point texture with NEAREST filtering.
// (mipmaps etc. are unsupported for floating point textures)
procedure setupTexture (const texID : GLuint);
begin
  // make active and bind
  glBindTexture(textureParameters.texTarget,texID);

  // turn off filtering and wrap modes
  glTexParameteri(textureParameters.texTarget, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(textureParameters.texTarget, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
  glTexParameteri(textureParameters.texTarget, GL_TEXTURE_WRAP_S, GL_CLAMP);
  glTexParameteri(textureParameters.texTarget, GL_TEXTURE_WRAP_T, GL_CLAMP);

  // define texture with floating point format
  glTexImage2D(textureParameters.texTarget,0,textureParameters.texInternalFormat,texSize,texSize,0,textureParameters.texFormat,GL_FLOAT,nil);

  // check if that worked
  if (glGetError() <> GL_NO_ERROR) then
  begin
    writeln('glTexImage2D(): [FAIL]');
    writeln('press enter to exit');
    readln;
    halt (ERROR_TEXTURE);
  end else
  if (mode = 0) then
  begin
    writeln('glTexImage2D(): [PASS]');
  end;
end;

// Transfers data from the current texture, and stores it in the given array.
procedure transferFromTexture( data : Psingle );
begin
  // version (a): texture is attached
  // recommended on both NVIDIA and ATI
  glReadBuffer(attachmentpoints[readTex]);
  glReadPixels(0, 0, texSize,texSize,textureParameters.texFormat,GL_FLOAT,data);

  // version (b): texture is not necessarily attached
//  glBindTexture(textureParameters.texTarget,yTexID[readTex]);
//  glGetTexImage(textureParameters.texTarget,0,textureParameters.texFormat,GL_FLOAT,data);
end;

// Transfers data to texture.
// Check web page for detailed explanation on the difference between ATI and NVIDIA.
procedure transferToTexture (data : Psingle; texID : GLuint);
begin
  // version (a): HW-accelerated on NVIDIA
  glBindTexture(textureParameters.texTarget, texID);
  glTexSubImage2D(textureParameters.texTarget,0,0,0,texSize,texSize,textureParameters.texFormat,GL_FLOAT,data);

  // version (b): HW-accelerated on ATI
//  glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,textureParameters.texTarget, texID, 0);
//  glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT);
//  glRasterPos2i(0,0);
//  glDrawPixels(texSize,texSize,textureParameters.texFormat,GL_FLOAT,data);
//  glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,textureParameters.texTarget, 0, 0);
end;

// creates textures, sets proper viewport etc.
procedure createTextures();
begin
  // create textures
  // y gets two textures, alternatingly read-only and write-only,
  // x is just read-only
  glGenTextures (2, @yTexID);
  glGenTextures (1, @xTexID);

  // set up textures
  setupTexture (yTexID[readTex]);
  transferToTexture(Psingle(dataY),yTexID[readTex]);
  setupTexture (yTexID[writeTex]);
  transferToTexture(Psingle(dataY),yTexID[writeTex]);
  setupTexture (xTexID);
  transferToTexture(Psingle(dataX),xTexID);

  // set texenv mode from modulate (the default) to replace
  glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);

  // check if something went completely wrong
  checkGLErrors ('createFBOandTextures()');
end;

procedure InitializeOpenGL();
begin
  vConsoleWindowHandle := GetConsoleWindowHandle;
  if vConsoleWindowHandle <> 0 then
  begin
    vDevice_ContextHandle := GetDC( vConsoleWindowHandle );
    if vDevice_ContextHandle <> 0 then
    begin
      writeln('GetDC successful.');

      vGL_RenderingContextHandle := CreateRenderingContext(
        vDevice_ContextHandle, [opDoubleBuffered], 32, 16, 16, 16, 1, 1 );

      if vGL_RenderingContextHandle <> 0 then
      begin
        writeln('CreateRenderingContext successful.');

        // only activate a rendering context that was actually created.
        ActivateRenderingContext( vDevice_ContextHandle, vGL_RenderingContextHandle, True );
      end else
      begin
        writeln('CreateRenderingContext failed.');
      end;
    end else
    begin
      writeln('GetDC failed.');
      writeln(GetLastError);
    end;
  end else
  begin
    writeln('GetConsoleWindowHandle failed.');
  end;
end;

procedure CleanUpOpenGL();
begin
  DeactivateRenderingContext;
  DestroyRenderingContext( vGL_RenderingContextHandle );

  if ReleaseDC( vConsoleWindowHandle, vDevice_ContextHandle ) <> 0 then
  begin
    writeln('ReleaseDC successful.');
  end else
  begin
    writeln('ReleaseDC failed.');
  end;
end;

// Creates framebuffer object, binds it to reroute rendering operations
// from the traditional framebuffer to the offscreen buffer
procedure initFBO();
begin
  // create FBO (off-screen framebuffer)
  glGenFramebuffersEXT(1, @fb);

  // bind offscreen framebuffer (that is, skip the window-specific render target)
  glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fb);

  // viewport for 1:1 pixel=texture mapping
  glMatrixMode(GL_PROJECTION);
  glLoadIdentity();
  gluOrtho2D(0.0, texSize, 0.0, texSize);
  glMatrixMode(GL_MODELVIEW);
  glLoadIdentity();
  glViewport(0, 0, texSize, texSize);
end;

// Sets up the Cg runtime and creates shader.
procedure initCG();
begin
  // set up Cg
  cgSetErrorCallback(cgErrorCallback);
  cgContext := cgCreateContext();
  fragmentProfile := cgGLGetLatestProfile(CG_GL_FRAGMENT);
  cgGLSetOptimalOptions(fragmentProfile);

  // create fragment program
  fragmentProgram := cgCreateProgram (cgContext, CG_SOURCE,
    textureParameters.shader_source, fragmentProfile, 'saxpy', nil);

  // load program
  cgGLLoadProgram (fragmentProgram);

  // and get parameter handles by name
  yParam := cgGetNamedParameter (fragmentProgram, 'textureY');
  xParam := cgGetNamedParameter (fragmentProgram, 'textureX');
  alphaParam := cgGetNamedParameter (fragmentProgram, 'alpha');
end;

// Performs and times saxpy on the CPU, compares results
procedure compareResults();
var
  data : array of single;
  vTotal : double;
  vMflops : double;
  vMaxError : double;
  vAvgError : double;
  vDiff : double;
  vIndexN : integer;
  vIndexI : integer;
begin
  // get GPU results
  SetLength( data, ProblemSize );
  transferFromTexture (Psingle(data));

  if (mode_compareResults) then
  begin
    // calc on CPU
    QueryPerformanceCounter(vStartTick);

    for vIndexN := 0 to numIterations-1 do
    begin
      for vIndexI := 0 to ProblemSize-1 do
      begin
        dataY[vIndexI] := dataY[vIndexI] + alpha*dataX[vIndexI];
      end;
    end;

    QueryPerformanceCounter(vStopTick);

    vTotal := (vStopTick-vStartTick) / vFrequency;
    vMflops := (2.0*ProblemSize*numIterations) / (vTotal * 1000000.0);
    writeln('CPU MFLOP/s: ', Round( vMflops ) );

    // and compare results
    vMaxError := -1000.0;
    vAvgError := 0.0;

    for vIndexI := 0 to ProblemSize-1 do
    begin
      vDiff := abs(data[vIndexI]-dataY[vIndexI]);
      if (vDiff > vMaxError) then
      begin
        vMaxError := vDiff;
      end;
      vAvgError := vAvgError + vDiff;
    end;

    vAvgError := vAvgError / ProblemSize;
    writeln('Max Error: ', vMaxError:16:16);
    writeln('Avg Error: ', vAvgError:16:16);

    if (mode_showResults) then
    begin
      writeln('CPU RESULTS:');
      printVector(dataY,ProblemSize);
    end;
  end;

  if (mode_showResults) then
  begin
    // print out results
    writeln('GPU RESULTS:');
    printVector (data,ProblemSize);
  end;

  data := nil;
end;

// Performs the actual calculation.
procedure performComputation();
var
  vIndexI : integer;
  vTotal : double;
  vMflops : double;
begin
  // attach two textures to FBO
  glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, attachmentpoints[writeTex],
    textureParameters.texTarget, yTexID[writeTex], 0);
  glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, attachmentpoints[readTex],
    textureParameters.texTarget, yTexID[readTex], 0);

  // check if that worked
  if (not checkFramebufferStatus()) then
  begin
    writeln('glFramebufferTexture2DEXT(): [FAIL]');
    writeln('press enter to exit');
    readln;
    halt(ERROR_FBOTEXTURE);
  end else
  if (mode = 0) then
  begin
    writeln('glFramebufferTexture2DEXT(): [PASS]');
  end;

  // enable fragment profile
  cgGLEnableProfile(fragmentProfile);

  // bind saxpy program
  cgGLBindProgram(fragmentProgram);

  // enable texture x (read-only, not changed during the iteration)
  cgGLSetTextureParameter(xParam, xTexID);
  cgGLEnableTextureParameter(xParam);

  // enable scalar alpha (same)
  cgSetParameter1f(alphaParam, alpha);

  // Calling glFinish() is only necessary to get accurate timings,
  // and we need a high number of iterations to avoid timing noise.
  glFinish();
  QueryPerformanceCounter(vStartTick);

  for vIndexI := 0 to numIterations-1 do
  begin
    // set render destination
    glDrawBuffer (attachmentpoints[writeTex]);

    // enable texture y_old (read-only)
    cgGLSetTextureParameter(yParam, yTexID[readTex]);
    cgGLEnableTextureParameter(yParam);

    // and render multitextured viewport-sized quad
    // depending on the texture target, switch between
    // normalised ([0,1]^2) and unnormalised ([0,w]x[0,h])
    // texture coordinates

    // make quad filled to hit every pixel/texel
    // (should be default but we never know)
    glPolygonMode(GL_FRONT,GL_FILL);

    // and render the quad
    if (textureParameters.texTarget = GL_TEXTURE_2D) then
    begin
      // render with normalized texcoords
      glBegin(GL_QUADS);
        glTexCoord2f(0.0, 0.0);
        glVertex2f(0.0, 0.0);
        glTexCoord2f(1.0, 0.0);
        glVertex2f(texSize, 0.0);
        glTexCoord2f(1.0, 1.0);
        glVertex2f(texSize, texSize);
        glTexCoord2f(0.0, 1.0);
        glVertex2f(0.0, texSize);
      glEnd();
    end else
    begin
      // render with unnormalized texcoords
      glBegin(GL_QUADS);
        glTexCoord2f(0.0, 0.0);
        glVertex2f(0.0, 0.0);
        glTexCoord2f(texSize, 0.0);
        glVertex2f(texSize, 0.0);
        glTexCoord2f(texSize, texSize);
        glVertex2f(texSize, texSize);
        glTexCoord2f(0.0, texSize);
        glVertex2f(0.0, texSize);
      glEnd();
    end;

    // swap role of the two textures (read-only source becomes
    // write-only target and the other way round):
    swap();
  end;

  // done, stop timer, calc MFLOP/s if necessary
  if (mode = 1) then
  begin
    glFinish();
    QueryPerformanceCounter(vStopTick);

    vTotal := (vStopTick-vStartTick) / vFrequency;

    // calc mflops
    vMflops := (2.0*ProblemSize*numIterations) / (vTotal * 1000000.0);
    writeln('MFLOP/s for N=', ProblemSize, ': ', Round( vMflops ));
  end;

  // done, just do some checks if everything went smoothly.
  checkFramebufferStatus();
  checkGLErrors('render()');
end;

// Sets up the various structs used to handle texture targets, texture formats etc.
procedure createAllTextureParameters();
begin
  rect_arb_rgba_32.name := 'TEXRECT - float_ARB - RGBA - 32';
  rect_arb_rgba_32.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_arb_rgba_32.texInternalFormat := GL_RGBA32F_ARB;
  rect_arb_rgba_32.texFormat := GL_RGBA;
  rect_arb_rgba_32.shader_source :=
    'float4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float4 y = texRECT (textureY, coords); ' +
    'float4 x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_arb_rgba_16.name := 'TEXRECT - float_ARB - RGBA - 16';
  rect_arb_rgba_16.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_arb_rgba_16.texInternalFormat := GL_RGBA16F_ARB;
  rect_arb_rgba_16.texFormat := GL_RGBA;
  rect_arb_rgba_16.shader_source :=
    'half4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'half4 y = texRECT (textureY, coords); ' +
    'half4 x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_arb_r_32.name := 'TEXRECT - float_ARB - R - 32';
  rect_arb_r_32.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_arb_r_32.texInternalFormat := GL_LUMINANCE32F_ARB;
  rect_arb_r_32.texFormat := GL_LUMINANCE;
  rect_arb_r_32.shader_source :=
    'float saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float y = texRECT (textureY, coords); ' +
    'float x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_ati_rgba_32.name := 'TEXRECT - float_ATI - RGBA - 32';
  rect_ati_rgba_32.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_ati_rgba_32.texInternalFormat := GL_RGBA_FLOAT32_ATI;
  rect_ati_rgba_32.texFormat := GL_RGBA;
  rect_ati_rgba_32.shader_source :=
    'float4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float4 y = texRECT (textureY, coords); ' +
    'float4 x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_ati_rgba_16.name := 'TEXRECT - float_ATI - RGBA - 16';
  rect_ati_rgba_16.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_ati_rgba_16.texInternalFormat := GL_RGBA_FLOAT16_ATI;
  rect_ati_rgba_16.texFormat := GL_RGBA;
  rect_ati_rgba_16.shader_source :=
    'half4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'half4 y = texRECT (textureY, coords); ' +
    'half4 x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_ati_r_32.name := 'TEXRECT - float_ATI - R - 32';
  rect_ati_r_32.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_ati_r_32.texInternalFormat := GL_LUMINANCE_FLOAT32_ATI;
  rect_ati_r_32.texFormat := GL_LUMINANCE;
  rect_ati_r_32.shader_source :=
    'float saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float y = texRECT (textureY, coords); ' +
    'float x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_nv_rgba_32.name := 'TEXRECT - float_NV - RGBA - 32';
  rect_nv_rgba_32.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_nv_rgba_32.texInternalFormat := GL_FLOAT_RGBA32_NV;
  rect_nv_rgba_32.texFormat := GL_RGBA;
  rect_nv_rgba_32.shader_source :=
    'float4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float4 y = texRECT (textureY, coords); ' +
    'float4 x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_nv_rgba_16.name := 'TEXRECT - float_NV - RGBA - 16';
  rect_nv_rgba_16.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_nv_rgba_16.texInternalFormat := GL_FLOAT_RGBA16_NV;
  rect_nv_rgba_16.texFormat := GL_RGBA;
  rect_nv_rgba_16.shader_source :=
    'half4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'half4 y = texRECT (textureY, coords); ' +
    'half4 x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  rect_nv_r_32.name := 'TEXRECT - float_NV - R - 32';
  rect_nv_r_32.texTarget := GL_TEXTURE_RECTANGLE_ARB;
  rect_nv_r_32.texInternalFormat := GL_FLOAT_R32_NV;
  rect_nv_r_32.texFormat := GL_LUMINANCE;
  rect_nv_r_32.shader_source :=
    'float saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform samplerRECT textureY, ' +
    'uniform samplerRECT textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float y = texRECT (textureY, coords); ' +
    'float x = texRECT (textureX, coords); ' +
    'return y+alpha*x; }';

  /////////////

  twod_arb_rgba_32.name := 'tex2D - float_ARB - RGBA - 32';
  twod_arb_rgba_32.texTarget := GL_TEXTURE_2D;
  twod_arb_rgba_32.texInternalFormat := GL_RGBA32F_ARB;
  twod_arb_rgba_32.texFormat := GL_RGBA;
  twod_arb_rgba_32.shader_source :=
    'float4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float4 y = tex2D (textureY, coords); ' +
    'float4 x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_arb_rgba_16.name := 'tex2D - float_ARB - RGBA - 16';
  twod_arb_rgba_16.texTarget := GL_TEXTURE_2D;
  twod_arb_rgba_16.texInternalFormat := GL_RGBA16F_ARB;
  twod_arb_rgba_16.texFormat := GL_RGBA;
  twod_arb_rgba_16.shader_source :=
    'half4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'half4 y = tex2D (textureY, coords); ' +
    'half4 x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_arb_r_32.name := 'tex2D - float_ARB - R - 32';
  twod_arb_r_32.texTarget := GL_TEXTURE_2D;
  twod_arb_r_32.texInternalFormat := GL_LUMINANCE32F_ARB;
  twod_arb_r_32.texFormat := GL_LUMINANCE;
  twod_arb_r_32.shader_source :=
    'float saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float y = tex2D (textureY, coords); ' +
    'float x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_ati_rgba_32.name := 'tex2D - float_ATI - RGBA - 32';
  twod_ati_rgba_32.texTarget := GL_TEXTURE_2D;
  twod_ati_rgba_32.texInternalFormat := GL_RGBA_FLOAT32_ATI;
  twod_ati_rgba_32.texFormat := GL_RGBA;
  twod_ati_rgba_32.shader_source :=
    'float4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float4 y = tex2D (textureY, coords); ' +
    'float4 x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_ati_rgba_16.name := 'tex2D - float_ATI - RGBA - 16';
  twod_ati_rgba_16.texTarget := GL_TEXTURE_2D;
  twod_ati_rgba_16.texInternalFormat := GL_RGBA_FLOAT16_ATI;
  twod_ati_rgba_16.texFormat := GL_RGBA;
  twod_ati_rgba_16.shader_source :=
    'half4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'half4 y = tex2D (textureY, coords); ' +
    'half4 x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_ati_r_32.name := 'tex2D - float_ATI - R - 32';
  twod_ati_r_32.texTarget := GL_TEXTURE_2D;
  twod_ati_r_32.texInternalFormat := GL_LUMINANCE_FLOAT32_ATI;
  twod_ati_r_32.texFormat := GL_LUMINANCE;
  twod_ati_r_32.shader_source :=
    'float saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float y = tex2D (textureY, coords); ' +
    'float x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_nv_rgba_32.name := 'tex2D - float_NV - RGBA - 32';
  twod_nv_rgba_32.texTarget := GL_TEXTURE_2D;
  twod_nv_rgba_32.texInternalFormat := GL_FLOAT_RGBA32_NV;
  twod_nv_rgba_32.texFormat := GL_RGBA;
  twod_nv_rgba_32.shader_source :=
    'float4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float4 y = tex2D (textureY, coords); ' +
    'float4 x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_nv_rgba_16.name := 'tex2D - float_NV - RGBA - 16';
  twod_nv_rgba_16.texTarget := GL_TEXTURE_2D;
  twod_nv_rgba_16.texInternalFormat := GL_FLOAT_RGBA16_NV;
  twod_nv_rgba_16.texFormat := GL_RGBA;
  twod_nv_rgba_16.shader_source :=
    'half4 saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'half4 y = tex2D (textureY, coords); ' +
    'half4 x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';

  twod_nv_r_32.name := 'tex2D - float_NV - R - 32';
  twod_nv_r_32.texTarget := GL_TEXTURE_2D;
  twod_nv_r_32.texInternalFormat := GL_FLOAT_R32_NV;
  twod_nv_r_32.texFormat := GL_LUMINANCE;
  twod_nv_r_32.shader_source :=
    'float saxpy (' +
    'in float2 coords : TEXCOORD0, ' +
    'uniform sampler2D textureY, ' +
    'uniform sampler2D textureX, ' +
    'uniform float alpha ) : COLOR { ' +
    'float y = tex2D (textureY, coords); ' +
    'float x = tex2D (textureX, coords); ' +
    'return y+alpha*x; }';
end;

// Checks for OpenGL errors.
// Extremely useful debugging function: When developing,
// make sure to call this after almost every GL call.
procedure checkGLErrors (const ParaLabel : Pchar);
var
  errStr : Pchar;
  errCode : GLenum;
begin
  errCode := glGetError();

  if (errCode <> GL_NO_ERROR) then
  begin
    errStr := gluErrorString(errCode);
    write('OpenGL ERROR: ');
    write(errStr);
    write('(Label: ');
    write(ParaLabel);
    writeln(')');
  end;
end;

// Checks framebuffer status.
// Copied directly out of the spec, modified to deliver a return value.
function checkFramebufferStatus() : boolean;
var
  status : GLenum;
begin
  result := false;
  status := glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);

  // writeln already ends the line, so the C-style "\n" is dropped here.
  case status of
    GL_FRAMEBUFFER_COMPLETE_EXT :
    begin
      result := true;
    end;

    GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT:
    begin
      writeln('Framebuffer incomplete, incomplete attachment');
    end;

    GL_FRAMEBUFFER_UNSUPPORTED_EXT:
    begin
      writeln('Unsupported framebuffer format');
    end;

    GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT:
    begin
      writeln('Framebuffer incomplete, missing attachment');
    end;

    GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT:
    begin
      writeln('Framebuffer incomplete, attached images must have same dimensions');
    end;

    GL_FRAMEBUFFER_INCOMPLETE_FORMATS_EXT:
    begin
      writeln('Framebuffer incomplete, attached images must have same format');
    end;

    GL_FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER_EXT:
    begin
      writeln('Framebuffer incomplete, missing draw buffer');
    end;

    GL_FRAMEBUFFER_INCOMPLETE_READ_BUFFER_EXT:
    begin
      writeln('Framebuffer incomplete, missing read buffer');
    end;
  end;
end;

// Prints out given vector for debugging purposes.
procedure printVector (const p : array of single; const N : integer);
var
  vIndex : integer;
begin
  for vIndex := 0 to N-1 do
  begin
    writeln(p[vIndex]);
  end;
end;

// swaps the role of the two y-textures (read-only and write-only)
// Can be done in a smarter way :-)
procedure swap();
begin
  if (writeTex = 0) then
  begin
    writeTex := 1;
    readTex := 0;
  end else
  begin
    writeTex := 0;
    readTex := 1;
  end;
end;

// main, just calls things in the appropriate order
//int main(int argc, char **argv) {
procedure Main;
var
  vIndex : integer;
begin
  writeln('program started');

  QueryPerformanceFrequency(vFrequency);

  // create variables for GL
  createAllTextureParameters();

  // parse command line
  if (ParamCount < 5) then
  begin
    writeln('Command line parameters:');
    writeln('Param 1: 0 = check if given format is supported');
    writeln('         1 = do some benchmarking');
    writeln('Param 2: one of the test formats');
    writeln('rect_arb_rgba_32, // texture rectangles, texture_float_ARB, RGBA, 32 bits');
    writeln('rect_arb_rgba_16, // texture rectangles, texture_float_ARB, RGBA, 16 bits');
    writeln('rect_arb_r_32, // texture rectangles, texture_float_ARB, R, 32 bits');
    writeln('rect_ati_rgba_32, // texture rectangles, ATI_texture_float, RGBA, 32 bits');
    writeln('rect_ati_rgba_16, // texture rectangles, ATI_texture_float, RGBA, 16 bits');
    writeln('rect_ati_r_32, // texture rectangles, ATI_texture_float, R, 32 bits');
    writeln('rect_nv_rgba_32, // texture rectangles, NV_float_buffer, RGBA, 32 bits');
    writeln('rect_nv_rgba_16, // texture rectangles, NV_float_buffer, RGBA, 16 bits');
    writeln('rect_nv_r_32, // texture rectangles, NV_float_buffer, R, 32 bits');
    writeln('twod_arb_rgba_32, // texture 2ds, texture_float_ARB, RGBA, 32 bits');
    writeln('twod_arb_rgba_16, // texture 2ds, texture_float_ARB, RGBA, 16 bits');
    writeln('twod_arb_r_32, // texture 2ds, texture_float_ARB, R, 32 bits');
    writeln('twod_ati_rgba_32, // texture 2ds, ATI_texture_float, RGBA, 32 bits');
    writeln('twod_ati_rgba_16, // texture 2ds, ATI_texture_float, RGBA, 16 bits');
    writeln('twod_ati_r_32, // texture 2ds, ATI_texture_float, R, 32 bits');
    writeln('twod_nv_rgba_32, // texture 2ds, NV_float_buffer, RGBA, 32 bits');
    writeln('twod_nv_rgba_16, // texture 2ds, NV_float_buffer, RGBA, 16 bits');
    writeln('twod_nv_r_32, // texture 2ds, NV_float_buffer, R, 32 bits');
    writeln('Param 3: 0 = no comparison of results');
    writeln('         1 = compare and only print out max errors');
    writeln('         2 = compare and print out full result vectors (use with care for large N)');
    writeln('Param 4: problem size N');
    writeln('Param 5: number of iterations');
    writeln('press enter to exit');
    readln;
    halt(0);
  end else
  begin
    mode := StrToInt( ParamStr(1) );

    if (ParamStr(2) = 'rect_arb_rgba_32') then
    begin
      textureParameters := rect_arb_rgba_32;
    end
    else if (ParamStr(2) = 'rect_arb_rgba_16') then
    begin
      textureParameters := rect_arb_rgba_16;
    end
    else if (ParamStr(2) = 'rect_arb_r_32') then
    begin
      textureParameters := rect_arb_r_32;
    end
    else if (ParamStr(2) = 'rect_ati_rgba_32') then
    begin
      textureParameters := rect_ati_rgba_32;
    end
    else if (ParamStr(2) = 'rect_ati_rgba_16') then
    begin
      textureParameters := rect_ati_rgba_16;
    end
    else if (ParamStr(2) = 'rect_ati_r_32') then
    begin
      textureParameters := rect_ati_r_32;
    end
    else if (ParamStr(2) = 'rect_nv_rgba_32') then
    begin
      textureParameters := rect_nv_rgba_32;
    end
else if (ParamStr(2) = 'rect_nv_rgba_16') then begin textureParameters := rect_nv_rgba_16; end else if (ParamStr(2) = 'rect_nv_r_32') then begin textureParameters := rect_nv_r_32; end else if (ParamStr(2) = 'twod_arb_rgba_32') then begin textureParameters := twod_arb_rgba_32; end else if (ParamStr(2) = 'twod_arb_rgba_16') then begin textureParameters := twod_arb_rgba_16; end else if (ParamStr(2) = 'twod_arb_r_32') then begin textureParameters := twod_arb_r_32; end else if (ParamStr(2) = 'twod_ati_rgba_32') then begin textureParameters := twod_ati_rgba_32; end else if (ParamStr(2) = 'twod_ati_rgba_16') then begin textureParameters := twod_ati_rgba_16; end else if (ParamStr(2) = 'twod_ati_r_32') then begin textureParameters := twod_ati_r_32; end else if (ParamStr(2) = 'twod_nv_rgba_32') then begin textureParameters := twod_nv_rgba_32; end else if (ParamStr(2) = 'twod_nv_rgba_16') then begin textureParameters := twod_nv_rgba_16; end else if (ParamStr(2) = 'twod_nv_r_32') then begin textureParameters := twod_nv_r_32; end else begin writeln('unknown parameter, exit'); writeln('press enter to exit'); readln; halt(ERROR_PARAMS); end; vIndex := StrToInt(ParamStr(3)); case vIndex of 0: begin mode_showResults := false; mode_compareResults := false; end; 1: begin mode_showResults := false; mode_compareResults := true; end; 2: begin mode_showResults := true; mode_compareResults := true; end; else writeln('unknown parameter, exit'); writeln('press enter to exit'); readln; halt(ERROR_PARAMS); end; ProblemSize := StrToInt( ParamStr(4)); numIterations := StrToInt( ParamStr(5)); writeln(textureParameters.name); writeln(', N=', ProblemSize, ', numIter=', numIterations); end; // calc texture dimensions if (textureParameters.texFormat = GL_RGBA) then begin texSize := Round( sqrt(ProblemSize/4.0) ); end else begin texSize := Round( sqrt(ProblemSize) ); end; // create data vectors SetLength( dataX, ProblemSize ); SetLength( dataY, ProblemSize ); // and fill with some arbitrary values 
for vIndex := 0 to ProblemSize-1 do begin dataX[vIndex] := 2.0; dataY[vIndex] := vIndex+1.0; end; alpha := 1.0/9.0; // init glut and glew, replaced by InitializeOpenGL InitializeOpenGL(); // init offscreen framebuffer initFBO(); // create textures for vectors createTextures(); // init shader runtime initCG(); // and start computation performComputation(); // compare results compareResults(); // and clean up cgDestroyProgram(fragmentProgram); cgDestroyContext(cgContext); glDeleteFramebuffersEXT(1,@fb); dataX := nil; dataY := nil; glDeleteTextures(2,@yTexID); glDeleteTextures (1,@xTexID); // cleanup any opengl here.// glutDestroyWindow(glutWindowHandle); CleanUpOpenGL; writeln('program finished');end;// "true main" wrapper to catch any exceptions ! and to pause screen afterrun.begin try Main; // calls "main" except on E:Exception do Writeln(E.Classname, ': ', E.Message); end; ReadLn;end.// *** End of Example ***Bye, Skybuck. |
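The "calc texture dimensions" step in the example rounds the square root, so a problem size N that is not an exact fit (N = texSize * texSize * channels) silently gets a texture with a different number of elements than the data vectors. Here is a minimal sketch of a check for that. TexSizeFor is a hypothetical helper, not part of the posted example, and it assumes 4 floats per texel for the RGBA formats and 1 for the R formats.

```delphi
program TexSizeSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper (not in the posted example): returns the edge length
// of the square texture that holds exactly N floats, or 0 when N does not
// fit exactly. ChannelsPerTexel is assumed to be 4 for RGBA and 1 for R.
function TexSizeFor(const N, ChannelsPerTexel: Integer): Integer;
var
  vSize: Integer;
begin
  Result := 0;
  if (N <= 0) or (ChannelsPerTexel <= 0) then Exit;
  if (N mod ChannelsPerTexel) <> 0 then Exit;

  // same formula as the example: Round(sqrt(ProblemSize / channels))
  vSize := Round(Sqrt(N / ChannelsPerTexel));

  // Int64 avoids an integer overflow for very large N
  if Int64(vSize) * vSize * ChannelsPerTexel = N then
    Result := vSize;
end;

begin
  Writeln('RGBA, N=16000000: ', TexSizeFor(16000000, 4)); // prints 2000
  Writeln('R,    N=16000000: ', TexSizeFor(16000000, 1)); // prints 4000
  Writeln('RGBA, N=1000:     ', TexSizeFor(1000, 4));     // prints 0 (16*16*4 = 1024)
end.
```

So a problem size of 16000000 maps to a 2000 x 2000 texture for the RGBA formats and a 4000 x 4000 texture for the R formats.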
|
|
|
|||
|
Skybuck Flying
Guest
Posts: n/a
|
This link to some very old documentation mentions a way of creating a dc.
I was wondering how to create a dc manually and couldn't find any decent documentation at the time of porting the example, but maybe this link can help:
http://msdn.microsoft.com/en-us/library/ms969905.aspx
It also mentions how to do "flicker-free drawing", which could be interesting *if it works*, that is.
Who knows, I might have already looked into this document in the past... maybe it works, maybe it doesn't, at least with Delphi. But I'm still going to try it, tomorrow or so.
This method doesn't actually create/set up its own dc; it more or less makes a copy of another dc, it seems:
CreateCompatibleDC(lpPS->hdc)
I think there is a way to create a dc yourself, but that could either require a **** load of parameters to set up or maybe even writing your own driver, which would be total overkill ?!
But maybe a compatible dc might work.
Bye, Skybuck. |
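For reference, a minimal sketch of what the CreateCompatibleDC call looks like from a Delphi console program. It assumes the screen dc (GetDC(0)) as the template instead of the lpPS->hdc from the linked document, and it only creates and frees the dc; whether OpenGL accepts such a memory dc is a separate question. Note that a freshly created memory dc only has a 1x1 monochrome bitmap selected into it.

```delphi
program CompatibleDCSketch;
{$APPTYPE CONSOLE}

uses
  Windows;

var
  vScreenDC, vMemDC: HDC;
begin
  // GetDC(0) returns a dc for the whole screen; it is only used as the
  // template here (an assumption, the linked document uses a paint dc).
  vScreenDC := GetDC(0);
  if vScreenDC = 0 then
  begin
    Writeln('GetDC failed');
    Halt(1);
  end;

  // CreateCompatibleDC makes a memory dc that matches the template dc.
  vMemDC := CreateCompatibleDC(vScreenDC);
  if vMemDC = 0 then
    Writeln('CreateCompatibleDC failed')
  else
  begin
    Writeln('CreateCompatibleDC successful');
    DeleteDC(vMemDC); // memory dcs are deleted, not released
  end;

  ReleaseDC(0, vScreenDC);
end.
```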
|
|
|
|||
|
Skybuck Flying
Guest
Posts: n/a
|
Tried that, it didn't help.
I researched this issue a little bit... and so far most people seem to believe that a "window" is actually needed for opengl and directx. A trick could be to hide the window.
I want maximum speed so I am developing my own little Twindow class to do just that... However I won't include it in the example because I am not santa claus LOL and it would be a bit too much code and outside the scope of the tutorial ?! Or maybe not but screw you ! LOL.
So four ways can be used to fix the tutorial:
1. Keep running the tutorial from within Delphi, which works.
2. Use my Twindow class and to-be TOpenGLWindow class, which won't be available ! But this is an option for me lol.
3. Use glut bullshit.
4. Use VCL bullshit.
I think it would be easiest to go with the VCL bullshit... not completely sure but I could give it a try... I need this tutorial to work because I want to do some benchmarking and figure out which format is fastest for my graphics card !
Bye, Skybuck. |
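The "hide the window" trick can be sketched roughly like this. It is not the Twindow class mentioned above (that one is not published); it is only an assumption of how a hidden window plus rendering context could be set up with plain Windows API calls from the Delphi Windows unit. It borrows the predefined 'STATIC' window class so no window class has to be registered, and the window title and the 16x16 size are arbitrary.

```delphi
program HiddenWindowSketch;
{$APPTYPE CONSOLE}

uses
  Windows;

var
  vWnd: HWND;
  vDC: HDC;
  vRC: HGLRC;
  vPfd: TPixelFormatDescriptor;
  vFormat: Integer;
begin
  // WS_VISIBLE is not set and ShowWindow is never called, so the window
  // exists but is never shown.
  vWnd := CreateWindowEx(0, 'STATIC', 'hidden gl window', WS_POPUP,
    0, 0, 16, 16, 0, 0, HInstance, nil);
  if vWnd = 0 then
  begin
    Writeln('CreateWindowEx failed');
    Halt(1);
  end;

  vDC := GetDC(vWnd);
  try
    // describe a plain RGBA pixel format that supports OpenGL
    FillChar(vPfd, SizeOf(vPfd), 0);
    vPfd.nSize := SizeOf(vPfd);
    vPfd.nVersion := 1;
    vPfd.dwFlags := PFD_DRAW_TO_WINDOW or PFD_SUPPORT_OPENGL;
    vPfd.iPixelType := PFD_TYPE_RGBA;
    vPfd.cColorBits := 32;
    vPfd.iLayerType := PFD_MAIN_PLANE;

    vFormat := ChoosePixelFormat(vDC, @vPfd);
    if (vFormat = 0) or not SetPixelFormat(vDC, vFormat, @vPfd) then
      Writeln('no usable pixel format')
    else
    begin
      vRC := wglCreateContext(vDC);
      if vRC = 0 then
        Writeln('wglCreateContext failed')
      else
      begin
        if wglMakeCurrent(vDC, vRC) then
        begin
          Writeln('rendering context is current, window stays hidden');
          // the FBO / texture / shader setup would go here
          wglMakeCurrent(0, 0);
        end
        else
          Writeln('wglMakeCurrent failed');
        wglDeleteContext(vRC);
      end;
    end;
  finally
    ReleaseDC(vWnd, vDC);
    DestroyWindow(vWnd);
  end;
end.
```

Since the computation renders into an offscreen framebuffer object anyway, the window contents never matter; the window is only there so the driver hands out a rendering context.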
|
|
|
|||
|
Skybuck Flying
Guest
Posts: n/a
|
Ok,
I just updated the example... it now uses the vcl/form/canvas for the hdc.
I also ran a little benchmark: problem size 16.000.000, iterations 200. Graphics card: GTX 7900.
Some formats didn't work... especially the R program seemed to hang or be super slow... either way, don't use those formats.
Conclusion: 16 bit floating point formats are roughly twice as fast as 32 bit floating point formats (about 11100 versus about 5800 MFLOP/s in the runs below) ! So use 16 bit floating point formats when possible ?!
Unfortunately this benchmark does not (yet?) include integers ?!? I am curious how 16 bit integers in the shaders would perform ?!
Updated source + batchfile will be available shortly.
Here are the results from benchmark.bat :

// BEGIN OF RESULTS (slightly modified to compensate for program hangs/crashes, see aborted):

program started
TEXRECT - float_ARB - RGBA - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 5829
ReleaseDC successfull.
program finished

program started
TEXRECT - float_ARB - RGBA - 16 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 11172
ReleaseDC successfull.
program finished

program started
TEXRECT - float_ARB - R - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successf
ABORTED

program started
TEXRECT - float_ATI - RGBA - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 5792
ReleaseDC successfull.
program finished

program started
TEXRECT - float_ATI - RGBA - 16 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 11189
ReleaseDC successfull.
program finished

program started
TEXRECT - float_ATI - R - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successf
ABORTED

program started
TEXRECT - float_NV - RGBA - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 5795
ReleaseDC successfull.
program finished

program started
TEXRECT - float_NV - RGBA - 16 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 11092
ReleaseDC successfull.
program finished

program started
TEXRECT - float_NV - R - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 5641
ReleaseDC successfull.
program finished

program started
tex2D - float_ARB - RGBA - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 5801
ReleaseDC successfull.
program finished

program started
tex2D - float_ARB - RGBA - 16 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 11110
ReleaseDC successfull.
program finished

program started
tex2D - float_ARB - R - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successful
ABORTED

program started
tex2D - float_ATI - RGBA - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 5801
ReleaseDC successfull.
program finished

program started
tex2D - float_ATI - RGBA - 16 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull.
MFLOP/s for N=16000000: 11165
ReleaseDC successfull.
program finished

program started
tex2D - float_ATI - R - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successful
ABORTED

program started
tex2D - float_NV - RGBA - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successf
ABORTED

program started
tex2D - float_NV - RGBA - 16 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successf
ABORTED

program started
tex2D - float_NV - R - 32 , N=16000000, numIter=200
CreateDC successfull.
CreateRenderingContext successfull
ABORTED

// END OF RESULTS

Bye, Skybuck. |
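To put a number on the 16 bit versus 32 bit conclusion, here is a small sketch that divides the RGBA-16 throughput by the RGBA-32 throughput for every format pair where both runs finished. The MFLOP/s values are copied from the results above; the program and its constant names are only illustrative and not part of the benchmark. The ratios all come out at about 1.9.

```delphi
program SpeedupSketch;
{$APPTYPE CONSOLE}

const
  cCount = 5;

  // Format pairs from the results above where both the RGBA-32 and the
  // RGBA-16 run finished (tex2D float_NV aborted, so it is left out).
  cNames: array[0..cCount-1] of string = (
    'TEXRECT float_ARB', 'TEXRECT float_ATI', 'TEXRECT float_NV',
    'tex2D float_ARB', 'tex2D float_ATI');

  // MFLOP/s as reported by the benchmark runs
  cRgba32: array[0..cCount-1] of Integer = (5829, 5792, 5795, 5801, 5801);
  cRgba16: array[0..cCount-1] of Integer = (11172, 11189, 11092, 11110, 11165);

var
  vIndex: Integer;
begin
  for vIndex := 0 to cCount-1 do
  begin
    // "/" on two integers yields a floating point result in Delphi
    Writeln(cNames[vIndex], ': ',
      (cRgba16[vIndex] / cRgba32[vIndex]):0:2, 'x');
  end;
end.
```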
|
|
|
|||
|
Skybuck Flying
Guest
Posts: n/a
|
Hello,
VCL Version Source + Executables + Benchmark batchfile now available ! (Old console version included too)
If the program seems to hang because of an invalid format, just terminate it... or fix it yourself !
Download location 1: http://cid-aedd0ea32d61bc86.skydrive...Delphi2007.zip
Download location 2: http://members.home.nl/hbthouppermans/Delphi/Downloads/
I ****ing deliver you bitches ! LOL
Even if it's crappy and rushed, it works.
Bye, Skybuck =D |
|
|
|
|||