"How the GPU works" @c0de517e

I rediscovered a very good in-depth explanation of how GPUs work, published in 2008 on the c0de517e blog:
Part 1, Part 2, Part 3

CUDA "Better Performance at Lower Occupancy" @GTC2010

A friend pointed me to this very interesting talk from NVIDIA GTC:
Better Performance at Lower Occupancy

The talk debunks two common fallacies that CUDA developers usually believe:

  • Multithreading is the only way to hide latency on GPU
  • Shared memory is as fast as registers
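
The first point is about instruction-level parallelism (ILP): a single thread can keep several independent operations in flight, so latency can be hidden without adding more threads. As a rough illustration only, here is a CPU-side C++ analogue (not code from the talk, and not CUDA; all function names are made up for this sketch). The first function has one long dependency chain, the second splits the work into four independent chains:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline: a single accumulator. Every addition depends on the previous
// one, so the additions form one long dependency chain.
std::int64_t sum_single_chain(const std::vector<std::int64_t>& values) {
    std::int64_t acc = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        acc += values[i];
    }
    return acc;
}

// ILP variant (hypothetical name): four accumulators that never depend on
// each other inside the loop, so the hardware is free to overlap the four
// loads and additions instead of waiting for each one to finish.
std::int64_t sum_four_chains(const std::vector<std::int64_t>& values) {
    std::int64_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    const std::size_t n = values.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc0 += values[i];
        acc1 += values[i + 1];
        acc2 += values[i + 2];
        acc3 += values[i + 3];
    }
    // Leftover elements when the size is not a multiple of four.
    for (; i < n; ++i) {
        acc0 += values[i];
    }
    return (acc0 + acc1) + (acc2 + acc3);
}

// Sanity check: both versions must return the same total.
void check_sums_agree(const std::vector<std::int64_t>& values) {
    assert(sum_single_chain(values) == sum_four_chains(values));
}
```

In a CUDA kernel the same idea would mean giving each thread several independent elements to process, with the partial results kept in registers. That also touches the second point, since registers, not shared memory, are the fastest storage a thread has.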

All the GTC2010 presentations can be found here (with slides and videos!):

Old Real-Time GPU Raytracer

I just translated an old page on my website from French to English. It describes a real-time GPU raytracer I developed for fun 4 years ago, during my Master's thesis. It is old-school GPGPU in OpenGL and Cg that can run on an NV40 (GeForce 6800). No need for CUDA or a GF110 to do GPU raytracing! ;-)
The application also features a slow and unoptimized CPU raytracer.

See here: http://www.icare3d.org/myprojects/opengl_projects/raytracer_gpu_full_1.0.html

PS: It is fun to see what was possible at that time, but it was developed quickly and the shader code itself should not be taken as a reference!

Fluid Simulation for Video Games @INTEL

There is a very interesting series of articles about fluid simulation for video games, written by Michael J. Gourlay, on the Intel developer website. Source code is also provided.
Parts: 1, 2, 3, 4, 5, 6, 7, 8

CUDA 3.2 Final released

Download it from the NVIDIA Developer website!

NVIDIA Fermi GPU and Architecture Analysis @Beyond3D

The article is 3 weeks old but I have only just read it. Beyond3D published a very good analysis of the Fermi architecture. It is based on many homemade tests they developed to benchmark individual parts of the GF100 chip. Based on this analysis, they made interesting discoveries and speculations about the GF100 architecture.

In this article, I also discovered "Pomegranate", a parallel hardware architecture for polygon rendering developed at Stanford, which seems to be very close to the way Fermi handles parallel work distribution across the different steps of the graphics pipeline. Pomegranate [Eldridge et al., 2000]

Discussions are on the Beyond3D forum.

Here are some interesting statements:


First reviews of the NVIDIA GF110: GTX580

The GF110 is the new high-end GPU from NVIDIA, based on a revised Fermi architecture. Even though the chip has not been officially launched, reviews are already starting to appear online!

In French:

It seems the reviews are pretty good!
To sum up: full-speed FP16 texture filtering, improved Z-cull performance, architectural tweaks, 15-20% performance improvement over the GTX480 in games, lower power consumption, quieter, cooler.
According to The Tech Report, an interesting subtle change is that the 16/48KB local storage partition can now be configured by the driver for graphics contexts, whereas on the GF100 it was only configurable for compute.

Congrats NVIDIA :-)

Texture and buffer access performance on Evergreen architecture @rastergrid.com

A very interesting article about texture and buffer access performance in OpenGL on the AMD Evergreen architecture:

Various stuff from October

I have not had much time to update this blog lately, so here is some interesting stuff I did not post during October:

OpenGL SuperBible Fifth Edition

Last week, Addison Wesley kindly sent me a copy of the Fifth Edition of the OpenGL SuperBible so that I could write a review of it. So let's do that :-)

The OpenGL SuperBible has been a reference book since its first edition, and this fifth edition is the first to focus exclusively on modern, shader-based OpenGL programming. That is the great novelty of this edition: it is based on the OpenGL 3.3 API, and all discussion of deprecated fixed-function programming has been removed from the book.

Copyright © Icare3D Blog