A friend pointed me to this very interesting talk at NVIDIA GTC:
Better Performance at Lower Occupancy
It debunks two common fallacies that CUDA developers usually believe:
- Multithreading is the only way to hide latency on GPU
- Shared memory is as fast as registers
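One way to see why the first point is a fallacy is a back-of-envelope estimate based on Little's law: the number of operations that must be in flight to hide latency is roughly latency times throughput, and those operations can come from more threads or from more independent instructions per thread. The sketch below is only an illustration of that reasoning; the function name `threads_needed` and any numbers fed to it are assumptions, not figures from the talk.

```cpp
#include <cassert>

// Little's law estimate: operations in flight = latency x throughput.
// Hypothetical helper, not part of CUDA or of the talk.
// latency_cycles:  assumed latency of one operation, in cycles
// ops_per_cycle:   assumed operations the multiprocessor can issue per cycle
// ilp_per_thread:  independent operations each thread keeps in flight
int threads_needed(int latency_cycles, int ops_per_cycle, int ilp_per_thread) {
    assert(latency_cycles > 0 && ops_per_cycle > 0 && ilp_per_thread > 0);
    // Total independent operations required to keep the pipeline full.
    const int in_flight = latency_cycles * ops_per_cycle;
    // Each thread contributes ilp_per_thread of them; round up.
    return (in_flight + ilp_per_thread - 1) / ilp_per_thread;
}
```

With an assumed latency of 24 cycles and 32 operations per cycle, `threads_needed(24, 32, 1)` gives 768 threads, while `threads_needed(24, 32, 4)` gives 192: the same latency is covered at a quarter of the occupancy when each thread carries four independent operations.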
All the GTC 2010 presentations can be found here (with slides and videos!):
PS: It is funny to see what was possible at that time, but it was developed quickly and the shader code itself is not a reference!
Download it from the NVIDIA Developer website!
In this article, I also discovered "Pomegranate", a parallel hardware architecture for polygon rendering developed at Stanford that seems very close to the way Fermi handles parallel work distribution across the different stages of the graphics pipeline. Pomegranate [Eldridge et al., 2000]
Discussions are on the Beyond3D forum.
Here are some interesting statements:
A very interesting article about texture and buffer access performance in OpenGL on the AMD Evergreen architecture:
I have not had much time to update this blog lately, so here are some interesting things I did not post during October:
- GPU-Assisted Malware : http://www.ics.forth.gr/dcs/Activities/papers/gpumalware.malware10.pdf
- Thrust 1.3 released : http://gpgpu.org/2010/10/07/thrust-v1-3-release
- OpenGL 4.1 drivers status : g-truc creation
- "Can CPUs Match GPUs on Performance with Productivity ?" : IBM Research
- GPU Technology Conference Session Video Archive : NVIDIA
- EASTL : An implementation of the C++ STL made by EA and optimized for use in video games