- Wed Nov 22, 2006 2:17 pm
#196507
It's not about recompiling Maxwell for PS3 or CUDA... if you take a PC application and compile it with the PS3 toolchain you'll get very bad performance out of it.
The PS3 is not a forgiving platform. At all. It's very hard to program. It doesn't have 8 cores; it has 1 PowerPC CPU and 7 vector units. The vector units are somewhat smarter than the ones in the PS2 (which had a CPU and 2 vector units, VU0 and VU1), but they aren't made for running general-purpose code. Even the PPC CPU is in-order, so it can't hide memory latency or other expensive operations the way desktop CPUs do. You have to be really careful with your code and especially with your memory access patterns, since a cache miss means a MASSIVE stall.
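To make the cache-miss point concrete, here's a rough sketch (illustrative C/C++ only, nothing PS3- or Maxwell-specific): the same sum done once with a sequential walk that the cache and prefetcher love, and once through an index table that lands on an unpredictable cache line at every step. On an in-order core the second version spends most of its time stalled.

#include <cstddef>

// Sequential pass: cache lines and the hardware prefetcher work in your favor.
float sum_sequential(const float* data, std::size_t n)
{
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Indirect pass: each access hits an unpredictable cache line. On an in-order
// core every miss stalls the pipeline; there is no out-of-order window to hide
// the latency behind.
float sum_indirect(const float* data, const std::size_t* index, std::size_t n)
{
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += data[index[i]];
    return total;
}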
It's very hard to keep a PS2 running at its full potential, and it only has 2 VUs; the PS3 has 7, so that's both a blessing and a curse. If you write code that does what the VUs were built to do, they can chew through data really quickly. Even then, keeping them well fed can be tricky. To add to the fun, the tools you have to use to measure and tune the code's performance are not for the faint-hearted.
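"Keeping them well fed" in practice means streaming data through small local buffers in a double-buffered pipeline: fetch the next chunk while you crunch the current one. Here's a hedged sketch of that pattern; dma_get_async(), dma_wait() and process_chunk() are hypothetical stand-ins (just synchronous copies so it compiles), not the real Cell SDK API, and total is assumed to be a multiple of CHUNK to keep it short.

#include <cstring>
#include <cstddef>

enum { CHUNK = 4096 };

// HYPOTHETICAL stand-ins for a real asynchronous DMA API; on the real hardware
// these would kick off and wait for transfers into a vector unit's local store.
static void dma_get_async(float* dst, const float* src, std::size_t count)
{
    std::memcpy(dst, src, count * sizeof(float));
}
static void dma_wait(const float*) { /* real code would block until the transfer completes */ }

static void process_chunk(float* chunk, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        chunk[i] *= 2.0f;                       // placeholder "work"
}

// Double-buffered streaming: fetch chunk N+1 while processing chunk N,
// so the compute units never sit idle waiting on main memory.
void stream_process(const float* main_mem, std::size_t total)
{
    static float local[2][CHUNK];               // two local buffers
    int cur = 0;
    dma_get_async(local[cur], main_mem, CHUNK); // kick off the first transfer
    for (std::size_t off = 0; off < total; off += CHUNK) {
        int next = cur ^ 1;
        if (off + CHUNK < total)
            dma_get_async(local[next], main_mem + off + CHUNK, CHUNK); // prefetch next chunk
        dma_wait(local[cur]);                   // make sure the current chunk has arrived
        process_chunk(local[cur], CHUNK);       // crunch while the next transfer is in flight
        cur = next;
    }
}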
Under these circumstances, it's highly unlikely that you can just re-compile Maxwell for the PS3 and get a performance boost. Also, don't forget that the PS3 has 256 MB of system memory, so try fitting your scenes into that (it has 256 MB of video memory as well, but you don't want to touch that with the CPU or VUs). And 1.8 of those 2 TFLOPS they advertise are in the GPU, which is another funny story.
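A quick back-of-the-envelope check of what 256 MB buys you (the per-triangle cost and OS/application overhead below are assumed round numbers for illustration, not Maxwell's actual figures):

#include <cstdio>

int main()
{
    const double system_ram    = 256.0 * 1024 * 1024;  // 256 MB total
    const double os_and_app    =  96.0 * 1024 * 1024;  // assumed OS + code + runtime overhead
    const double bytes_per_tri =  64.0;                 // assumed: vertices, normals, material refs, acceleration-structure share
    const double budget        = system_ram - os_and_app;
    std::printf("Roughly %.1f million triangles fit in the remaining budget\n",
                budget / bytes_per_tri / 1e6);           // prints roughly 2.6 million
    return 0;
}

A couple of million triangles sounds like a lot until you look at the kind of scenes people actually feed a production renderer.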
Comparing the 128 shader processors in the G80, or the VUs, with desktop CPU cores doesn't make much sense. The GPU and the VUs are highly specialized processors, while an Intel CPU is a general-purpose machine. CUDA sounds nice in theory, but a GPU is still a GPU: it can't do random memory writes, random memory reads are VERY expensive, and branching is tricky. Wrapping an application around these constraints is no walk in the park. GPGPU is still a young field; there's a long way to go in terms of both developer experience and theoretical (algorithmic) background before you'll see Boeing doing wind-tunnel simulation on Nvidia hardware. :)
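A minimal CUDA sketch (an assumed toy kernel, not any real renderer) of the two constraints I mean: a read through an index table instead of a straight sequential read, and a data-dependent branch. The gather turns into scattered memory transactions the hardware can't coalesce, and threads in the same warp that take different sides of the branch get serialized.

__global__ void shade(const float* input, const int* indices, float* output, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n)
        return;

    // "Random" read: neighboring threads hit unrelated addresses, so the
    // hardware cannot coalesce them into one wide memory transaction.
    float v = input[indices[i]];

    // Data-dependent branch: threads in the same warp that disagree on the
    // condition effectively execute both paths one after the other.
    if (v > 0.5f)
        output[i] = v * v;
    else
        output[i] = 0.0f;
}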
PS: Careful with that E3 2005 video; most of the "in-game" clips at the end are actually offline renders. If you were impressed by the Killzone or Motorsport clips, think again. Here's what Motorsport really looks like on the real hardware. Try to see beyond the hype. ;)



- By Mark Bell