10th February 2004
/GameStar/dev: So where's the technical journey leading?
Cevat Yerli: In general it's leading to multi-threading, so multi-core and multi-cpu, a streaming architecture, cross platform, and for each pixel there will be at least one shader.
/GameStar/dev: Are you staying with SM3.0 or are you jumping to SM4.0 straight away?
Cevat Yerli: We're gonna support SM2.0 and above.
/GameStar/dev: And above?
Cevat Yerli: And above.
/GameStar/dev: The new PC processors and the upcoming consoles rely heavily on multithreading. How are you gonna utilize that potential?
Cevat Yerli: We're scaling the individual modules, like animation, physics and parts of the graphics, with the CPU, depending on how many threads the hardware has to offer. We're going to support multi-CPU systems as well as multithreading and multi-core. With 3 CPUs with 2 hardware threads each (dual-core CPUs) it's possible that we are going to scale for 6 threads. Maybe we're not gonna do it though, depending on how fast the individual cores or the CPU threads are running. We're developing a system that analyzes how much threading power is available, and we are going to scale accordingly.
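The idea of detecting the available hardware threads and sizing each module accordingly can be sketched roughly as follows. This is only an illustration and not Crytek's code: `ThreadPlan`, `detect_hardware_threads`, `plan_threads` and the even split between modules are all assumptions made for the example; only the module names come from the answer above.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical worker budget per engine module (names taken from the
// interview, the structure itself is an assumption).
struct ThreadPlan {
    unsigned animation;
    unsigned physics;
    unsigned graphics;
};

// Number of hardware threads reported by the runtime.
// hardware_concurrency() may return 0 when it cannot tell, so fall back to 1.
unsigned detect_hardware_threads() {
    return std::max(1u, std::thread::hardware_concurrency());
}

// Splits the hardware threads evenly across the three modules.
// Leftover threads go to physics first, then graphics (an arbitrary choice).
// A module never gets fewer than one worker, so on machines with fewer
// than three hardware threads the modules simply share the same cores.
ThreadPlan plan_threads(unsigned hw_threads) {
    hw_threads = std::max(1u, hw_threads);
    const unsigned base = hw_threads / 3;
    const unsigned extra = hw_threads % 3;
    return ThreadPlan{
        std::max(1u, base),                           // animation
        std::max(1u, base + (extra > 0 ? 1u : 0u)),   // physics
        std::max(1u, base + (extra > 1 ? 1u : 0u))    // graphics
    };
}
```

Calling `plan_threads(detect_hardware_threads())` at startup would give each module its worker count. A real engine would probably also weigh how fast the individual cores are, as the answer points out, which this sketch ignores.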
/GameStar/dev: x86, PowerPC and PowerPC + Cell. All architectures have their own threading organization ...
Cevat Yerli: The 360 solution resembles Hyper-Threading. In principle it's 3 CPUs with 2 hyper-threads each. If you ask the hardware manufacturers, that's not the case though. But from a software developer's standpoint it's no different from Hyper-Threading. That means you're supposed to have 6 threads, but in reality it's only 3 times 1.5 threads. With the PS3's Cell things look different: the main CPU has 2 threads (slightly better than Hyper-Threading) and then you get the synergistic processors (SPUs). The 8th SPU was cut.
Cevat Yerli: A pure manufacturing reason. The SPUs are not as flexible as a conventional CPU, that's why we have to scale differently.
/GameStar/dev: Between the individual architectures it's almost impossible to port the whole code. You have to at least modify the low level commands. x86 and PowerPC are different, but how easily can you port from one console to the other considering they're both PowerPC based?
Cevat Yerli: You can't really port it either. We're the only German developer that has a PS3 devkit, I believe. So we can look at the hardware in real life rather than speculating about it. The PS3 is a system that needs further adaptations written specifically for the PS3 architecture. Simply porting just doesn't work.
/GameStar/dev: So Sony's claim - you have a nice layer, you're throwing your code at it, and everything is working beautifully on Cell - is not the case?
Cevat Yerli: They wish. But that's still a long way off. The devkits aren't that far along yet either. Based on the information about the devkits, you have to do a lot of low-level work to get something out of this hardware.
/GameStar/dev: Regarding inter-process communication: How relevant are the differences between the different multithreading approaches?
Cevat Yerli: An important question. The following things are very important for hardware-based multithreading: Are the threads running on real cores (do they have their own register set)? Or is there a hardware abstraction like with the PowerPC, where the threads (two in this case) have their own register sets but still sit on the same core, so that when instructions are issued to the individual units, both threads can't work at the same time? Multithreading with Hyper-Threading only tries to distribute the instructions across the superscalar units (math operations to the integer and float units, load/store, etc.). With multiple cores there is the question of the bus connection to the peripherals and to main memory. How is the cache implemented, are all threads sharing the same cache? Plus: shared memory vs. standalone local memory. A complete PowerPC core is also part of the Cell system. But it's not only an independent processing unit, it's also the host for the individual Cell cores. In the Cell architecture you have, apart from the PowerPC core, individual Cell cores that are connected through an ultra-high-end bus and can communicate with each other independently. If you exploit the parallelization optimally, you can achieve linear scaling over all Cell cores.
/GameStar/dev: How do multi-core systems behave on single-core PCs?
Cevat Yerli: As developers we are in the worst possible situation. We have to support 32 and 64 bit, single- and multi-core CPUs, single- and multithreading, Cell and non-Cell, and OpenGL and DirectX. The expenditure to develop one technology that utilizes all those parameters perfectly is extremely high. The technical effort is at least twice as high as for CryEngine 1.
/GameStar/dev: The step from 32 to 64 bit was technically surely easier than going from single to multithreading.
Cevat Yerli: Unfortunately there is no step. We cannot really take the step, we have to support both.
/GameStar/dev: Are there performance issues with the multi-threaded CryEngine 2 running on single-core PCs?
Cevat Yerli: The code can run sequentially. You're losing a bit of efficiency, but what you gain through optimization is greater. So the price of a frame-rate loss when running on a single-thread PC is so small that you can easily get it back that way. Most PC games are not optimized anyway, Far Cry isn't either.
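A minimal sketch of what "the code can run sequentially" might look like in practice: the same list of jobs is either executed in order on the calling thread or spread over several workers. `run_jobs` and its striding scheme are assumptions for illustration, not how CryEngine 2 schedules work, and the jobs are assumed to be independent of each other.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs every job exactly once.
// With worker_count <= 1 the jobs run one after another on the calling
// thread, which is the single-core fallback path.
void run_jobs(const std::vector<std::function<void()>>& jobs,
              unsigned worker_count) {
    if (worker_count <= 1) {
        for (const auto& job : jobs) job();
        return;
    }
    std::vector<std::thread> workers;
    workers.reserve(worker_count);
    for (unsigned w = 0; w < worker_count; ++w) {
        workers.emplace_back([&jobs, w, worker_count] {
            // Static striding: worker w takes jobs w, w + N, w + 2N, ...
            for (std::size_t i = w; i < jobs.size(); i += worker_count) {
                jobs[i]();
            }
        });
    }
    for (auto& worker : workers) worker.join();
}
```

Because the jobs do not depend on each other, both paths produce the same result. The single-threaded path only pays for one extra branch, which fits the remark that the loss on a single-thread PC is small.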
/GameStar/dev: With consoles, developers are getting astounding performance out of average hardware, because they have to. If this was the case with PCs, you'd probably only need a GeForce4 Ti for running Doom 3.
Cevat Yerli: My point exactly. The evolution of hardware is running at such a fast rate that you don't get to work with it for long. It's the same with CPUs, you have to take your time to optimize. The biggest problem there is cache misses. Also, you should avoid global memory shared between the individual threads. Simply put: if we are reaching into the same pot, the pot must not change. If I am reaching for an element before you, you are not getting it anymore, or at the very least not the one that you expected. To get around this you ideally change something in one step, pass the result on to the open memory and release it for the other CPUs (unlocking).
/GameStar/dev: How do you keep the physics consistent with 6 threads?
Cevat Yerli: You could solve that not only technically but also creatively. You have to go new ways: How can we, and that's a main thing, how can we scale our game from 1 to 8 threads qualitatively? So that the gameplay stays the same, but the game is either looking better or I can play it better. On PC it will only be cosmetic changes, because everyone has the power, but then there are big differences in gameplay. On consoles you can get better gameplay out of optimizing. That's why you have to test two scales as a multi-platform developer: FX and gameplay. On a PC you often only optimize FX (higher resolution etc.). We want to scale both FX and gameplay though, for example through the intensity of AI and shaders.
/GameStar/dev: How does porting work from x86 to X360?
Cevat Yerli: In general the architecture is different of course, but they are quite a bit more similar than Xbox 360 and PS3. The CPUs of PC, 360 and PS3 have only one similarity: multithreading. As a generic CPU the Xbox 360 processor is the most powerful one, but if you take the 7 SPUs of the PS3 into account, things change. Before we had the PS3 devkits we thought PS3 and Xbox 360 were closer in design than PC and console. That's not the case though *laughs*
/GameStar/dev: So you are optimizing your engine for the Cell SPUs?
Cevat Yerli: Sure. We have to, because we want to utilize the PS3's power in full. Accordingly the PS3 will get its own engine architecture, kind of a sub-architecture of CryEngine 2.
/GameStar/dev: You outsourced the Far Cry Instincts port. Now you have to port yourselves?
Cevat Yerli: Umh ...
/GameStar/dev: ... the graphics interfaces require an extra effort?
Cevat Yerli: Yeah, because of OpenGL ES for the PS3 we have to recode our whole rendering. If you look at it closely, CryEngine 2 will have 2 solutions for each system in total. If a developer abstracts that, the technology is optimized very specifically. Otherwise you cannot utilize the whole power. Alternatively you can abstract it in a way that it is not running on all systems, then the strongest platform is losing out the most ...