
Seite did admit that the two execution cores within a Bulldozer module share more than the floating point unit (the FMAC Scheduler and the unit itself, which is split into two 128-bit halves): ‘we’re sharing the front end, the Level 2 cache – and there could be conflicts of course, because we have two cores.’ But the Level 2 cache is a healthy 2MB in size to compensate: ‘To avoid conflicts (in L2), we change the associativity, we change also the size… By having different types of associativity – more ways in the associativity – by having a bigger cache – you are avoiding this problem.’

Seite also confirmed that as part of the decoupling of the Fetch and Decode units in the front-end of a Bulldozer module (an innovation over previous designs), the front-end unit can accept two threads of work simultaneously and conduct simultaneous sequencing of these threads. The two 128-bit FMAC units can work independently if the Module is processing two threads that require 128-bit precision, or can be ganged if 256-bit fp precision is required. The only time two threads within a Bulldozer Module could clash, we were told, was if each required 256-bit floating point precision, for example if both threads used the new 256-bit AVX capabilities of the CPU.
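To make the floating point sharing rule concrete, here is a minimal Python sketch of it. This is a toy model of the rule as AMD described it to us, not a description of the real FMAC Scheduler; the function names `needs_ganging`, `threads_clash` and `describe` are our own and purely illustrative.

```python
# Toy model of the shared floating point unit in one Bulldozer Module.
# Assumption: the only inputs are the fp widths the two threads ask for;
# the real scheduler is far more involved than this.

FMAC_WIDTH_BITS = 128  # each of the two FMAC units is 128 bits wide


def needs_ganging(width_bits: int) -> bool:
    """True if an operation is wider than one FMAC unit, so both must be ganged."""
    if width_bits <= 0:
        raise ValueError("width_bits must be positive")
    return width_bits > FMAC_WIDTH_BITS


def threads_clash(width_a: int, width_b: int) -> bool:
    """Apply the rule as stated: a clash only if *each* thread needs 256-bit."""
    return needs_ganging(width_a) and needs_ganging(width_b)


def describe(width_a: int, width_b: int) -> str:
    """Return a short label for how the two threads share the FMAC units."""
    if threads_clash(width_a, width_b):
        return "clash: both threads want the ganged 256-bit unit"
    if needs_ganging(width_a) or needs_ganging(width_b):
        return "no clash under the stated rule: only one thread needs ganging"
    return "independent: one 128-bit FMAC per thread"


if __name__ == "__main__":
    for pair in [(128, 128), (256, 128), (256, 256)]:
        print(pair, "->", describe(*pair))
```

The mixed case (one 128-bit thread, one 256-bit thread) is treated as clash-free only because that is what the stated rule implies; how the hardware actually arbitrates that case was not explained to us.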


AMD is calling these units ‘cores’ as they’re capable of processing the integer workloads that most cores spend most of their time processing, but this initially seems more of a marketing term. After all, a Bulldozer ‘core’ isn’t as capable as a Phenom II core, for example, as each Phenom II core can fetch and decode work and doesn’t have to share its floating point capabilities with anything else.

We asked Bernard Seite, technical advisor, AMD, whether we really should regard the two execution units within a Bulldozer Module as cores and were told, ‘If you take the overall group of applications that are running on x86, 90 per cent is integer… We look at how efficient Hyper-Threading [is]. Sometimes you have negative impact, but most of the time, you have something which is in between zero and 40. The Bulldozer Module will never be negative – you have two threads, and the two threads are not going to clash.’

We’ll start with the most striking, controversial and intriguing aspect of Bulldozer’s design: the pair of execution units within a Module.
