AMD Accelerated Processing Unit


The AMD Accelerated Processing Unit (APU), formerly known as Fusion, is the marketing term for a series of 64-bit microprocessors from Advanced Micro Devices, designed to act as a central processing unit and graphics processing unit on a single die. APUs are general-purpose processors whose integrated graphics processors are generally a class above what would normally be termed "integrated" graphics.
AMD announced the first-generation APUs, Llano for high-performance devices and Brazos for low-power devices, in January 2011. The second-generation APUs, Trinity for high-performance devices and Brazos-2 for low-power devices, were announced in June 2012. The third-generation Kaveri for high-performance devices was launched in January 2014, while Kabini and Temash for low-power devices were announced in the summer of 2013. Since the launch of the Zen microarchitecture, Ryzen APUs have been released to the global market, beginning with Raven Ridge on the DDR4 platform, which followed Bristol Ridge a year earlier.
The Sony PlayStation 4 and Microsoft Xbox One eighth-generation video game consoles both use semi-custom third-generation low-power APUs.
Intel CPUs with integrated HD Graphics also have a CPU and GPU on a single die, but they do not offer HSA support.

History

The AMD Fusion project started in 2006 with the aim of developing a system on a chip that combined a CPU with a GPU on a single die. This effort was moved forward by AMD's acquisition of graphics chipset manufacturer ATI in 2006. The project reportedly required three internal iterations of the Fusion concept to create a product deemed worthy of release. Reasons contributing to the delay of the project include the technical difficulties of combining a CPU and GPU on the same die at a 45 nm process, and conflicting views on what the role of the CPU and GPU should be within the project.
The first generation desktop and laptop APU, codenamed Llano, was announced on 4 January 2011 at the 2011 CES show in Las Vegas and released shortly thereafter. It featured K10 CPU cores and a Radeon HD 6000-series GPU on the same die on the FM1 socket. An APU for low-power devices was announced as the Brazos platform, based on the Bobcat microarchitecture and a Radeon HD 6000-series GPU on the same die.
At a conference in January 2012, corporate fellow Phil Rogers announced that AMD would re-brand the Fusion platform as the Heterogeneous System Architecture, stating that "it's only fitting that the name of this evolving architecture and platform be representative of the entire, technical community that is leading the way in this very important area of technology and programming development." However, it was later revealed that AMD had been the subject of a trademark infringement lawsuit by the Swiss company Arctic, which used the name "Fusion" for a line of power supply products.
The second-generation desktop and laptop APU, codenamed Trinity, was announced at AMD's 2010 Financial Analyst Day and released in October 2012. It featured Piledriver CPU cores and Radeon HD 7000 Series GPU cores on the FM2 socket. AMD released a new APU based on the Piledriver microarchitecture, codenamed Richland, on 12 March 2013 for laptops and mobile devices and on 4 June 2013 for desktops. The second-generation APU for low-power devices, Brazos 2.0, used exactly the same APU chip but ran at a higher clock speed, rebranded the GPU as the Radeon HD 7000 series, and used a new I/O controller chip.
Semi-custom chips were introduced in the Microsoft Xbox One and Sony PlayStation 4 video game consoles.
A third generation of the technology was released on 14 January 2014, featuring greater integration between CPU and GPU. The desktop and laptop variant is codenamed Kaveri, based on the Steamroller architecture, while the low-power variants, codenamed Kabini and Temash, are based on the Jaguar architecture. In November 2017, HP released the Envy x360, featuring the Ryzen 5 2500U APU, the first fourth-generation APU, based on the Zen CPU architecture and the Vega graphics architecture.

Features

Heterogeneous System Architecture

AMD is a founding member of the Heterogeneous System Architecture Foundation and is consequently actively working on developing HSA in cooperation with other members. The following hardware and software implementations are available in AMD's APU-branded products:
Type | HSA feature | First implemented | Notes
Optimized Platform | GPU Compute C++ Support | 2012, Trinity APUs | Supports OpenCL C++ directions and Microsoft's C++ AMP language extension. This eases programming of the CPU and GPU working together to process parallel workloads.
Optimized Platform | HSA-aware MMU | 2012, Trinity APUs | The GPU can access the entire system memory through the translation services and page fault management of the HSA MMU.
Optimized Platform | Shared Power Management | 2012, Trinity APUs | The CPU and GPU share the power budget. Priority goes to the processor most suited to the current tasks.
Architectural Integration | Heterogeneous Memory Management: the CPU's MMU and the GPU's IOMMU share the same address space. | 2014, PlayStation 4, Kaveri APUs | The CPU and GPU access memory with the same address space. Pointers can be freely passed between CPU and GPU, enabling zero-copy.
Architectural Integration | Fully coherent memory between CPU and GPU | 2014, PlayStation 4, Kaveri APUs | The GPU can access and cache data from coherent memory regions in system memory, and can also reference data from the CPU's cache. Cache coherency is maintained.
Architectural Integration | GPU uses pageable system memory via CPU pointers | 2014, PlayStation 4, Kaveri APUs | The GPU can take advantage of the virtual memory shared between CPU and GPU, and pageable system memory can be referenced directly by the GPU instead of being copied or pinned before access.
System Integration | GPU compute context switch | 2015, Carrizo APU | Compute tasks on the GPU can be context switched, allowing a multi-tasking environment and also faster interpretation between applications, compute and graphics.
System Integration | GPU graphics pre-emption | 2015, Carrizo APU | Long-running graphics tasks can be pre-empted so that processes have low-latency access to the GPU.
System Integration | Quality of service | 2015, Carrizo APU | In addition to context switch and pre-emption, hardware resources can be either equalized or prioritized among multiple users and applications.
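
The difference between copying data to the GPU and passing a pointer, as described for heterogeneous memory management, can be illustrated with a loose analogy in plain Python. This is only a sketch under stated assumptions: it runs entirely on the CPU, the names `kernel`, `offload_with_copy` and `offload_zero_copy` are hypothetical, and a `memoryview` merely stands in for a pointer shared between CPU and GPU. It is not an HSA or OpenCL API.

```python
# Analogy only: no GPU is involved. A bytearray plays the role of system
# memory, and a hypothetical "kernel" doubles every byte value it is given.

def kernel(buf):
    """Stand-in for GPU work: double each element in place (mod 256)."""
    for i in range(len(buf)):
        buf[i] = (buf[i] * 2) % 256


def offload_with_copy(host):
    """Separate address spaces: copy in, compute, copy back."""
    device = bytearray(host)      # explicit copy into "device memory"
    kernel(device)
    host[:] = device              # explicit copy back into "host memory"
    return 2 * len(host)          # bytes moved by the two copies


def offload_zero_copy(host):
    """Shared address space: hand over a reference, copy nothing."""
    view = memoryview(host)       # shares the same underlying storage
    kernel(view)
    view.release()
    return 0                      # no bytes were copied


data_a = bytearray([1, 2, 3, 4])
data_b = bytearray([1, 2, 3, 4])
copied = offload_with_copy(data_a)
shared = offload_zero_copy(data_b)
print(list(data_a), "bytes copied:", copied)
print(list(data_b), "bytes copied:", shared)
```

Both calls produce the same result, but only the first moves data between buffers; on real hardware the copy cost grows with the size of the working set, which is the overhead a shared address space is meant to avoid.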

Feature overview

APU-branded platforms

AMD APUs have a unique architecture: they have AMD CPU modules, cache, and a discrete-class graphics processor, all on the same die using the same bus. This architecture allows GPU compute frameworks, such as OpenCL, to be used with the integrated graphics processor. The goal is to create a "fully integrated" APU, which, according to AMD, will eventually feature "heterogeneous cores" capable of processing both CPU and GPU work automatically, depending on the workload requirement.

TeraScale-based GPU

K10 architecture (2011): Llano

The first-generation APU, released in June 2011, was used in both desktops and laptops. It was based on the K10 architecture and built on a 32 nm process, featuring two to four CPU cores with a thermal design power of 65-100 W and integrated graphics based on the Radeon HD 6000 Series with support for DirectX 11, OpenGL 4.2 and OpenCL 1.2. In performance comparisons against the similarly priced Intel Core i3-2105, the Llano APU was criticised for its poor CPU performance and praised for its better GPU performance. AMD was later criticised for abandoning Socket FM1 after one generation.

Bobcat architecture (2011): Ontario, Zacate, Desna, Hondo

The AMD Brazos platform was introduced on 4 January 2011, targeting the subnotebook, netbook and low-power small form factor markets. It features the 9-watt AMD C-Series APU for netbooks and low-power devices as well as the 18-watt AMD E-Series APU for mainstream and value notebooks, all-in-ones and small form factor desktops. Both APUs feature one or two Bobcat x86 cores and a Radeon Evergreen Series GPU with full DirectX 11, DirectCompute and OpenCL support, including UVD3 video acceleration for HD video up to 1080p.
AMD expanded the Brazos platform on 5 June 2011 with the announcement of the 5.9-watt AMD Z-Series APU designed for the tablet market. The Desna APU is based on the 9-watt Ontario APU. Energy savings were achieved by lowering the CPU, GPU and northbridge voltages, reducing the idle clocks of the CPU and GPU, and introducing a hardware thermal control mode. A bidirectional turbo core mode was also introduced.
AMD announced the Brazos-T platform on 9 October 2012. It comprised the 4.5-watt AMD Z-Series APU and the A55T Fusion Controller Hub (FCH), designed for the tablet computer market. The Hondo APU is a redesign of the Desna APU. AMD lowered energy use by optimizing the APU and FCH for tablet computers.
The Deccan platform, including the Krishna and Wichita APUs, was cancelled in 2011. AMD had originally planned to release them in the second half of 2012.

Piledriver architecture (2012): Trinity and Richland

Trinity
The first iteration of the second-generation platform, released in October 2012, brought improvements in CPU and GPU performance to both desktops and laptops. The platform features two to four Piledriver CPU cores built on a 32 nm process with a TDP between 65 W and 100 W, and a GPU based on the Radeon HD 7000 Series with support for DirectX 11, OpenGL 4.2, and OpenCL 1.2. The Trinity APU was praised for the improvements to CPU performance compared to the Llano APU.
Richland
The second iteration of this generation was released on 12 March 2013 for laptops and mobile devices and on 5 June 2013 for desktops.

Graphics Core Next-based GPU

Jaguar architecture (2013): Kabini and Temash

In January 2013, the Jaguar-based Kabini and Temash APUs were unveiled as the successors of the Bobcat-based Ontario, Zacate and Hondo APUs. The Kabini APU is aimed at the low-power, subnotebook, netbook, ultra-thin and small form factor markets, while the Temash APU is aimed at the tablet, ultra-low-power and small form factor markets. The two to four Jaguar cores of the Kabini and Temash APUs feature numerous architectural improvements regarding power requirements and performance, such as support for newer x86 instructions, higher IPC, a CC6 power state mode and clock gating. Kabini and Temash are AMD's first, and also the first-ever, quad-core x86-based SoCs. The integrated Fusion Controller Hubs for Kabini and Temash are codenamed "Yangtze" and "Salton", respectively. The Yangtze FCH features support for two USB 3.0 ports, two SATA 6 Gbit/s ports, as well as the xHCI 1.0 and SD/SDIO 3.0 protocols for SD-card support.
Both chips feature DirectX 11.1-compliant GCN-based graphics as well as numerous HSA improvements.
They were fabricated on a 28 nm process in an FT3 ball grid array package by Taiwan Semiconductor Manufacturing Company, and were released on 23 May 2013.
The PlayStation 4 and Xbox One were revealed to both be powered by 8-core semi-custom Jaguar-derived APUs.

Steamroller architecture (2014): Kaveri

The third generation of the platform, codenamed Kaveri, was partly released on 14 January 2014. Kaveri contains up to four Steamroller CPU cores clocked at 3.9 GHz with a turbo mode of 4.1 GHz, a Graphics Core Next GPU with up to 512 cores, two decode units per module instead of one, AMD TrueAudio, support for the Mantle API, and an on-chip ARM Cortex-A5 MPCore, and was released with a new socket, FM2+. Ian Cutress and Rahul Garg of AnandTech asserted that Kaveri represented the unified system-on-a-chip realization of AMD's acquisition of ATI. The performance of the 45 W A8-7600 Kaveri APU was found to be similar to that of the 100 W Richland part, leading to the claim that AMD had made significant improvements in on-die graphics performance per watt; however, CPU performance was found to lag behind similarly specified Intel processors, a lag that was unlikely to be resolved in the Bulldozer family APUs. The A8-7600 component was delayed from a Q1 launch to an H1 launch because the Steamroller architecture components allegedly did not scale well at higher clock speeds.
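
The performance-per-watt claim for the A8-7600 can be made concrete with a rough calculation. The sketch below is only an estimate under two assumptions that the reviews do not state exactly: that TDP is a usable proxy for actual power draw, and that "similar" performance means a ratio of 1.0. The function name `perf_per_watt_gain` is hypothetical.

```python
def perf_per_watt_gain(old_tdp_w, new_tdp_w, relative_performance=1.0):
    """Estimate the performance-per-watt improvement of a new part.

    relative_performance is new performance divided by old performance;
    1.0 (assumed here) means the two parts perform the same.
    TDP is used as a stand-in for real power draw, which is an assumption.
    """
    return relative_performance * old_tdp_w / new_tdp_w


# 100 W Richland part versus the 45 W A8-7600, assuming equal performance.
gain = perf_per_watt_gain(old_tdp_w=100, new_tdp_w=45)
print(f"{gain:.2f}x")  # prints "2.22x"
```

Under these assumptions the 45 W part delivers roughly 2.2 times the performance per watt of the 100 W part; measured power draw under load would give a more reliable figure than TDP.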
AMD announced the release of the Kaveri APU for the mobile market on 4 June 2014 at Computex 2014, shortly after the accidental announcement on the AMD website on 26 May 2014. The announcement included components targeted at the standard voltage, low-voltage, and ultra-low voltage segments of the market. In early-access performance testing of a Kaveri prototype laptop, AnandTech found that the 35 W FX-7600P was competitive with the similarly-priced 17 W Intel i7-4500U in synthetic CPU-focused benchmarks, and was significantly better than previous integrated GPU systems on GPU-focused benchmarks. Tom's Hardware reported the performance of the Kaveri FX-7600P against the 35 W Intel i7-4702MQ, finding that the i7-4702MQ was significantly better than the FX-7600P in synthetic CPU-focused benchmarks, whereas the FX-7600P was significantly better than the i7-4702MQ's Intel HD 4600 iGPU in the four games that could be tested in the time available to the team.

Puma architecture (2014): Beema and Mullins