The neural processing unit maker said on Thursday that the two companies will evolve FuriosaAI's proprietary tensor contraction processor architecture into a multi-die chiplet system, building an engine tuned for the high-volume token processing that global hyperscale data centers increasingly demand.
The collaboration builds on FuriosaAI's second-generation accelerator, RNGD, a 180-watt PCIe chip now in mass production.
Fabricated on TSMC's 5-nanometer process and paired with SK hynix HBM3 memory, the accelerator is optimized for large language models and agentic AI workloads, and has already been validated in production at customers including Samsung SDS and LG AI Research.
The third-generation accelerator will carry a 2-nanometer compute die and HBM4/4E memory, drawing on Broadcom's advanced packaging to fuse multiple silicon dies into a single high-performance chip.
The two firms plan to begin sampling the chip in the first half of 2028, as surging demand for agentic AI pushes inference workloads to outpace the training tasks that first fueled the generative AI boom.
Copyright ⓒ Aju Press All rights reserved.