Arm’s Neoverse V2 core, codenamed Demeter, has been optimised for customer server workloads, including the BERT machine learning model.
The handling of BF16 instructions has been tuned to boost performance, and the core’s private L2 cache has been doubled to 2MB.
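For readers unfamiliar with the format, BF16 (bfloat16) keeps the sign bit and 8-bit exponent of a 32-bit IEEE 754 float but only 7 mantissa bits, so its encoding is the top 16 bits of the float32 encoding. The Python sketch below illustrates that layout using simple truncation. The function names are hypothetical, and hardware conversions typically round rather than truncate, so treat this as an illustration of the number format, not of how the V2 executes BF16 instructions.

```python
import struct

def float_to_bf16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern for x by truncating its float32 encoding."""
    # Pack as big-endian float32, then reinterpret the same bytes as a 32-bit integer.
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep sign (1 bit), exponent (8 bits) and the top 7 mantissa bits.
    return bits32 >> 16

def bf16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    # Pad the dropped low mantissa bits with zeros to rebuild a float32.
    (x,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return x

if __name__ == "__main__":
    bits = float_to_bf16_bits(3.14159265)
    print(hex(bits), bf16_bits_to_float(bits))
```

The round trip shows the trade-off: BF16 keeps the full float32 exponent range but only two to three significant decimal digits.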
Arm has also increased the core’s vector performance, with four 128-bit-wide pipelines for the Scalable Vector Extension 2 (SVE2).
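As a rough guide to what four 128-bit vector paths mean in practice, the arithmetic below counts how many data elements fit across them for several element sizes. It is a back-of-envelope sketch: it assumes every path handles one full 128-bit vector at a time, which is an upper bound and not a measured figure, and the names used are illustrative.

```python
VECTOR_BITS = 128      # width of one vector path, as quoted in the article
NUM_PATHS = 4          # number of vector paths, as quoted in the article

# Common element widths in bits (illustrative selection).
ELEMENT_BITS = {"int8": 8, "bf16": 16, "fp32": 32, "fp64": 64}

def elements_per_vector(element_bits: int, vector_bits: int = VECTOR_BITS) -> int:
    """Number of elements of the given width that fit in one vector."""
    return vector_bits // element_bits

def aggregate_elements(element_bits: int,
                       paths: int = NUM_PATHS,
                       vector_bits: int = VECTOR_BITS) -> int:
    """Elements held across all paths, assuming each carries a full vector."""
    return paths * elements_per_vector(element_bits, vector_bits)

if __name__ == "__main__":
    for name, width in ELEMENT_BITS.items():
        print(f"{name}: {elements_per_vector(width)} per vector, "
              f"{aggregate_elements(width)} across {NUM_PATHS} paths")
```

Under these assumptions the four paths together span 512 bits, or 32 BF16 values at once.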
“We have added icache coherency, workload specific optimisation eg BERT, larger cache and 48bit physical addressing to the V line for cloud workloads,” said Dermot O’Driscoll, vice president of product solutions.
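To put the 48-bit physical addressing in context, the short calculation below converts an address width into the amount of memory it can reach. It assumes byte-addressable memory, and the helper names are hypothetical.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits

def bytes_to_tib(num_bytes: int) -> float:
    """Convert bytes to tebibytes (1 TiB = 2**40 bytes)."""
    return num_bytes / 2**40

if __name__ == "__main__":
    total = addressable_bytes(48)
    print(f"48-bit physical addressing: {bytes_to_tib(total):.0f} TiB")
```

A 48-bit physical address therefore covers 256 TiB.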
The V3, codenamed Poseidon, is due next year and will support the CXL 3.0 standard for memory interconnect.
The roadmap is driven by the need to customise data centre silicon for specific workloads, combining CPU cores with AI accelerator cores and intelligent interconnect, either on a single chip or as chiplets.
Here the interconnect is key: the V2 and V3 will use the CMN-700 interconnect, which is based on Arm’s AMBA Coherent Hub Interface (CHI).
The CMN-700 will work with the CXL memory interconnect standards and with the UCIe chiplet interconnect standard.