Frontier's Node Configuration: Supercomputer Genealogy (1/3)
The Intel roadmap updates have settled down for now, and the next step is to wait for Intel Innovation, to be held at the end of October. To tell the truth, there are a few stories from overseas about motherboards compatible with LGA 1700, but product announcements still seem to be some way off.
In the meantime an HPC topic has come up, so this time I will cover that instead.
AMD announces a 30x improvement in HPC performance efficiency by 2025
On September 29, AMD announced that by 2025 it would increase the performance efficiency of HPC and AI workloads 30-fold. The baseline is the platform as of 2020, so the claim is that the 2025 platform will deliver 30 times the performance/power consumption ratio of a second-generation EPYC system. Incidentally, it is not clear whether the GPU paired with it in the baseline is the Radeon Instinct MI100 or the previous-generation Radeon Instinct MI50.
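To put the 30-fold figure in perspective, here is a minimal sketch of the yearly improvement it implies. It assumes the target spans the five years from the 2020 baseline to 2025 and that efficiency improves at a steady compound rate. Both assumptions are mine rather than part of AMD's announcement, and the function names are purely illustrative.

```python
# Assumption (not from AMD): the 30x goal covers 2020 -> 2025 and
# performance-per-watt grows by the same factor every year.
TARGET_GAIN = 30.0
BASELINE_YEAR = 2020
TARGET_YEAR = 2025


def annual_factor(total_gain: float, years: int) -> float:
    """Return the constant yearly multiplier that compounds to total_gain."""
    return total_gain ** (1.0 / years)


def gain_after(years_elapsed: int,
               total_gain: float = TARGET_GAIN,
               years: int = TARGET_YEAR - BASELINE_YEAR) -> float:
    """Return the cumulative gain after years_elapsed on that steady path."""
    return annual_factor(total_gain, years) ** years_elapsed


if __name__ == "__main__":
    span = TARGET_YEAR - BASELINE_YEAR
    print(f"required yearly factor: {annual_factor(TARGET_GAIN, span):.3f}")
    for year in range(BASELINE_YEAR, TARGET_YEAR + 1):
        print(year, f"{gain_after(year - BASELINE_YEAR):.2f}x")
```

Under these assumptions, performance per watt has to roughly double every year, which gives a feel for how aggressive the target is.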
For AI, a 30-fold gain is not that difficult. In the first place, it is more accurate to say that EPYC and Radeon Instinct still do not really support AI: BF16 support has been added, but there is not yet a mechanism for performing AI processing efficiently.
In Intel terms, what is missing is the equivalent of VNNI and the Matrix Engine mounted in the Xe Core. With such mechanisms, achieving about 10 times the current efficiency is not so difficult. If anything, the current level is simply too low.
Add process and circuit improvements on top of this, and 30 times becomes achievable (though not easy). The difficulty lies rather in the HPC field, where some kind of accelerator will need to be considered, such as a large Matrix Engine equivalent to Intel's AMX.
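The arithmetic behind this reasoning can be sketched as follows. The sketch assumes that the gains multiply (architecture times process and circuit), which is my simplification; the 10x and 30x figures are the ones quoted above, and the five-year window and function names are illustrative assumptions.

```python
# Assumption: total efficiency gain = architectural gain x (process + circuit) gain.
TARGET_GAIN = 30.0         # AMD's stated goal
ARCHITECTURAL_GAIN = 10.0  # rough estimate for dedicated AI matrix hardware
YEARS = 5                  # assumed window, 2020 baseline to 2025


def remaining_factor(target: float, already_covered: float) -> float:
    """Return the gain still needed once part of the target is covered."""
    return target / already_covered


def per_year(factor: float, years: int) -> float:
    """Return the steady yearly multiplier that compounds to factor."""
    return factor ** (1.0 / years)


if __name__ == "__main__":
    rest = remaining_factor(TARGET_GAIN, ARCHITECTURAL_GAIN)
    print(f"left for process and circuit work: {rest:.1f}x")
    print(f"that is about {per_year(rest, YEARS):.2f}x per year")
```

Seen this way, the AI side of the target leaves roughly a 3x gap for process and circuit work, while an HPC workload that gets no such 10x head start has to find the whole 30x elsewhere.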
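At present, CDNA does not implement anything equivalent to the Matrix Engine in Intel's Xe or to NVIDIA's Tensor Core, so AMD will probably respond by implementing something along those lines (most likely something close to NVIDIA's Tensor Core Gen2, capable of FP64 matrix operations).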
As an aside, it is somewhat doubtful that AMD will support VNNI. VNNI is effectively tied to OpenVINO and oneAPI on the AI side, which raises the question of whether AMD can support oneAPI at all. It therefore seems more likely that AMD will implement AI accelerator instructions of its own and make them usable via ROCm.
Estimating the Frontier node configuration from the configuration AMD announced
By the way, everything so far has merely been a preamble. Sam Naffziger, AMD SVP and Fellow, gave an explanation of the 30-fold performance efficiency goal, and the images below are from it.
The ambition of achieving a rate of improvement 2.5 times the industry standard is also impressive.
The configuration in question. Now, how should this be read?
So what is this? It looks like "just one example", but the nodes of the Frontier supercomputer, which AMD and Cray (now HPE) are delivering to Oak Ridge National Laboratory in 2022, likewise use a 1x EPYC + 4x Radeon Instinct configuration.
This comes from the "Node Diagram" published on Oak Ridge National Laboratory's Frontier page.
Since this node configuration is so similar to the one shown earlier, it is reasonable to assume that Naffziger's image is based on the Frontier configuration.
The figure below is my estimate of the Frontier node configuration. First of all, the EPYC as drawn will not be ready in time, so the system will actually be Milan-based at first. The final configuration may well be Genoa-based, but a Genoa-based system is all but impossible for a 2021 delivery.
Estimated Frontier node configuration
The figure shows DDR5 memory attached, but this too will be DDR4 when the system is first delivered on Milan, and the board will likely be swapped for a DDR5-based one when it is later updated to Genoa. The Frontier operation timeline was covered in installment 510 of this series: installation runs from late 2021 to early 2022, with operation beginning in late 2022.
Frontier deployment schedule. CY is an abbreviation of Calendar Year.
In other words, it is quite plausible that the system will start operating on Milan and have its processor boards upgraded to Genoa along the way.
The Radeon Instinct, on the other hand, is thought to be a custom version completely different from the current MI100. In terms of interfaces, it appears to have at least the following configuration.