12.4.1 : Keynotes



Keynotes : https://www.youtube.com/watch?v=DiGB5uAYKAg :
  • At 21:00: GTC 2024 Keynote [S62542]
  • Slides
  • TSMC going into production with cuLitho
  • Blackwell (named after David Blackwell) is a GPU but above all a platform, 674 ExaFlops
  • 2 dies connected together with 10 TB/s of bandwidth => they behave as one single chip
  • When people said Blackwell was beyond the laws of physics, the engineers said: "So what?"
  • Nvidia named the GPU architecture for mathematician David Harold Blackwell, the first Black inductee into the U.S. National Academy of Sciences.
  • David Harold Blackwell (April 24, 1919 – July 8, 2010) was an American statistician and mathematician who made significant contributions to game theory, probability theory, information theory, and statistics. He is one of the eponyms of the Rao–Blackwell theorem. He was the first African American inducted into the National Academy of Sciences, the first African American full professor (with tenure) at the University of California, Berkeley, and the seventh African American to receive a Ph.D. in mathematics. In 2012, President Obama posthumously awarded Blackwell the National Medal of Science. Blackwell was also a pioneer in textbook writing. He wrote one of the first Bayesian statistics textbooks, his 1969 Basic Statistics. By the time he retired, he had published over 90 papers and books on dynamic programming, game theory, and mathematical statistics.
  • 208 billion transistors
  • No memory-locality issues and no cache issues between the two dies
  • 2nd-generation Transformer Engine: automatically rescales and recasts to lower precision when possible
  • Too many components running at the same time => RAS engine (Reliability, Availability and Serviceability) => reliability through automatic self-test across the cluster (every bit, every gate checked)
  • Data is encrypted, even over the network
  • High speed compression engine
  • All these features aim to keep Blackwell as busy as possible
  • NVLink Switch: 50B transistors, 72-port dual 200 Gb/s SerDes, 4 NVLinks at 1.8 TB/s, 7.2 TB/s full-duplex bandwidth (GPUs all-to-all at full speed), SHARP in-network compute at 3.6 TFlops FP8
  • First 1 EFlops in a single rack (5000 NVLink cables, 2 miles / 3.2 km in total)
    • Normally 2-20 kW for optical transceivers (transmitters + receivers) => saved with NVLink
    • Water in at 25°C, out at 45°C, at 2 L per second
    • (3000 pounds, i.e. 1.5 short tons or about 1.36 tonnes, and 600,000 parts)
  • 2000 GPUs -> 2 MW
  • Everything at NVIDIA has a digital twin
  • Weather simulation with CorrDiff: resolution from 25 km -> 2 km
  • NIM: NVIDIA Inference Microservice => software distributed inside containers
    • Chip designer chatbot
  • Omniverse cloud streams to Apple Vision Pro
  • Thor used by BYD
  • Isaac Perceptor
  • Jetson -> Blackwell for embedded systems (there are also versions with Hopper and Ampere)
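
Several figures in the notes above can be sanity-checked with simple arithmetic. The sketch below is only a back-of-the-envelope check, not NVIDIA data: the function names are hypothetical, the water constants are textbook approximations, and the cooling estimate assumes all heat leaves through the water loop.

```python
# Back-of-the-envelope checks on figures quoted in the notes above.
# Function names are illustrative; constants are textbook values, not NVIDIA data.

MILE_IN_KM = 1.609344          # international mile, exact by definition
POUND_IN_KG = 0.45359237       # avoirdupois pound, exact by definition
WATER_SPECIFIC_HEAT = 4186.0   # J/(kg*K), approximate for liquid water
WATER_DENSITY = 1.0            # kg/L, approximate


def nvlink_switch_bandwidth_tbs(links=4, per_link_tbs=1.8):
    """Aggregate bandwidth of the NVLink Switch: links x per-link bandwidth."""
    return links * per_link_tbs


def cable_length_km(miles=2.0):
    """Total NVLink cable length in the rack, converted to kilometres."""
    return miles * MILE_IN_KM


def rack_mass_tonnes(pounds=3000.0):
    """Rack mass converted from pounds to metric tonnes."""
    return pounds * POUND_IN_KG / 1000.0


def cooling_power_kw(flow_l_per_s=2.0, t_in_c=25.0, t_out_c=45.0):
    """Heat carried away by the water loop: P = m_dot * c * delta_T."""
    mass_flow_kg_per_s = flow_l_per_s * WATER_DENSITY
    watts = mass_flow_kg_per_s * WATER_SPECIFIC_HEAT * (t_out_c - t_in_c)
    return watts / 1000.0


def power_per_gpu_kw(total_mw=2.0, gpus=2000):
    """Average power budget per GPU for the quoted cluster figure."""
    return total_mw * 1000.0 / gpus


if __name__ == "__main__":
    print(f"NVLink Switch bandwidth: {nvlink_switch_bandwidth_tbs():.1f} TB/s")
    print(f"Cable length: {cable_length_km():.2f} km")
    print(f"Rack mass: {rack_mass_tonnes():.2f} t")
    print(f"Heat removed by water: {cooling_power_kw():.0f} kW")
    print(f"Power per GPU: {power_per_gpu_kw():.1f} kW")
```

Under these assumptions, a 20 K temperature rise at 2 L/s corresponds to roughly 167 kW of heat removed, and 2 MW spread over 2000 GPUs is an average of 1 kW per GPU, including whatever overhead that quoted total covers.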