The linear algebra performed on NVIDIA GPUs' tensor cores (available since the Turing architecture), combined with the CUDA and cuDNN software stack, delivers the fastest training of deep neural networks. That may not last forever, but in dollars per FLOPS it's currently the best option available to an average DNN developer like me.