1. S - CUDA Intro
S - [nn libs] Reinventing primitives, such as matrix multiply, from scratch
T - C++ libs
  - Many optimized neural-network libraries
    - cuDNN for neural network primitives
    - cuBLAS for linear algebra
    - NCCL for multi-GPU communication
S - [nn libs] Barrier to entry for Python developers
  - Most of the CUDA toolkit libraries are C++ based
T - Python-based libraries built upon the C++ toolkit
  - Prefixed with “cu”
  - cuTile
    - Breaks large matrices on GPUs into smaller, more manageable sub-matrices called “tiles”
    - Takes full advantage of the GPU’s parallelism without needing to manage low-level details manually
  - cuPyNumeric
    - A drop-in replacement for the popular NumPy Python library
    - Offloads work to the GPU
    - Significant performance gains for compute-intensive tasks such as large-scale numerical computations, matrix operations, and data analysis
R ...
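
The tiling and drop-in ideas in the notes above can be sketched in a few lines of Python. This is only an illustration, not the cuTile API: `tiled_matmul` and its `tile` parameter are hypothetical names made up for this example, and the module name `cupynumeric` is an assumption about how cuPyNumeric is installed. When that import fails, the sketch falls back to plain NumPy so it still runs on a CPU.

```python
# Drop-in idea: the same code runs against either backend.
try:
    import cupynumeric as np  # assumption: cuPyNumeric installed under this module name
except ImportError:
    import numpy as np  # CPU fallback so the sketch still runs


def tiled_matmul(a, b, tile=2):
    """Multiply a (m x k) by b (k x n) one tile at a time.

    Hypothetical helper that only illustrates tiling; a real GPU library
    would schedule the tiles in parallel rather than loop over them.
    """
    m, k = a.shape
    k2, n = b.shape
    if k != k2:
        raise ValueError("inner dimensions must match")

    c = np.zeros((m, n), dtype=a.dtype)
    for i in range(0, m, tile):          # tile rows of the output
        for j in range(0, n, tile):      # tile columns of the output
            for p in range(0, k, tile):  # walk along the shared dimension
                # Each output tile accumulates products of small sub-matrices.
                # Slices that run past the edge are clipped automatically.
                c[i:i + tile, j:j + tile] += (
                    a[i:i + tile, p:p + tile] @ b[p:p + tile, j:j + tile]
                )
    return c


if __name__ == "__main__":
    a = np.arange(12, dtype=float).reshape(3, 4)
    b = np.arange(8, dtype=float).reshape(4, 2)
    print(np.allclose(tiled_matmul(a, b), a @ b))
```

The explicit loops are there to make the tile structure visible. In practice one would call `a @ b` directly and let the library (cuBLAS underneath, on a GPU) choose the tiling.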