
Hardware for Machine Learning

CS6787 Lecture 11 — Fall 2017

Recap: modern ML hardware

  • Lots of different types
    • CPUs
    • GPUs
    • FPGAs
    • Specialized accelerators
  • Right now, GPUs are dominant… we’ll get to why later

What does a modern machine learning pipeline look like?

  • Many different components (a code sketch follows below):
    • Preprocessing of the training set
    • DNN training
    • DNN inference on new examples to be processed
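As a rough illustration of these stages (not from the lecture), here is a minimal PyTorch sketch; the data, model, and hyperparameters are all placeholder choices:

```python
# Minimal sketch of the pipeline stages above (PyTorch; all choices are placeholders).
import torch
import torch.nn as nn

# Preprocessing of the training set: normalize synthetic raw features.
raw = torch.rand(1024, 16)
labels = (raw.sum(dim=1) > 8).long()
x = (raw - raw.mean(dim=0)) / raw.std(dim=0)

# DNN training: a few SGD steps on a tiny network.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(10):
    opt.zero_grad()
    loss_fn(model(x), labels).backward()
    opt.step()

# DNN inference: new examples to be processed.
new_examples = torch.rand(4, 16)
with torch.no_grad():
    preds = model(new_examples).argmax(dim=1)
```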

Where can hardware help?

  • Everywhere!
  • There’s interest in using hardware everywhere in the pipeline
    • both adapting existing hardware architectures, and
    • developing new ones
  • What improvements can we get?
    • Lower latency inference
    • Higher throughput training
    • Lower power cost

Why are GPUs so popular for machine learning?

Why are GPUs so popular for training deep neural networks?

FLOPS: GPU vs CPU

  • FLOPS: floating point operations per second
  • [Figure from Karl Rupp’s blog: FLOPs per cycle for CPUs, GPUs, and Xeon Phis over time; “the best diagram I could find that shows trends over time” (https://www.karlrupp.net/2016/…/flops-per-cycle-for-cpus-gpus-and-xeon-phis/)]
  • GPU FLOPS consistently exceed CPU FLOPS
  • Intel Xeon Phi chips are compute-heavy manycore processors that compete with GPUs
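A quick way to see this gap on your own machine is to time a large dense matmul on each device. A minimal sketch assuming PyTorch; the matrix size and repetition count are arbitrary:

```python
# Rough achieved-FLOPS comparison via a large dense matmul.
# Sketch assuming PyTorch; size and repetitions are arbitrary.
import time
import torch

def matmul_gflops(device, n=4096, reps=10):
    a = torch.rand(n, n, device=device)
    b = torch.rand(n, n, device=device)
    a @ b                                    # warm-up run
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(reps):
        a @ b
    if device == "cuda":
        torch.cuda.synchronize()
    flops = 2 * n ** 3 * reps                # ~2n^3 FLOPs per n-by-n matmul
    return flops / (time.perf_counter() - start) / 1e9

print("CPU GFLOPS:", matmul_gflops("cpu"))
if torch.cuda.is_available():
    print("GPU GFLOPS:", matmul_gflops("cuda"))
```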

Memory bandwidth: CPU vs GPU

  • GPUs have higher memory bandwidth than CPUs
    • E.g. the new NVIDIA Tesla V100 has a claimed 900 GB/s memory bandwidth
    • Whereas an Intel Xeon E7 has only about 100 GB/s memory bandwidth
  • But this comparison is unfair!
    • GPU memory bandwidth is the bandwidth to the GPU’s own memory
    • E.g. over PCIe 2, bandwidth between host and GPU is only 32 GB/s
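To see the asymmetry directly, one can time an on-device copy against a host-to-device copy over PCIe. A hedged sketch assuming PyTorch and a CUDA GPU; the buffer size is arbitrary:

```python
# Contrast on-GPU memory bandwidth with host-to-GPU (PCIe) bandwidth.
# Sketch assuming PyTorch and a CUDA GPU; buffer size is arbitrary.
import time
import torch

assert torch.cuda.is_available(), "needs a CUDA GPU"
n_bytes = 1 << 28                                        # 256 MB buffer
host = torch.empty(n_bytes, dtype=torch.uint8).pin_memory()
dev = torch.empty(n_bytes, dtype=torch.uint8, device="cuda")

def gb_per_s(fn, reps=10):
    fn()                                                 # warm-up
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(reps):
        fn()
    torch.cuda.synchronize()
    return reps * n_bytes / (time.perf_counter() - start) / 1e9

# clone() reads and writes the buffer, so true device bandwidth is ~2x this figure.
print("on-GPU copy :", gb_per_s(lambda: dev.clone()), "GB/s")
print("host -> GPU :", gb_per_s(lambda: dev.copy_(host, non_blocking=True)), "GB/s")
```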

Challengers to the GPU

  • More compute-intensive CPUs
    • Like Intel’s Phi line, which promises the same level of compute performance and better handling of sparsity (see the sketch after this list)
  • Low-power devices
    • Like mobile-device-targeted chips
  • Configurable hardware like FPGAs and CGRAs
  • Accelerators that speed up matrix-matrix multiply
    • Like Google’s TPU
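To see why “better handling of sparsity” matters, compare the work in a dense versus a sparse matrix-vector multiply. A sketch using NumPy/SciPy; the size and density are arbitrary:

```python
# Why sparsity support matters: a sparse matvec touches only the nonzeros.
# Sketch using NumPy/SciPy; size and density are arbitrary.
import numpy as np
import scipy.sparse as sp

n, density = 2000, 0.001
A_sparse = sp.random(n, n, density=density, format="csr", dtype=np.float64)
A_dense = A_sparse.toarray()
x = np.random.rand(n)

# Dense matvec costs ~2*n^2 FLOPs; sparse costs ~2*nnz.
print(f"FLOPs saved: {(2 * n * n) / (2 * A_sparse.nnz):.0f}x")

assert np.allclose(A_dense @ x, A_sparse @ x)   # same result, far less work
```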

Will all computation become dense matrix-matrix multiply?

What if dense matrix multiply takes over?

  • Great opportunities for new highly specialized hardware
    • The TPU is already an example of this
    • It’s a glorified matrix-matrix multiply engine (see the lowering sketch after this list)
  • Significant power savings from specialized hardware
    • But not as much as if we could use something like sparsity
  • It might put us all out of work
    • Who cares about researching algorithms when there’s only one algorithm anyone cares about?
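One reason a matmul engine covers so much of deep learning is that other operations can be lowered to dense matmul; for instance, 2D convolution becomes a single matmul via the standard im2col transformation. A NumPy sketch (the helper name is ours, not from the lecture):

```python
# Lowering a 2D convolution to one dense matmul via im2col (illustrative sketch).
import numpy as np

def im2col_conv2d(image, kernels):
    """image: (H, W); kernels: (K, kh, kw) -> output: (K, H-kh+1, W-kw+1)."""
    H, W = image.shape
    K, kh, kw = kernels.shape
    oh, ow = H - kh + 1, W - kw + 1
    # Gather every kh x kw patch into a column -> (kh*kw, oh*ow) matrix.
    cols = np.stack([image[i:i + oh, j:j + ow].ravel()
                     for i in range(kh) for j in range(kw)])
    # The convolution is now a single dense matmul: (K, kh*kw) @ (kh*kw, oh*ow).
    out = kernels.reshape(K, kh * kw) @ cols
    return out.reshape(K, oh, ow)

image = np.random.rand(8, 8)
kernels = np.random.rand(3, 3, 3)
print(im2col_conv2d(image, kernels).shape)   # (3, 6, 6)
```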

What if matrix multiply doesn’t take over?

  • Great opportunities for designing new heterogeneous, application-specific hardware
  • We might want one chip for SVRG, one chip for low-precision
  • Interesting systems/framework opportunities to give users suggestions for which chips to use
    • Or even to automatically dispatch work within a heterogeneous datacenter (a toy dispatch sketch follows below)
  • Community might fragment
    • Into smaller subgroups working on particular problems
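As a toy illustration of the automatic-dispatch idea, a framework might route each workload to a device at runtime; the policy below is entirely hypothetical, assuming PyTorch:

```python
# Toy dispatch policy for a heterogeneous setup (entirely hypothetical).
import torch

def pick_device(workload: str) -> torch.device:
    # A real scheduler would profile; here we just route dense,
    # throughput-bound work to the GPU when one is available.
    if workload == "dense_matmul" and torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

x = torch.rand(1024, 1024, device=pick_device("dense_matmul"))
y = x @ x
print(y.device)
```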

Recent work on hardware for machine learning

Abstracts from papers at architecture conferences this year

Questions?

  • Conclusion
    • Lots of interesting work on hardware for machine learning
    • Lots of opportunities for interdisciplinary research
  • Upcoming things
    • Paper Review #10 — due today
    • Project proposal — due today
    • Paper Presentation #11 on Wednesday — TPU