Q:
When would I use Inf1 vs. C5 vs. G4 instances for inference in Amazon EC2?

A:

Customers running machine learning models that are sensitive to inference latency and throughput can use Inf1 instances for high-performance, cost-effective inference. For ML models that are less sensitive to inference latency and throughput, customers can use EC2 C5 instances and take advantage of the AVX-512 VNNI instruction set. For ML models that require access to NVIDIA's CUDA, cuDNN, or TensorRT libraries, we recommend using G4 instances.

| Model Characteristics and Libraries Used | EC2 Inf1 | EC2 C5 | EC2 G4 |
| --- | --- | --- | --- |
| Models that benefit from low latency and high throughput at low cost | X | | |
| Models not sensitive to latency and throughput | | X | |
| Models requiring NVIDIA's developer libraries | | | X |
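The decision table above can be sketched as a small helper. This is a minimal illustration, not an AWS API; the function and parameter names are hypothetical, though the instance family names (`inf1`, `c5`, `g4dn`) are real EC2 families:

```python
def recommend_instance(latency_sensitive: bool, needs_nvidia_libs: bool) -> str:
    """Suggest an EC2 instance family for ML inference (hypothetical helper
    encoding the guidance above, not an official AWS function)."""
    if needs_nvidia_libs:
        # CUDA, cuDNN, or TensorRT require an NVIDIA GPU -> G4 (g4dn)
        return "g4dn"
    if latency_sensitive:
        # Latency/throughput-sensitive models -> Inf1 (AWS Inferentia)
        return "inf1"
    # Less demanding models can run on CPU with AVX-512 VNNI -> C5
    return "c5"


print(recommend_instance(latency_sensitive=True, needs_nvidia_libs=False))
```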
