in Amazon Elastic Compute Cloud EC2
When would I use Inf1 vs. C5 vs. G4 instances for inference in Amazon EC2?

1 Answer


Customers running machine learning models that are sensitive to inference latency and throughput can use Inf1 instances for high-performance, cost-effective inference. For ML models that are less sensitive to inference latency and throughput, customers can use EC2 C5 instances and take advantage of the AVX-512/VNNI instruction set. For ML models that require access to NVIDIA's CUDA, cuDNN, or TensorRT libraries, we recommend using G4 instances.

Model Characteristics and Libraries Used                              | EC2 Inf1 | EC2 C5 | EC2 G4
Models that benefit from low latency and high throughput at low cost  |    X     |        |
Models not sensitive to latency and throughput                        |          |   X    |
Models requiring NVIDIA's developer libraries                         |          |        |   X
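The decision logic above can be sketched as a small helper. This is only an illustration of the guidance in this answer, not an AWS API; the function name and parameters are assumptions:

```python
def pick_inference_instance(latency_sensitive: bool, needs_nvidia_libs: bool) -> str:
    """Suggest an EC2 instance family for ML inference.

    Follows the guidance above: G4 when NVIDIA CUDA/cuDNN/TensorRT
    libraries are required, Inf1 for latency/throughput-sensitive
    models needing low cost, and C5 (AVX-512/VNNI) otherwise.
    """
    if needs_nvidia_libs:
        return "g4"    # NVIDIA developer libraries required
    if latency_sensitive:
        return "inf1"  # AWS Inferentia: high throughput at low cost
    return "c5"        # CPU inference using AVX-512/VNNI
```

For example, a CUDA-dependent model maps to `g4` regardless of its latency profile, since the library requirement is the hard constraint.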
