Automated metrics such as the Inception Score (IS, higher is better) and the Fréchet Inception Distance (FID, lower is better), together with human evaluation, are used to assess the quality, diversity, and realism of generated samples.
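
To make the Inception Score more concrete, here is a minimal sketch of the usual definition, IS = exp(E_x[KL(p(y|x) || p(y))]). It assumes the per-sample class probabilities p(y|x) have already been produced by a pretrained classifier (Inception-v3 in the standard setup). The function name `inception_score` and the toy inputs are illustrative assumptions, not part of any particular library.

```python
import math


def inception_score(probs):
    """Compute the Inception Score from per-sample class probabilities.

    probs: list of rows, one per generated sample, where each row is the
    classifier's predicted distribution p(y|x) and sums to 1.
    (Hypothetical helper for illustration only.)
    """
    n = len(probs)
    num_classes = len(probs[0])

    # Marginal class distribution p(y): the average prediction over all samples.
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]

    # Sum of KL(p(y|x) || p(y)) over samples. Terms with p == 0 contribute 0,
    # and p > 0 implies the matching marginal entry is > 0, so the log is safe.
    total_kl = 0.0
    for row in probs:
        total_kl += sum(p * math.log(p / m) for p, m in zip(row, marginal) if p > 0)

    # Exponentiate the mean KL divergence.
    return math.exp(total_kl / n)


# Toy inputs: confident AND diverse predictions score high ...
diverse = inception_score([[1.0, 0.0], [0.0, 1.0]])
# ... while confident but collapsed (single-class) predictions score 1.
collapsed = inception_score([[1.0, 0.0], [1.0, 0.0]])
# Uncertain predictions also give the minimum score of 1.
uncertain = inception_score([[0.5, 0.5], [0.5, 0.5]])
```

With two classes the score is bounded between 1 and 2: the `diverse` case reaches approximately 2, while `collapsed` and `uncertain` both stay at 1. This shows why IS rewards samples that are individually recognizable (sharp p(y|x)) and collectively varied (broad p(y)). Unlike FID, it never compares against real data, which is one reason the two metrics are often reported together.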