Both **shallow and deep networks** are universal approximators: with a suitable nonlinear activation and enough hidden units, either can approximate any continuous function on a bounded domain to arbitrary accuracy. For the same level of accuracy, however, **deeper networks** can be much more efficient in computation and in number of parameters. Deeper networks can build deep representations: at every layer, the network learns a new, more **abstract representation** of the input, computed from the output of the layer before it.
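
The efficiency claim can be made concrete with a small sketch. The example below is illustrative and not taken from the text above: it assumes a scalar input on [0, 1], ReLU activations, and a hand-built "tent" layer with two hidden units (the names `tent_layer`, `deep_sawtooth`, and `count_linear_pieces` are hypothetical). Stacking that layer `depth` times yields a sawtooth with `2**depth` linear pieces, while a network with a single hidden layer of `n` ReLU units on a scalar input has at most `n + 1` linear pieces, so it would need at least `2**depth - 1` units to represent the same function exactly.

```python
def relu(x):
    return max(0.0, x)


def tent_layer(x):
    # One hidden layer with two ReLU units (weights chosen by hand, not learned).
    # On [0, 1] it maps x to 2x for x <= 0.5 and to 2(1 - x) for x > 0.5.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def deep_sawtooth(x, depth):
    # Compose the same two-unit layer `depth` times.
    for _ in range(depth):
        x = tent_layer(x)
    return x


def count_linear_pieces(f, n_intervals=4096):
    # Evaluate f on a dyadic grid over [0, 1] and count slope changes.
    # All values here are dyadic rationals, so the arithmetic is exact
    # as long as depth <= 12 for the default grid.
    xs = [i / n_intervals for i in range(n_intervals + 1)]
    ys = [f(x) for x in xs]
    slopes = [ys[i + 1] - ys[i] for i in range(n_intervals)]
    changes = sum(1 for a, b in zip(slopes, slopes[1:]) if a != b)
    return changes + 1


for depth in (1, 2, 4, 8):
    pieces = count_linear_pieces(lambda x: deep_sawtooth(x, depth))
    deep_units = 2 * depth          # two ReLU units per layer
    shallow_lower_bound = pieces - 1  # n units give at most n + 1 pieces
    print(f"depth={depth}: pieces={pieces}, deep units={deep_units}, "
          f"one-hidden-layer units >= {shallow_lower_bound}")
```

The deep network's unit count grows linearly with depth while the number of linear pieces grows exponentially. This is one specific constructed function rather than a statement about trained networks in general, but it shows how composition across layers can buy expressiveness that width alone has to pay for unit by unit.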