What is self-attention, and how does it work in transformers?

1 Answer


Before the introduction of the Transformer model, attention for neural machine translation was implemented with RNN-based encoder-decoder architectures. The Transformer revolutionized this by dispensing with recurrence and convolutions entirely and instead relying solely on a self-attention mechanism.

This answer focuses on the Transformer attention mechanism for neural machine translation; the Transformer model as a whole is reviewed separately.

After reading it, you will know:

  1. How the Transformer attention differs from its predecessors
  2. How the Transformer computes scaled dot-product attention (see the sketch after this list)
  3. How the Transformer computes multi-head attention (also shown in the sketch below)
...
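To make points 2 and 3 concrete, here is a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, and of multi-head attention built on top of it. The function and variable names (scaled_dot_product_attention, multi_head_attention, W_q, W_k, W_v, W_o) are my own for illustration, not from the original tutorial, and the sketch leaves out details such as dropout and the padding/look-ahead masks used in the full Transformer.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    # Similarity scores between queries and keys, scaled by sqrt(d_k)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)
    if mask is not None:
        # Positions where mask is False are suppressed before the softmax
        scores = np.where(mask, scores, -1e9)
    # Softmax over the key dimension gives the attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def multi_head_attention(X, W_q, W_k, W_v, W_o, num_heads):
    """Project into heads, attend per head, then concatenate and project back."""
    batch, seq_len, d_model = X.shape
    d_head = d_model // num_heads

    def split_heads(M):
        # (batch, seq, d_model) -> (batch * heads, seq, d_head)
        return (M.reshape(batch, seq_len, num_heads, d_head)
                 .transpose(0, 2, 1, 3)
                 .reshape(batch * num_heads, seq_len, d_head))

    # Self-attention: queries, keys, and values are all projections of X
    Q, K, V = split_heads(X @ W_q), split_heads(X @ W_k), split_heads(X @ W_v)
    heads = scaled_dot_product_attention(Q, K, V)
    # Concatenate the heads and apply the output projection
    heads = (heads.reshape(batch, num_heads, seq_len, d_head)
                  .transpose(0, 2, 1, 3)
                  .reshape(batch, seq_len, d_model))
    return heads @ W_o

# Toy usage with made-up dimensions
rng = np.random.default_rng(0)
batch, seq_len, d_model, num_heads = 2, 5, 8, 2
X = rng.normal(size=(batch, seq_len, d_model))
W_q, W_k, W_v, W_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
out = multi_head_attention(X, W_q, W_k, W_v, W_o, num_heads)
print(out.shape)  # (2, 5, 8): one d_model-sized output vector per input position
```

Because the queries, keys, and values all come from the same sequence X, this is self-attention: every position attends to every other position in the same sentence, and the scaling by sqrt(d_k) keeps the dot products from growing too large before the softmax.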