Can you elaborate on what an attention model entails?
Machine Learning Engineer
Microsoft
Amazon
Databricks
Walmart
Arm
Klarna
Answers
Anonymous
7 months ago
An attention model is an example of a hyper network where the weights of model are determined by the input itself. In the attention mechanism, this occurs in that each token of the input sequence is compared with all others in the context window to determine the next token. The Attention mechanism by default does not care about the order of the input, which is ironic because of the success it has found in next token prediction. This is the basis for the LLMs. The prompt (which may be modified on the backend) will be used and then the next token will be predicted for the answer, and then the answer is built up token by token.
Interview question asked to Machine Learning Engineers interviewing at Virta, Motorola Solutions, Coupang and others: Can you elaborate on what an attention model entails?.