In the previous article, we reviewed several approaches to applying attention to vision models. In this article, we continue that discussion, present a few additional attention-based vision models, and discuss their advantages over traditional approaches.

Stand-Alone Self-Attention

Ramachandran et al. (2019) proposed an attention mechanism similar to the 2-D attention in Image Transformer [1].
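To make the idea concrete, below is a minimal, single-head PyTorch sketch of local 2-D self-attention, where each pixel attends only to the k x k window centred on it. The class name LocalSelfAttention2d is hypothetical, and the relative positional embeddings used in the paper are omitted for brevity; this is an illustrative sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalSelfAttention2d(nn.Module):
    """Single-head local self-attention over a k x k neighbourhood.

    Simplified sketch: each output pixel attends only to its k x k
    window, replacing a convolution's fixed weights with attention
    weights. (Relative positional embeddings are omitted here.)
    """
    def __init__(self, channels, kernel_size=7):
        super().__init__()
        self.k = kernel_size
        self.scale = channels ** -0.5
        # 1x1 convolutions produce per-pixel queries, keys, and values
        self.to_q = nn.Conv2d(channels, channels, 1, bias=False)
        self.to_k = nn.Conv2d(channels, channels, 1, bias=False)
        self.to_v = nn.Conv2d(channels, channels, 1, bias=False)

    def forward(self, x):
        b, c, h, w = x.shape
        pad = self.k // 2  # "same" padding, assumes odd kernel size
        q = self.to_q(x)                                 # (b, c, h, w)
        # unfold gathers the k*k neighbours of every spatial position
        k = F.unfold(self.to_k(x), self.k, padding=pad)  # (b, c*k*k, h*w)
        v = F.unfold(self.to_v(x), self.k, padding=pad)  # (b, c*k*k, h*w)
        k = k.view(b, c, self.k * self.k, h * w)
        v = v.view(b, c, self.k * self.k, h * w)
        q = q.view(b, c, 1, h * w)
        # scaled dot-product attention over the k*k neighbours
        attn = (q * k).sum(dim=1, keepdim=True) * self.scale  # (b, 1, k*k, h*w)
        attn = attn.softmax(dim=2)
        out = (attn * v).sum(dim=2)                            # (b, c, h*w)
        return out.view(b, c, h, w)

layer = LocalSelfAttention2d(channels=64, kernel_size=7)
y = layer(torch.randn(1, 64, 32, 32))  # -> (1, 64, 32, 32)
```

Because the attention weights are computed from the content of each window rather than learned as fixed filters, a layer like this can, in principle, replace spatial convolutions throughout a network while keeping the computation local.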