ダウンロード
読み込み中…
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
6月 2020
DeBERTa (Decoding-enhanced BERT with disentangled attention) improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the…