Several types of attention modules implemented in PyTorch for learning purposes.
transformers pytorch transformer attention attention-mechanism softmax-layer multi-head-attention multi-query-attention grouped-query-attention scale-dot-product-attention
Updated Oct 1, 2024 · Python
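For orientation, here is a minimal sketch of scaled dot-product attention in PyTorch, the building block behind the multi-head, multi-query, and grouped-query variants listed above. This is an illustrative example only, not code from the repository; the function name and tensor shapes are assumptions for the sketch.

```python
# Minimal, self-contained sketch of scaled dot-product attention (illustrative,
# not the repository's implementation).
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, head_dim)
    d_k = q.size(-1)
    # Scale scores by sqrt(d_k) to keep softmax inputs in a stable range.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 are excluded from attention.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Example usage with random tensors.
q = k = v = torch.randn(2, 4, 8, 16)  # (batch, heads, seq, head_dim)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 4, 8, 16])
```

The multi-query and grouped-query variants differ mainly in how many key/value heads are kept: one shared key/value head for multi-query attention, and a small number of shared groups for grouped-query attention.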