Is attention mask wrong for batch generation?

#33
by qingsonglv - opened

For batch generation, the attention_mask is set to a single 1 (see this line: https://huggingface.co/THUDM/chatglm-6b/blob/main/modeling_chatglm.py#L948).

However, for a batch with varying sequence lengths, the left-padded tokens are not masked out in that case.

The position ids presumably have the same problem (see the sketch below for what masked padding would normally look like).
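
For reference, here is a minimal sketch of the usual left-padding convention in decoder-only models: pad positions get 0 in the attention_mask, and position ids start counting at the first real token. The token and pad ids are placeholders, and this is not the ChatGLM-specific 2D attention/position scheme, just the general behavior the question is asking about.

```python
# Sketch only: standard left-padding handling, not the ChatGLM implementation.
import torch

pad_id = 0                                            # placeholder pad token id
batch = torch.tensor([[pad_id, pad_id, 11, 12, 13],   # shorter prompt, left-padded
                      [21, 22, 23, 24, 25]])          # full-length prompt

# 1 for real tokens, 0 for padding: a per-token mask rather than a single scalar 1.
attention_mask = (batch != pad_id).long()

# Position ids that start at the first real token, so padding does not shift
# the positional encoding of the real tokens.
position_ids = (attention_mask.cumsum(-1) - 1).clamp(min=0)

print(attention_mask)
# tensor([[0, 0, 1, 1, 1],
#         [1, 1, 1, 1, 1]])
print(position_ids)
# tensor([[0, 0, 0, 1, 2],
#         [0, 1, 2, 3, 4]])
```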

Seems like it was my mistake... there's no bug.

qingsonglv changed discussion status to closed
