It seems that in this line, the image conditioning for the query is concatenated before `all_to_all`, which means it is repeated `ulysses_size` times. I wonder if that might affect correctness?
Thanks for your reply.
In case 2, `joint_tensor_query` has the full sequence length before `all_to_all`. Doesn't that mean it is duplicated in an interleaved fashion with the image embedding along the seq dim (text_cond, image_seq_1, text_cond, image_seq_2, ...)?
This also seems to prevent packing qkv (only kv can be packed now) because the sequence lengths differ.
Would it be better to cat after `all_to_all`, inside the ring computation?
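To make the layout concern concrete, here is a minimal single-process sketch of the two orderings. It does not call xDiT or `torch.distributed`: `simulate_all_to_all` and the token labels are hypothetical stand-ins, and it assumes the text conditioning is placed in front of each rank's image shard and that the exchange scatters heads and gathers the sequence in rank order.

```python
def simulate_all_to_all(per_rank_seqs, num_heads):
    """Hypothetical stand-in for a Ulysses-style all_to_all.

    per_rank_seqs[i] lists the token labels rank i holds along the
    sequence dim (with all heads). After the exchange, rank j holds only
    its slice of heads, for every rank's local sequence concatenated in
    rank order.
    """
    world = len(per_rank_seqs)
    heads_per_rank = num_heads // world
    gathered = [tok for local in per_rank_seqs for tok in local]
    return [
        {
            "heads": list(range(j * heads_per_rank, (j + 1) * heads_per_rank)),
            "seq": list(gathered),
        }
        for j in range(world)
    ]


ulysses_size = 2
text = ["text_0", "text_1"]                             # full-length conditioning
image_shards = [["img_0", "img_1"], ["img_2", "img_3"]]  # one shard per rank

# Concatenate the full-length text on every rank BEFORE the exchange.
cat_before = simulate_all_to_all([text + shard for shard in image_shards], num_heads=4)

# Exchange the image shards only, then prepend the text AFTER the exchange.
exchanged = simulate_all_to_all(image_shards, num_heads=4)
cat_after = [{"heads": r["heads"], "seq": text + r["seq"]} for r in exchanged]

print(cat_before[0]["seq"])
print(cat_after[0]["seq"])
```

Under these assumptions the first layout contains the text tokens `ulysses_size` times, interleaved with the image shards, while the second contains them once, so q, k and v would all have the same sequence length.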
It does waste some communication bandwidth in exchange for dimension flexibility. I encourage you to give it a try and add support for variable-length sequences, which would definitely be valuable because it avoids text padding.
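As a rough illustration of why repeated query rows cost extra work without changing the other rows, the sketch below runs plain single-head softmax attention in pure Python. It is not the xDiT kernel: the `attention` helper and all values are made up for the example, and it says nothing about how the duplicated rows are handled after the reverse exchange.

```python
import math


def attention(q, k, v):
    """Plain softmax attention over lists of row vectors (single head)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[d] for w, vj in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out


# Illustrative values only.
text_q = [[0.1, 0.2], [0.3, -0.1]]
img_q = [[0.5, 0.4], [-0.2, 0.7]]
k = [[0.2, 0.1], [0.0, 0.3], [0.4, -0.5]]
v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Each text row appears once in the query.
baseline = attention(text_q + img_q, k, v)

# Text rows repeated and interleaved: (text, img_0, text, img_1).
duplicated = attention(text_q + img_q[:1] + text_q + img_q[1:], k, v)

# Each output row depends only on its own query row, so the repeated
# text rows yield repeated outputs and the image rows are unchanged.
print(duplicated[0:2] == baseline[0:2], duplicated[3:5] == baseline[0:2])
```

The duplicated call computes six output rows instead of four, which is the kind of extra compute and communication the trade-off above refers to.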
xDiT/xfuser/core/long_ctx_attention/hybrid/attn_layer.py, line 108 in 57eb27f