Commit 5214dde

Merge pull request #10 from BinaryOracle/master
UPDATES
2 parents e5e3d54 + 671bd96 commit 5214dde

2 files changed: +16 -1 lines changed

src/MMLLM/庖丁解牛BLIP2.md

Lines changed: 16 additions & 1 deletion
@@ -335,7 +335,22 @@ class BertEmbeddings(nn.Module):
 ```
 The figure below shows the complete computation flow of Image-Text Matching; the BertModel code is analyzed in detail later in this document:

-![庖丁解牛BLIP2](庖丁解牛BLIP2/9.png)
+![Image-Text Matching](庖丁解牛BLIP2/9.png)
+
+#### 3. Image-Grounded Text Generation (ITG Loss, GPT-like)
+
+> - Goal: teach the Q-Former to generate text from an image, i.e. given an input image, produce the matching text.
+>
+> - Self-attention masking strategy: Multimodal Causal Self-attention Mask
+>
+>   - Queries can attend to all query tokens, but not to text tokens.
+>
+>   - Each text token can attend to all query tokens and to the text tokens that precede it.
+>
+> ![Multimodal Causal Self-attention Mask](庖丁解牛BLIP2/10.png)
+
+
+
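The two masking rules added in the ITG section above can be made concrete with a small sketch. It is written in plain Python with no framework so that each rule is easy to inspect; it is not the code from the BLIP-2 repository. The function names `multimodal_causal_mask` and `render` are hypothetical. The sketch also assumes that query tokens come first in the sequence and that a text token may attend to itself, as in standard causal masks.

```python
def multimodal_causal_mask(num_query, num_text):
    """Return mask[i][j] == True when position i may attend to position j.

    Assumed layout for this sketch: query tokens first, then text tokens.
    """
    total = num_query + num_text
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < num_query:
                # Query row: sees every query token, never a text token.
                mask[i][j] = j < num_query
            else:
                # Text row: sees every query token, plus text tokens up to
                # and including itself (assumption: standard causal mask).
                mask[i][j] = j < num_query or j <= i
    return mask


def render(mask):
    """Render each mask row as a string of 1s (allowed) and 0s (blocked)."""
    return ["".join("1" if allowed else "0" for allowed in row) for row in mask]


# Example: 2 query tokens followed by 3 text tokens.
for row in render(multimodal_causal_mask(2, 3)):
    print(row)
# Prints:
# 11000
# 11000
# 11100
# 11110
# 11111
```

The top-left block of ones is the query-to-query attention, the top-right block of zeros keeps the queries from seeing the text, and the lower-right triangle is the causal part that gives this objective its GPT-like behavior.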

src/MMLLM/庖丁解牛BLIP2/10.png

126 KB