
Commit

add rethinking llm interp link
csinva committed Feb 6, 2024
1 parent da2b92e commit 3b20dba
Showing 4 changed files with 20 additions and 2 deletions.
2 changes: 1 addition & 1 deletion _blog/misc/24_tensor_product_repr.md
@@ -54,7 +54,7 @@ Each tensor product results in a matrix for each pair, representing a 2D plane i
The composite tensor for the sentence "Cat chases mouse" is the sum of these individual tensor products.
Since the roles are orthogonal, it's easy to see that the unique contribution of each role-filler pair is preserved without interference (in different rows).

This example simplifies many aspects for clarity. In practice, the dimensions for roles and fillers might be much larger to capture more nuanced semantic features, and the mathematical operations might involve more sophisticated mechanisms to encode, manipulate, and decode the structured representations effectively.
This example simplifies many aspects for clarity. In practice, the dimensions for roles and fillers might be much larger to capture more nuanced semantic features, and the mathematical operations might involve more sophisticated mechanisms to encode, manipulate, and decode the structured representations effectively. See another [example here](https://rtmccoy.com/tpdn/tpr_demo.html) (it's focused on applying TPRs to RNN representations).
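
A minimal numpy sketch of the construction above, using one-hot (hence orthogonal) role vectors for agent/verb/patient; the 4-d filler embeddings are hypothetical values chosen for illustration:

```python
import numpy as np

# one-hot role vectors (agent, verb, patient) -- orthonormal by construction
roles = np.eye(3)

# hypothetical filler embeddings for "cat", "chases", "mouse"
fillers = np.array([
    [1.0, 0.0, 1.0, 0.0],  # cat
    [0.0, 1.0, 0.0, 1.0],  # chases
    [1.0, 1.0, 0.0, 0.0],  # mouse
])

# bind each role to its filler with an outer product, then sum the pairs
tpr = sum(np.outer(r, f) for r, f in zip(roles, fillers))  # shape (3, 4)

# unbinding: because the roles are orthonormal, projecting the composite
# tensor onto a role vector recovers that role's filler exactly
assert np.allclose(roles[0] @ tpr, fillers[0])  # recovers "cat"
```

Because each role here is one-hot, each filler occupies its own row of the composite matrix, which is exactly the "in different rows" property noted above.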

**Notes**
- Learning in TPRs involves optimizing the filler and role vectors so that input structures can be accurately reconstructed from their TPRs, achievable through gradient descent or other techniques
11 changes: 11 additions & 0 deletions _includes/01_research.html
@@ -184,6 +184,17 @@ <h2 style="text-align: center; margin-top: -150px;"> Research
</tr>
</thead>
<tbody>
<tr>
<td class="center">'24</td>
<td>Rethinking Interpretability in the Era of Large Language Models
</td>
<td>singh et al.</td>
<td class="med">🔎🌀</td>
<td class="center"><a href="https://arxiv.org/abs/2402.01761">arxiv</a></td>
<td class="big"></td>
<td class="med">
</td>
</tr>
<tr>
<td class="center">'24</td>
<td>Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
8 changes: 7 additions & 1 deletion _notes/neuro/comp_neuro.md
@@ -637,8 +637,14 @@ subtitle: Diverse notes on various topics in computational neuro, data-driven ne

- TPR of a structure is the sum of the TPR of its constituents
- tensor product operation allows constituents to be uniquely identified, even after the sum (if roles are linearly independent; see the numpy sketch after this list)

- [TPR intro blog post](https://csinva.io/blog/misc/24_tensor_product_repr)
- [TPR slides](https://www.mit.edu/~jda/teaching/6.884/slides/oct_02.pdf)
- RNNs Implicitly Implement Tensor Product Representations ([mccoy...smolensky, 2019](https://arxiv.org/pdf/1812.08718.pdf))
- introduce TP Decomposition Networks (TPDNs), which use TPRs to approximate existing vector representations
- assumes a particular hypothesis for the relevant set of roles (e.g., sequence indexes or structural positions in a parse tree)

- TPDNs can successfully approximate linear and tree-based RNN autoencoder representations
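
A minimal numpy sketch of the unbinding claim above, assuming roles that are linearly independent but not orthogonal; the role and filler matrices are hypothetical:

```python
import numpy as np

# hypothetical roles (rows): linearly independent but not orthogonal
R = np.array([[1.0, 0.0],
              [1.0, 1.0]])
# hypothetical fillers (rows), one bound to each role
F = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 1.0]])

# TPR of the structure = sum_i outer(R[i], F[i]), written as a matrix product
T = R.T @ F

# unbind with the dual roles (pseudo-inverse); exact when R has full row rank
recovered = np.linalg.pinv(R).T @ T
assert np.allclose(recovered, F)  # each filler is recovered without interference
```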


## synaptic plasticity, hebb's rule, and statistical learning

1 change: 1 addition & 0 deletions _notes/research_ovws/ovw_transformers.md
@@ -728,6 +728,7 @@ See related papers in the [📌 interpretability](https://csinva.io/notes/resear
- $C = C(x)$
- Tree Transformer: Integrating Tree Structures into Self-Attention ([wang...chen, 2019](https://arxiv.org/pdf/1909.06639.pdf))
- Waveformer: Linear-Time Attention with Forward and Backward Wavelet Transform ([zhuang...shang, 2022](https://arxiv.org/abs/2210.01989))
- White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? ([yaodong yu...yi ma, 2023](https://arxiv.org/abs/2311.13110))


## model merging / mixture of experts (MoE) / routing
