From 3b20dbaaa89b2d8b718e214c16fe6891d64fce76 Mon Sep 17 00:00:00 2001
From: Chandan Singh
Date: Mon, 5 Feb 2024 21:27:05 -0500
Subject: [PATCH] add rethinking llm interp link

---
 _blog/misc/24_tensor_product_repr.md     |  2 +-
 _includes/01_research.html               | 11 +++++++++++
 _notes/neuro/comp_neuro.md               |  8 +++++++-
 _notes/research_ovws/ovw_transformers.md |  1 +
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/_blog/misc/24_tensor_product_repr.md b/_blog/misc/24_tensor_product_repr.md
index 44ba64ad..8186ec92 100755
--- a/_blog/misc/24_tensor_product_repr.md
+++ b/_blog/misc/24_tensor_product_repr.md
@@ -54,7 +54,7 @@ Each tensor product results in a matrix for each pair, representing a 2D plane i
 The composite tensor for the sentence "Cat chases mouse" is the sum of these individual tensor products. Since the roles are orthogonal, it's easy to see that the unique contribution of each role-filler pair is preserved without interference (in different rows).
 
-This example simplifies many aspects for clarity. In practice, the dimensions for roles and fillers might be much larger to capture more nuanced semantic features, and the mathematical operations might involve more sophisticated mechanisms to encode, manipulate, and decode the structured representations effectively.
+This example simplifies many aspects for clarity. In practice, the dimensions for roles and fillers might be much larger to capture more nuanced semantic features, and the mathematical operations might involve more sophisticated mechanisms to encode, manipulate, and decode the structured representations effectively. See another [example here](https://rtmccoy.com/tpdn/tpr_demo.html) (it's focused on applying TPRs to RNN representations).
 
 **Notes**
 
 - Learning in TPRs involves optimizing the filler and role vectors to optimize the reconstruction of input structures from their TPRs, achievable through gradient descent or other techniques
diff --git a/_includes/01_research.html b/_includes/01_research.html
index b0dbd765..795a8f04 100755
--- a/_includes/01_research.html
+++ b/_includes/01_research.html
@@ -184,6 +184,17 @@
 
 
 Research
+
+    '24
+    Rethinking Interpretability in the Era of Large Language Models
+
+    singh et al.
+    🔎🌀
+    arxiv
+
+
+
+
     '24
     Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
 
diff --git a/_notes/neuro/comp_neuro.md b/_notes/neuro/comp_neuro.md
index ce77f9bf..4355de3b 100755
--- a/_notes/neuro/comp_neuro.md
+++ b/_notes/neuro/comp_neuro.md
@@ -637,8 +637,14 @@ subtitle: Diverse notes on various topics in computational neuro, data-driven ne
 
 - TPR of a structure is the sum of the TPR of its constituents
 - tensor product operation allows constituents to be uniquely identified, even after the sum (if roles are linearly independent)
-
+- [TPR intro blog post](https://csinva.io/blog/misc/24_tensor_product_repr)
 - [TPR slides](https://www.mit.edu/~jda/teaching/6.884/slides/oct_02.pdf)
+- RNNs Implicitly Implement Tensor Product Representations ([mccoy...smolensky, 2019](https://arxiv.org/pdf/1812.08718.pdf))
+  - introduce TP Decomposition Networks (TPDNs), which use TPRs to approximate existing vector representations
+  - assumes a particular hypothesis for the relevant set of roles (e.g., sequence indexes or structural positions in a parse tree)
+
+  - TPDNs can successfully approximate linear and tree-based RNN autoencoder representations
+
 
 
 ## synaptic plasticity, hebb's rule, and statistical learning
diff --git a/_notes/research_ovws/ovw_transformers.md b/_notes/research_ovws/ovw_transformers.md
index d5133358..b00e859f 100644
--- a/_notes/research_ovws/ovw_transformers.md
+++ b/_notes/research_ovws/ovw_transformers.md
@@ -728,6 +728,7 @@ See related papers in the [📌 interpretability](https://csinva.io/notes/resear
 - $C = C(x)$
 - Tree Transformer: Integrating Tree Structures into Self-Attention ([wang, .., chen, 2019](https://arxiv.org/pdf/1909.06639.pdf))
 - Waveformer: Linear-Time Attention with Forward and Backward Wavelet Transform ([zhuang...shang, 2022](https://arxiv.org/abs/2210.01989))
+- White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? ([yaodong yu...yi ma, 2023](https://arxiv.org/abs/2311.13110))
 
 
 ## model merging / mixture of experts (MoE) / routing
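
The blog-post hunk above walks through binding each filler to its role with an outer product, summing the pairs into one composite tensor, and relying on orthogonal roles to read each filler back out without interference. Below is a minimal NumPy sketch of that bind/sum/unbind cycle for "Cat chases mouse"; the role and filler vectors, dimensions, and variable names are illustrative assumptions, not values from the post.

```python
import numpy as np

# Roles as one-hot (hence orthonormal) basis vectors; fillers as small,
# arbitrary "embeddings". Both are chosen purely for illustration.
roles = {
    "agent":   np.array([1.0, 0.0, 0.0]),
    "verb":    np.array([0.0, 1.0, 0.0]),
    "patient": np.array([0.0, 0.0, 1.0]),
}
fillers = {
    "cat":    np.array([0.9, 0.1, 0.3, 0.5]),
    "chases": np.array([0.2, 0.8, 0.6, 0.1]),
    "mouse":  np.array([0.4, 0.4, 0.9, 0.2]),
}
bindings = [("agent", "cat"), ("verb", "chases"), ("patient", "mouse")]

# Bind each pair with an outer product and sum: T = sum_i role_i (x) filler_i,
# a 3x4 matrix whose rows (with these one-hot roles) hold the three fillers.
T = sum(np.outer(roles[r], fillers[f]) for r, f in bindings)

# Unbind: with orthonormal roles, projecting T onto a role vector recovers that
# role's filler exactly; the other pairs contribute nothing.
recovered = roles["agent"] @ T
assert np.allclose(recovered, fillers["cat"])
print(recovered)  # [0.9 0.1 0.3 0.5]
```

If the roles are merely linearly independent rather than orthonormal (the condition noted in the comp_neuro hunk), the same unbinding works after swapping in the dual role vectors computed from the pseudo-inverse of the role matrix. A TPDN, per the mccoy et al. entry, essentially fits role and filler vectors by gradient descent so that such summed products approximate a trained network's representations.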