notes update
csinva committed Feb 26, 2024
1 parent 9e8ab2b commit 05816cd
Showing 5 changed files with 31 additions and 10 deletions.
2 changes: 1 addition & 1 deletion _includes/00_about.html
@@ -36,7 +36,7 @@ <h1 class="brand-heading">&nbsp;</h1>
href="https://www.microsoft.com/en-us/research/group/deep-learning-group/">deep
learning
group</a>)
- <!-- ; phd from berkeley (with prof. <a href="https://binyu.stat.berkeley.edu/">bin yu</a>) -->
+ ; phd from berkeley (with prof. <a href="https://binyu.stat.berkeley.edu/">bin yu</a>)
</p>

</div>
15 changes: 7 additions & 8 deletions _includes/01_research.html
@@ -17,29 +17,27 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
</style>
<div style="padding-left: 5%;padding-right: 5%">
<div style="width: 100%;padding: 8px;margin-bottom: 20px; text-align:center; font-size: large;">
- Here are some areas I'm currently excited about. If you want to chat about research (or
- are interested in interning at MSR), feel free to reach out over email :)</div>
+ Some areas I'm currently excited about. If you want to chat about research or
+ are interested in interning at MSR, feel free to reach out over email :)</div>

<div class="research_box"><strong>🔎
Interpretability.</strong> I'm interested in <a href="https://arxiv.org/abs/2402.01761">rethinking
- interpretability</a> in the context of LLMs (collaboration with many folks, particularly <a
- href="https://www.microsoft.com/en-us/research/people/rcaruana/">Rich Caruana</a>).
+ interpretability</a> in the context of LLMs
<br>
<br>
<a href="https://www.nature.com/articles/s41467-023-43713-1">augmented imodels</a> - use LLMs to build a
transparent model<br>
<a href="https://github.com/csinva/imodels">imodels</a> - build interpretable models in the style of
scikit-learn<br>
<a href="http://proceedings.mlr.press/v119/rieger20a.html">explanation penalization</a> - regularize
- explanations align models with prior knowledge<br>
+ explanations to align models with prior knowledge<br>
<a href="https://proceedings.neurips.cc/paper/2021/file/acaa23f71f963e96c8847585e71352d6-Paper.pdf">adaptive
wavelet distillation</a> - replace neural nets with simple, performant wavelet models
</div>
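The imodels entry above describes estimators that follow the scikit-learn fit/predict pattern while staying readable. A minimal sketch of that pattern, assuming nothing from the actual imodels codebase (the `DecisionStump` class here is an illustrative stand-in, not an imodels API):

```python
# Sketch of the sklearn-style interface imodels estimators follow: fit/predict
# plus a fitted model simple enough to read directly. The "interpretable
# model" here is a single decision stump (one feature, one threshold).
class DecisionStump:
    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):  # try every feature...
            for t in sorted({row[j] for row in X}):  # ...and threshold
                raw = sum(int(row[j] >= t) != yi for row, yi in zip(X, y))
                # a rule may also be used flipped (predict 0 above threshold)
                errs, flip = min((raw, False), (len(y) - raw, True))
                if best is None or errs < best[0]:
                    best = (errs, j, t, flip)
        _, self.feature, self.threshold, self.flip = best
        return self

    def predict(self, X):
        preds = [int(row[self.feature] >= self.threshold) for row in X]
        return [1 - p for p in preds] if self.flip else preds


X = [[0, 5], [1, 3], [7, 2], [9, 8]]
y = [0, 0, 1, 1]
stump = DecisionStump().fit(X, y)
print(stump.feature, stump.threshold)  # the whole fitted model is one rule
print(stump.predict([[8, 0]]))
```

The point of the interface is interchangeability: because fit/predict matches scikit-learn's conventions, an interpretable model can drop into an existing pipeline without code changes.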

<div class="research_box">

<strong>🚗 LLM steering. </strong>Interpretability tools can provide ways to better guide and use LLMs
- (collaboration with many folks, particularly <a href="https://jxmo.io">Jack Morris</a>).
<br>
<br>
<a href="https://arxiv.org/abs/2310.14034">tree prompting</a> - improve black-box few-shot text classification
@@ -62,10 +60,11 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
explanations of fMRI encoding models
</div>


<div class="research_box"><strong>💊
Healthcare. </strong>I'm also actively working on how we can improve clinical decision instruments by using
- the information contained across various sources in the medical literature (in collaboration with many folks
- including <a href="https://profiles.ucsf.edu/aaron.kornblith">Aaron Kornblith</a> at UCSF and the MSR <a
+ the information contained across various sources in the medical literature (in collaboration with <a
+ href="https://profiles.ucsf.edu/aaron.kornblith">Aaron Kornblith</a> at UCSF and the MSR <a
href="https://www.microsoft.com/en-us/research/group/real-world-evidence/">Health Futures team</a>).
<br>
<br>
5 changes: 5 additions & 0 deletions _notes/neuro/comp_neuro.md
@@ -649,6 +649,9 @@ subtitle: Diverse notes on various topics in computational neuro, data-driven ne
- evaluate TPDN based on how well the decoder applied to the TPDN representation produces the same output as the original RNN
- Discovering the Compositional Structure of Vector Representations with Role Learning Networks ([soulos, mccoy, linzen, & smolensky, 2019](https://arxiv.org/pdf/1910.09113.pdf)) - extend DISCOVER to learned roles with an LSTM
- role vector is regularized to be one-hot
- Concepts and Compositionality: In Search of the Brain's Language of Thought ([frankland & greene, 2020](https://www.annualreviews.org/doi/10.1146/annurev-psych-122216-011829))
- Fodor’s classic language of thought hypothesis: our minds employ an amodal, language-like system for combining and recombining simple concepts to form more complex thoughts
- combinatorial processes engage a common set of brain regions, typically housed throughout the brain’s default mode network (DMN)
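The TPDN notes above rest on tensor-product representations: a sequence is encoded as a sum of filler-role outer products, and an orthonormal role vector reads its filler back out. A toy sketch of that binding/unbinding step (dimensions and symbol names are illustrative assumptions):

```python
import numpy as np

# Encode a sequence as sum_i filler(s_i) outer role(i). With one-hot
# (orthonormal) roles, multiplying the encoding by a role vector exactly
# recovers the filler bound to that position.
fillers = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 1.0])}
roles = np.eye(3)  # one one-hot role vector per sequence position

seq = ["b", "a", "b"]
T = sum(np.outer(fillers[s], roles[i]) for i, s in enumerate(seq))

decoded = T @ roles[1]  # unbind position 1 -> fillers["a"]
print(decoded)
```

A TPDN is trained to approximate an RNN's hidden state with exactly this kind of structured sum; role-learning networks then relax the fixed role scheme to learned (regularized, near-one-hot) roles.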


## synaptic plasticity, hebb's rule, and statistical learning
@@ -1584,6 +1587,8 @@ the operations above allow for encoding many normal data structures into a singl
- aligning with experimental/psychological data
- [How Well Do Unsupervised Learning Algorithms Model Human Real-time and Life-long Learning? | OpenReview](https://openreview.net/forum?id=c0l2YolqD2T) (zhuang...dicarlo, yamins, 2022)
- Biologically-inspired DNNs (not data-driven)
- Relating transformers to models and neural representations of the hippocampal formation ([whittington, warren, & behrens, 2022](https://arxiv.org/abs/2112.04035))
- transformers, when equipped with recurrent position encodings, replicate the precisely tuned spatial representations of the hippocampal formation, most notably place and grid cells
- Emergence of foveal image sampling from learning to attend in visual scenes ([cheung, weiss, & olshausen, 2017](https://arxiv.org/abs/1611.09430)) - using neural attention model, learn a retinal sampling lattice
- can figure out what parts of the input the model focuses on
- [Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations](https://proceedings.neurips.cc/paper/2020/hash/98b17f068d5d9b7668e19fb8ae470841-Abstract.html) (dapello…cox, dicarlo, 2020) - biologically inspired early neural-network layers (gabors etc.) improve robustness of CNNs
17 changes: 17 additions & 0 deletions _notes/research_ovws/ovw_transformers.md
@@ -1066,6 +1066,16 @@ mixture of experts models have become popular because of the need for (1) fast s
- task 1: idea-sentence generation -- given sentences describing background context + a seed term, generate a sentence describing an idea
- task 2: idea-node prediction -- given the background context, predict new links between existing concepts (and generate new concepts)
- forecasting paper titles ([blog post](https://csinva.io/gpt-paper-title-generator/))
- Communication with animals

- [Coller-Dolittle Prize](https://coller-dolittle-24.sites.tau.ac.il) for Inter-species Communication
- Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales ([andreas, begus, …, wood, 2021](https://arxiv.org/pdf/2104.08614.pdf))
- sperm whale has largest brain
- ML outputs are primarily a tool to constrain hypothesis space to build formal and interpretable descriptions of the sperm whale communication
- A Theory of Unsupervised Translation Motivated by Understanding Animal Communication ([goldwasser…paradise, 2023](https://arxiv.org/abs/2211.11081))
- Approaching an unknown communication system by latent space exploration and causal inference ([begus, leban, & gero, 2023](https://arxiv.org/abs/2303.10931)) - manipulate GAN latent variables in an approach called causal disentanglement with extreme values (CDEV)
- Vowels and Diphthongs in Sperm Whales ([begus, sprous, leban, & gero, 2023](https://osf.io/preprints/osf/285cs)) - use data from the dominica sperm whale project ([gero et al. 2014](https://onlinelibrary.wiley.com/doi/abs/10.1111/mms.12086))

- scientific organization ([galactica](https://galactica.org/static/paper.pdf))
- related but smaller models
- SciBERT ([beltagy...cohan, 2019](https://arxiv.org/abs/1903.10676))
@@ -1249,6 +1259,13 @@ mixture of experts models have become popular because of the need for (1) fast s
- Evaluating Large Language Models on Medical Evidence Summarization ([tang...peng, 2023](https://pubmed.ncbi.nlm.nih.gov/37162998/)) - score summaries based on 6 dimensions (e.g. coherence)
- Summarizing, Simplifying, and Synthesizing Medical Evidence Using GPT-3 (with Varying Success) ([shaib...wallace, 2023](https://arxiv.org/abs/2305.06299))
- SummIt: Iterative Text Summarization via ChatGPT ([zhang, ..., zhang, 2023](https://arxiv.org/abs/2305.14835))
- TRIALSCOPE: A Unifying Causal Framework for Scaling Real-World Evidence Generation with Biomedical Language Models ([gonzalez, wong, gero, …, poon, 2023](https://arxiv.org/pdf/2311.01301.pdf))
- extract attributes from structured & unstructured EHR to form basis for clinical trial specification / experiments
- Scaling Clinical Trial Matching Using Large Language Models: A Case Study in Oncology ([wong, zhang, …, poon, 2023](https://proceedings.mlr.press/v219/wong23a.html))
- LLMs can structure eligibility criteria of clinical trials and extract complex matching logic (e.g., nested AND/OR/NOT)
- BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys ([gu, yang, usuyama, …, gao, poon, 2023](https://arxiv.org/abs/2310.10765))
- counterfactual biomedical image generation by instruction-learning from multimodal patient journeys
- specifically, learn from triplets (prior image, progression description, new image), where GPT-4 generates progression description based on the image notes
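The clinical-trial-matching entry above notes that LLMs can turn free-text eligibility criteria into nested AND/OR/NOT matching logic. Once criteria are in that structured form, evaluating a patient against them is a small recursive walk. A hedged sketch (the tuple encoding and attribute names are illustrative assumptions, not the paper's format):

```python
# Evaluate nested AND/OR/NOT eligibility criteria against a patient record.
# A criterion is either ("AND"|"OR"|"NOT", sub-criteria...) or a leaf
# (attribute, required_value) pair.
def matches(criterion, patient):
    op = criterion[0]
    if op == "AND":
        return all(matches(c, patient) for c in criterion[1:])
    if op == "OR":
        return any(matches(c, patient) for c in criterion[1:])
    if op == "NOT":
        return not matches(criterion[1], patient)
    attribute, required = criterion  # leaf clause
    return patient.get(attribute) == required


criteria = ("AND",
            ("diagnosis", "NSCLC"),
            ("OR", ("stage", "III"), ("stage", "IV")),
            ("NOT", ("prior_treatment", "chemotherapy")))

patient = {"diagnosis": "NSCLC", "stage": "IV"}
print(matches(criteria, patient))  # True
```

Separating extraction (the LLM's job) from evaluation (plain code like this) keeps the matching step auditable, which matters in the clinical settings these papers target.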



2 changes: 1 addition & 1 deletion assets/css/chandan.css
@@ -82,4 +82,4 @@ td {
.bullets {
display: inline-block;
align-items: center;
- }
+ }
