I study artificial intelligence through language understanding and machine learning. I am fortunate to be advised by Jacob Andreas. Here is a summary of my questions and findings:
Human-like generalization & compositionality How can we make neural models generalize in language like humans do, by handling complex combinations of words and meanings?
I developed simple and scalable strategies [1, 2, 3] that enable neural sequence models (Transformers and RNNs) to achieve the kinds of generalization that humans and explicitly compositional models exhibit.
I found that part of compositionality can be expressed as a certain symmetry of the language distribution, lexical homomorphism, and formalized it in our paper.
Uncovering neural intelligence To what extent are the functionalities learned by large neural models similar to those of humans?
In my studies, I conjectured that the in-context learning ability of large language models can be seen as a classical learning algorithm running in the hidden states of a model. I proved, by explicit construction, that a relatively small Transformer can perform in-context learning by simulating known algorithms such as least squares. Our constructions and theory have contributed to a line of research sometimes referred to as in-context learning theory.
I used influence functions to identify which training examples taught a language model to generate a particular factual assertion. I developed controlled benchmarks showing that such methods work in synthetic setups but not in real language models, due to gradient saturation.
Building interactive and embodied agents How can we harness generative models (e.g. large language models) to build interactive agents that can work in multi-modal environments (e.g. Minecraft, Jupyter notebooks)?
I contributed to projects on (i) using language to guide image classifiers to learn representations that enable learning new classes from only a few samples without forgetting the old ones, and (ii) using language models to guide policy learning in a virtual home environment.
Currently, I am working on building multi-modal agents that can follow language instructions in Minecraft, and agents that can mimic data scientists in Jupyter notebooks.
Our compositionality work received an area chair award at ACL 2023. Our in-context learning work was highlighted in MIT News, Vice, and The Gradient. I am gratefully supported by the Amazon Alexa Fellowship, administered by MIT-Amazon ScienceHub.
I was born in the small city of Soma in Manisa, Türkiye. I attended Izmir Science High School, where my genuine passion for science started. During my high school years, I was selected to represent Türkiye at the International Physics Olympiads, where I won a bronze medal. I earned Bachelor’s degrees in Electrical & Electronics Engineering and Physics from Koç University, where I actively contributed to the KUIS AI Lab under the mentorship of Prof. Deniz Yuret.
In my final year of undergraduate studies, I was a visiting student at MIT CSAIL, collaborating with Prof. Alan Edelman on a novel linear algebraic approach to backpropagation, and working with John Fisher on efficient algorithms for Bayesian nonparametrics. Subsequently, I began my PhD journey, working with Jacob Andreas. During my PhD, I completed two internships at Google Research. In the first, I collaborated with Kelvin Guu and Keith Hall on fact attribution for large language models using influence functions. In the second, I interned on the Google Brain team (now Google DeepMind), guided by Denny Zhou, to advance our understanding of in-context learning.
I am married to my lovely wife Afra Feyza Akyürek, and we currently live in Boston, MA.
I am an outdoor and summer person: I like swimming, biking, sailing, and hiking whenever Boston weather allows. My wife and I enjoy traveling together to discover new places around the world. I play guitar to unwind. I also like cooking and learning new recipes (e.g. I can make a Texas-style brisket at home).
|Jul, 2023||Our paper LexSym: Compositionality as Lexical Symmetry won the lexical semantics area chair award at ACL 2023.|
|Jul, 2023||I gave a talk about our paper What learning algorithm is in-context learning? Investigations with linear models to Naval Warfare Center researchers.|
|Jun, 2023||I gave a talk about our paper What learning algorithm is in-context learning? Investigations with linear models at the MIT Mechanistic Interpretability Conference.|
|May, 2023||I attended a meeting of the Philosophy of AI Reading Group at Oxford to discuss our paper What learning algorithm is in-context learning? Investigations with linear models, hosted by Raphaël Millière.|
|May, 2023||At the Google NLP Reading Group, I presented our paper What learning algorithm is in-context learning? Investigations with linear models, hosted by Peter Chen.|
|Dec, 2022||At the Munich NLP community meeting, I presented our paper What learning algorithm is in-context learning? Investigations with linear models, hosted by Muhtasham Oblokulov.|
|Nov, 2022||At KUIS AI, I presented our paper What learning algorithm is in-context learning? Investigations with linear models, hosted by Gözde Gül Şahin.|
|Jul, 2022||I gave a talk about our paper LexSym: Compositionality as Lexical Symmetry at EML Tubingen, hosted by Zeynep Akata.|
|Oct, 2021||I presented our paper Lexicon Learning for Few-Shot Neural Sequence Modeling at IBM Research, hosted by Yang Zhang.|
|Sep, 2021||I presented our paper Lexicon Learning for Few-Shot Neural Sequence Modeling at the Boston University AIR Seminar.|