A Catalogue of AI Research Idea Generators
Delip Rao,
Sep 4, 2024

I am mentoring a group of early-career students (senior undergraduates and master's students) on how to do research. In class, we brainstormed the question, “Where do ideas come from?” Common answers included:

  • Reading other papers
  • Listening to talks
  • Scrolling Twitter
  • Looking at GitHub
  • Talking to experts and friends
  • Science Fiction
  • and even… the bathroom

But have you ever wondered how AI researchers come up with ideas? Do they wait for a bolt of lightning to strike them, or can they be methodical about it? Is there a set of recipes we can follow to generate new ideas? I am offering a preliminary catalogue of AI research idea generators, distilled from the thousand-plus AI research papers I have read, skimmed, or glanced at during my career. I have not seen such a catalogue anywhere else. Each of these is a tried-and-tested recipe for generating AI research ideas:

  1. Invent a hammer and find nails to whack: This is the most common approach, and the one newcomers usually picture when they imagine what “doing research” is. Usually, it involves proposing a technique or architecture and applying it to multiple datasets to show improvements. Examples from the recent past include the transformer paper and the chain-of-thought paper.
  2. Find hammers to whack on your nail: Here you invent a task and provide a rationale for why it should be solved and how it is useful for society. Then you painstakingly build a small but high-quality dataset for it, try out different models and techniques on that dataset, and declare a winner. Many new tasks proposed at ACL tend to be like this. Example: See the FEVER paper.
  3. Create a horse race: This is a scaled-up version of #2. Typically, you run recipe #2 to pilot a task and gauge community interest. If it works, you create a “shared task” (think of it as an OG Kaggle competition) with a much larger dataset and invite the community to participate. In NLP, the TREC, CoNLL, WMT, and SemEval competitions are prototypical examples; from computer vision, ImageNet is another. Some of these tasks have a multi-year half-life, but most saturate quickly due to progress in hammer production. In the post-LLM era, “benchmark papers” are also examples of these horse races. People love horse races, but they also get bored of them soon.
  4. Climb that hill: This recipe is for the engineers among us. Take an existing idea and make it faster, cheaper, smaller, or better. For example, the original Transformer paper introduced attention computation with a quadratic cost. Then came a slew of papers that reduced the time complexity of the attention algorithm to subquadratic regimes.
  5. Play matchmaker: Take something you know well from your community and apply it in another. Before neural networks were a thing, we would routinely see physicists take ideas from statistical mechanics, such as sampling algorithms, and introduce them to the ML community (for example, applying them to parameter estimation in graphical models). These papers tend to be amazing. More than a decade ago, way before AlphaFold existed, I attended a talk by the awesome Julia Hockenmaier, who blew my mind by showing how NLP parsing techniques could be applied to protein folding. In the post-LLM era, invoking this recipe is quite common.
  6. Introduce constraints: Humans, when put in constrained situations, become highly creative. You can use this to create new recipes by taking a well-known problem setting and adding a constraint. Suddenly you have a harder version of the problem demanding novel solutions. Examples of constraints include solving a problem with a tiny training dataset (see papers on low-resource NLP) and making models run on small devices (see work by Tim Dettmers).
  7. Expand the scope: Here you take a well-known task and expand its scope. Take something monolingual and make it multilingual. Take something unimodal and make it multimodal. Make a specialized task general (limited-vocabulary vs. open-vocabulary speech recognition).
  8. Mind the gap: Often when you are invested in a problem, you end up reading most papers on that problem, and in the process, you might realize there is a blind spot or a “gap” in the literature. These gaps arise for several reasons. Perhaps authors of the previous works made unstated assumptions about something. Maybe an important initial/boundary condition was not examined? In the ACL anthology alone, searching for the phrase “gap in the literature” results in more than 1000 hits. If, for example, all parsing papers you looked at were for left-to-right languages, then clearly the community has a gap in the parsing literature for Semitic languages. That’s your new idea!
  9. Court the outliers: When empirical results are reported, outliers are usually brushed off or Huberized. This is because, during evaluation, we want to measure how good our model of the world is, and we want those measurements to be reliable (smooth). Complementary to evaluation is error analysis, a step novice practitioners dread and often skip, in which you categorize prediction/generation mistakes and explain them. Some of those “mistakes” tell a completely different story. Certain stop sign images were misclassified despite being “easy” and clearly visible; it turns out some of them had graffiti on them that confused the ConvNets. Instead of discarding these as bad image inputs, you could look at them from a different perspective: What if stop signs with graffiti are adversarial inputs for image classifiers? Can I put a Post-it labeled “chair” on a desk and confuse the classifier? You now have the adversarial attacks paper!
  10. Customer-centric work: We rarely see these kinds of ideas in academic conferences, but this happens a lot in product building at startups and in industry. You build a product and release it into the wild. Observe where your users struggle, and identify research questions that fall out of those struggles. In academia, this recipe is more common in HCI than in AI.
  11. Become an old wine sommelier: Read a lot of old papers and repackage them as new papers. This might look like a grift, but it may be the most valuable thing you could do for the community. A lot of old ideas are forgotten or abandoned because the people who championed them are no longer active in the field, because the ideas did not work well on the hardware of their time, and so on. Ideas that get buried are not necessarily stale. You have a unique opportunity to resuscitate them, and who knows, you might start something like the deep learning revolution!
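To make recipe #4 concrete: the quadratic cost of the original transformer comes from the attention score matrix, whose size grows with the square of the sequence length. Here is a minimal NumPy sketch of vanilla scaled dot-product attention (my own illustration, not code from any paper mentioned above):

```python
import numpy as np

def attention(Q, K, V):
    """Vanilla scaled dot-product attention for a single head."""
    n, d = Q.shape
    # The (n, n) score matrix is the quadratic bottleneck:
    # doubling the sequence length quadruples this computation.
    scores = (Q @ K.T) / np.sqrt(d)
    # Numerically stable row-wise softmax over the scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 512, 64  # sequence length, head dimension
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)  # O(n^2 * d) time, O(n^2) memory for the scores
```

The subquadratic-attention papers attack exactly that (n, n) matrix, for example by sparsifying it, approximating it with low-rank factors, or never materializing it at all.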
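Recipe #6's “run on small devices” constraint can also be made concrete with the simplest trick in that space: post-training quantization. Below is a toy sketch of symmetric int8 quantization of a weight matrix (my own illustration of the general idea, not anyone's actual method):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric int8 quantization: 1 byte per weight plus one float scale."""
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# 4x smaller storage than float32, at the cost of a bounded
# rounding error of at most scale/2 per weight.
max_err = np.abs(w - w_hat).max()
```

The constraint (a tiny memory budget) is what forces the interesting questions: which layers tolerate this rounding, and which need something cleverer.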
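And to see recipe #9's payoff in miniature, here is a toy version of the adversarial-input idea, run on a two-feature logistic classifier instead of a ConvNet (a hedged sketch in the spirit of FGSM-style attacks; the weights and epsilon are made up for illustration):

```python
import numpy as np

# A fixed "trained" linear classifier: predict class 1 if w @ x + b > 0.
w = np.array([2.0, -3.0])
b = 0.0

def predict(x):
    return int(w @ x + b > 0)

x = np.array([0.1, 0.1])  # clean input: score = 0.2 - 0.3 = -0.1 -> class 0

# FGSM-style perturbation: nudge each feature a tiny step in the
# direction that most increases the class-1 score. The gradient of
# the score w.r.t. x is just w, so the step is eps * sign(w).
eps = 0.2
x_adv = x + eps * np.sign(w)  # [0.3, -0.1] -> score = 0.9 -> class 1
```

Each feature moved by at most eps, yet the prediction flips: that is the graffiti-on-a-stop-sign story reduced to two numbers.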