I find it helpful to seek out someone in my tribe who is just a little bit further along in their journey than I am. These tribe members act like pathfinders: they look back on their own path and, with a certain amount of introspection, let me know about pitfalls or offer advice that can help me on my journey.
I have gathered a virtual group of recently graduated (graduate) students to act as our pathfinders. In this series of posts, you will hear their thoughts on questions that graduate students have asked me or that I have wondered about myself.
How can you participate in the conversation?
- Post your thoughts in the comments. If you’ve recently graduated, make sure to include that in your response.
- Email me directly.
- Share this post using the social media buttons to expand its reach.
Today’s pathfinder topic is literature searches. It seems like a straightforward process. I’m not certain how one executed a good lit search before the internet. My early days of lit searching involved heavy use of Web of Science, a loaded copy card, and hours in the stacks at the library. I still use Web of Science for searches when I don’t know what literature exists, but Google Scholar is my go-to if I know precisely what I’m looking for.
I feel most current literature searches happen in the privacy of our desk spaces (coffee shops?) with a laptop or tablet, and I wanted to have our pathfinders shed light on their process.
With the hope of demystifying the dark art of a good literature search, I asked our pathfinders:
What is your best version/process for a literature search?
Start from the highly cited works.
I don’t consider this to be one of my strong suits, honestly. But what I usually do is search Google or the university library website for keywords and then follow the thread of references in papers I find interesting and relevant.
A story to share: I was working in a narrow and new area. To finish one paper, I needed to check how others set up the boundary conditions for the Poisson equation. I used Google Scholar to find the 20 major publications in this area first.
Then, starting from these 20 papers, I looked up every single paper that cited these 20 papers, and every single reference entry in these 20 papers. It turned out that I read around 300 abstracts. For each paper I read, I used “Control + F” to look up some keywords such as “domain”, “boundary”, “Neumann”, “Dirichlet”, etc. After that, I felt confident speaking at conferences and talking to big names about this field. It was 3 days before Christmas in 2015, and totally worth it.
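For readers who like to see a process written down, the story above can be sketched in a few lines of Python. This is only an illustration, not the tool our pathfinder used: the function names, the paper IDs, and the two dictionaries (`references` and `cited_by`) are all hypothetical, and in practice you would fill them in by hand or from whatever database export you have.

```python
from typing import Dict, Iterable, List, Set


def snowball_one_hop(
    seeds: Iterable[str],
    references: Dict[str, List[str]],
    cited_by: Dict[str, List[str]],
) -> Set[str]:
    """Return every paper that a seed cites or that cites a seed (seeds excluded)."""
    seed_set = set(seeds)
    found: Set[str] = set()
    for paper in seed_set:
        found.update(references.get(paper, ()))  # backward: the seed's reference list
        found.update(cited_by.get(paper, ()))    # forward: papers citing the seed
    return found - seed_set


def matching_keywords(text: str, keywords: Iterable[str]) -> List[str]:
    """Case-insensitive 'Control + F': list the keywords that appear in the text."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


# Hypothetical toy data standing in for a real set of seed papers.
references = {"seed1": ["old1", "old2"], "seed2": ["old2", "seed1"]}
cited_by = {"seed1": ["new1", "seed2"], "seed2": ["new2"]}

to_read = snowball_one_hop(["seed1", "seed2"], references, cited_by)
hits = matching_keywords(
    "We impose Neumann conditions on the outer boundary.",
    ["domain", "boundary", "Neumann", "Dirichlet"],
)
```

Here `to_read` is the reading pile (the papers one step away from the seeds), and `hits` tells you which of your keywords a given abstract or paper mentions, so you can decide whether it deserves a closer look.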
Read the review paper of the whole field first and then gradually narrow down to the papers in the smaller subfields.
I use PubMed, Web of Science, and Google Scholar, using keywords and citation counts to find the most influential papers on a topic, and then looking at the papers which cite the influential papers, as well as the papers the influential papers cite. I also use citation alerts to find out when new papers are published which cite the influential papers. Research should always be considered in the context of previous work, so good papers will cite other good papers.
I normally begin with a research article that is widely cited, published in a prestigious journal, or includes an extensive review of the field. I would read this article carefully and try to get a good understanding of the topic. Then I would search for other research articles that cite or are cited by the previously mentioned article. Google Scholar is very helpful for tracking the citation record of each paper.
I try to keep an eye out on Twitter for relevant papers posted to the Stat/ML section of arxiv.org. There is also a rich online ML research community that helps filter through the deluge. By curating the people you follow, you can get a sense of what’s going on in your area of research pretty easily. This process, paired with arxiv-sanity.com and Google Scholar, has worked really well for me. You can quickly find related papers, click through to papers that are cited by a paper, etc.
I like to get my arms around the whole topic. I typically start with one paper, grab all of its references, and then grab all of those, etc. This is not the most time-efficient way but helps me synthesize and know the material better than most other people.
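The "grab all of its references, and then grab all of those" approach is a breadth-first walk through reference lists. Below is a minimal sketch of that idea, assuming you already have a hypothetical `references` dictionary that maps each paper to the papers it cites. The `max_depth` cutoff is my own addition (it is not part of the pathfinder's description) to keep the pile from growing without limit.

```python
from collections import deque
from typing import Dict, List, Set


def collect_references(
    start: str,
    references: Dict[str, List[str]],
    max_depth: int,
) -> Set[str]:
    """Follow reference lists outward from `start`, at most `max_depth` hops."""
    seen: Set[str] = {start}
    queue = deque([(start, 0)])
    while queue:
        paper, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand papers at the depth limit
        for ref in references.get(paper, ()):
            if ref not in seen:  # skip papers already collected (handles cycles)
                seen.add(ref)
                queue.append((ref, depth + 1))
    seen.discard(start)  # report only the papers discovered, not the starting one
    return seen


# Hypothetical reference lists: A cites B and C, B cites D, and so on.
example_refs = {"A": ["B", "C"], "B": ["D"], "C": ["D", "A"], "D": ["E"]}
two_hops = collect_references("A", example_refs, max_depth=2)
```

Each extra hop can multiply the number of papers, which is why this approach is thorough but, as noted above, not the most time-efficient.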
There doesn’t seem to be “one true way” to do a literature search. It needs to be tailored to your purpose. Did your advisor tell you to go find a topic? Are you about to give a presentation and have a heavy case of imposter syndrome going on? Or do you want to just stay up on what’s new?
Finding a topic
The consensus is to start with the gold-standard in the area you’re looking at: a review paper and highly-cited papers, and prep yourself for the “deep dive on a topic”. Get a sense of the scope of what’s been done, what questions haven’t been answered, and what.
How do I recommend you read these papers? AIC: abstract, introduction, conclusions. Make notes on the paper and move on to the next one.
Battling imposter syndrome at a pivotal moment
Our anonymous Ph.D. gave a detailed view of how they conquered what I have labeled imposter syndrome before a big conference. It involved a few days of taking the deep dive on the singular idea that was keeping them up at night. The outcome was that the imposter syndrome got put in a corner for that pivotal moment. Our anonymous Ph.D. felt confident to step into that space and own their success.
Imposter syndrome is a real thing and I am conflicted about this solution. My advice here is to do what you need to do to feel confident in pivotal moments. I do not recommend this strategy for day-to-day survival. There are other, more effective ways to deal with imposter syndrome.
Staying up with what’s new
I will be the first to admit this is not my strength. Despite how hard it feels to get work published nowadays, there is a glut of papers published daily. I have found Google Alerts to be helpful and appreciate Taylor’s approach of curating people and accounts you follow on social media to help streamline valuable information.
Something I didn’t appreciate early on was what I did with the papers once I returned from the library stacks and how I kept them organized. This came back to bite me on the ass when it came time to write my thesis. (Apologies for the fresh language, but there is no other way to explain the experience of sitting in a grad student desk chair while 7 months pregnant, trying to wrangle the references into a bibliography.)
I’ve listed a few here that consistently bubble up to the top of the conversation (which I have no affiliation with):
At the end of the thesis, I promise you whichever one you pick, you won’t regret it. As long as you pick one.
Shahrouz Mohaghe earned a Ph.D. from Oklahoma State
Francesca Bernardi is a Dean’s Post-Doctoral Scholar in the Department of Mathematics at Florida State University. Her research focuses on wastewater filtering and porous media. She received a Ph.D. in Mathematics and a Graduate Certificate in Women’s and Gender Studies from the University of North Carolina at Chapel Hill in 2018. She is the co-founder of Girls Talk Math – a free day camp for female and gender non-conforming high school students interested in Mathematics and Media. She is part of the Leadership Board at 500 Women Scientists – a non-profit grassroots organization aimed at making science more open, inclusive, and accessible. @fra_berni
Kimberly Stevens graduated with a Ph.D. in Mechanical Engineering from Brigham Young University in December of 2018. She is now a Lillian Gilbreth Postdoctoral Fellow in Mechanical and Biomedical Engineering at Purdue University.
Taylor Killian graduated from Harvard with a Master’s in Computational Science in 2017 and worked at MIT Lincoln Laboratory for two years as part of his fellowship program. He is now in a Ph.D. program at the University of Toronto where he’ll work at the intersection of Machine Learning and Healthcare. @tw_killian Linkedin Github
Chris Cloney graduated in April 2018 with a