Over the years I have developed some work habits that help make me more efficient as a researcher. The students I have seen struggle in their doctoral studies are always lacking several of the habits in this list. The students who excel have all (or nearly all) of these habits, either because they were mentored in them or because they somehow figured them out themselves.
In assigned homework, students will be expected to conform to good coding and plotting practices, and to submit an annotated bibliography in BibTeX format if literature searches are required.
Many of these habits overlap with the list of good work habits in jobs in the private sector. Start using these practices now, and reap many benefits as you go along.
- Back up your files
- Sharing information in the cloud
- Organize your work
- Use good coding practices
- Make descriptive plots
- Motive and Objective
- Background reading, and documenting the published literature on a topic
- Publish your work in a timely fashion
Regularly back up your files
Back up your computer at least once a week to an external hard drive (trust me on this one… if you don’t back things up regularly, there WILL come a day when you lose several months’ worth of work).
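As one way to make the weekly backup a one-line habit, here is a minimal Python sketch. The function name `back_up` and the idea of keeping one timestamped snapshot folder per run are my own assumptions for illustration; any backup tool you trust will do the same job.

```python
import shutil
from datetime import datetime
from pathlib import Path


def back_up(source_dir, backup_root):
    """Copy source_dir into a new, timestamped folder under backup_root.

    Returns the path of the snapshot that was created.
    """
    source = Path(source_dir)
    # One folder per run (name ends in a date and time stamp), so that
    # older snapshots are never overwritten.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    snapshot = Path(backup_root) / f"{source.name}_{stamp}"
    # copytree copies the whole directory tree and fails if the
    # destination already exists, which protects existing snapshots.
    shutil.copytree(source, snapshot)
    return snapshot
```

Calling `back_up` with your project directory and the mount point of the external drive (both paths are yours to fill in) creates a fresh copy each time, so an accidental deletion this week can still be recovered from last week's snapshot.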
Sharing information in the cloud
If you are sharing files in the cloud with other research collaborators (for instance, in a Dropbox folder), don’t delete any of the files without getting the OK from everyone first!
Organize your work
Make a separate directory for each new research project
- Create a README file in that directory, with notes, a short description of what the various R/Matlab/etc scripts in that directory do, links to web pages with useful information, and so on.
- Here’s a directory listing of an analysis we did, which was recently published in PLoS ONE, examining how news media can cause people to be concerned about things like Ebola. Note that I gave the directory a descriptive name. Also note that the README file is right at the top of the list.
laptop:plos_panic stowers$ ls
README
backup
code
code.txt
data_for_analysis.csv
data_for_analysis.txt
ebola_data_new.rtf
ebola_results.R
ebola_twitter_complete.csv
figure2.pdf
figure3.pdf
fit_results.txt
fit_results_switch.txt
model_fit_example.R
model_fit_example_utils.R
news.txt
plos2009.bst
plos_ebola_plot_data.R
plos_template.tex
plot_data.R
plot_fit_results.R
preproc.R
preprocessed_ebola_related_data.csv
preprocessed_ebola_related_data_normalized_nov2.txt
preprocessed_ebola_related_data_normalized_oct31.txt
preprocessed_ebola_related_data_oct29.csv
preprocessed_ebola_related_data_to_oct_25.csv
report_do_i_have_ebola_nov2.csv
report_ebola_nov2.csv
report_ebola_symptoms_nov2.csv
report_signs_of_ebola_nov2.csv
report_symptoms_of_ebola_nov2.csv
results_fits.pdf
social_media_data_fit.R
submitted_plos
tex
this_works
tweets_with_news_removed.txt
twit.R
twit.rtf
twit_new.R
wiki.txt
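If you want the README to exist from the very first minute of a project, a small helper can create the directory and a skeleton README together. The sketch below is only one possible layout; the function name `start_project` and the README headings are hypothetical choices of mine, not a required format.

```python
from datetime import date
from pathlib import Path

# Skeleton for a new README; the headings are illustrative, not prescribed.
README_TEMPLATE = """\
Project: {name}
Started: {started}

Description
-----------
{description}

What the scripts do
-------------------
(one line per script)

Useful links
------------
"""


def start_project(parent_dir, name, description):
    """Create parent_dir/name with a README in it and return the README path."""
    project = Path(parent_dir) / name
    project.mkdir(parents=True, exist_ok=True)
    readme = project / "README"
    # Never clobber notes that already exist.
    if not readme.exists():
        readme.write_text(
            README_TEMPLATE.format(
                name=name,
                started=date.today().isoformat(),
                description=description,
            )
        )
    return readme
```

Because the helper refuses to overwrite an existing README, it is safe to run again on a project that is already under way.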
Use good coding practices
Comment your code! Comment your code! Comment your code!!! And use good coding practices! Any code submitted in homework assignments must be properly commented, and follow good coding practices.
- When you submit a paper for publication, it can be weeks or months before you get reviewer comments back, and responding to them pretty much always involves redoing some part of the analysis. It can be disastrous if you can’t figure out what your old Matlab/R/Python/C++/etc programs did! You need to be able to pick up where you left off seamlessly. Part of that involves the README file, and another part involves thoroughly commenting your code so that you can, at a glance, know what it was doing. Also, as you move further along in academia, the likelihood that you will be working on more than one project rises. Sometimes that means you have to set a project aside for a few weeks or months, then return to it.
- There will come a time when you will have to share your code with collaborators. They need to be able to read it and determine what it did.
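To give a feel for what "properly commented" can look like, here is a short Python sketch. The function itself (a trailing moving average, the kind of smoothing one might apply to a daily time series) is just an illustrative example I made up; the point is the header comment that states inputs and outputs, and the inline comments that explain why each step is there.

```python
def moving_average(values, window):
    """Return the trailing moving average of a sequence of numbers.

    values : list of numbers, ordered in time (oldest first)
    window : number of most recent points to average over (must be >= 1)

    The result has the same length as `values`. For the first few points,
    where fewer than `window` observations are available, the average is
    taken over the points seen so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        # Once the window is full, drop the point that just slid out of it.
        if i >= window:
            running_sum -= values[i - window]
        # Early on, fewer than `window` points are available.
        points_in_window = min(i + 1, window)
        averages.append(running_sum / points_in_window)
    return averages
```

Months later, the docstring alone tells you (or a collaborator) what goes in, what comes out, and how the edge case at the start of the series is handled.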
Use good practices in making figures
Make informative, and properly labelled, plots. All plots submitted with homework assignments need to be properly labelled and follow good plotting practices.
- Plots in your papers tell a large part of the story, and before reading the paper in-depth, many reviewers will first read the Abstract and Introduction, and then flip to the plots. At a minimum, plots should have all axes labelled, and should have a descriptive caption. Tables should also have descriptive captions. Here is an example of a properly labelled plot, with a descriptive caption.
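As a minimal sketch of the labelling described above, the Python snippet below uses matplotlib (my choice here; the same applies in R or Matlab) to draw a curve with both axes labelled, a title, and a legend. The curve is a sine wave used purely for illustration, not real data, and the function name `make_labelled_plot` is hypothetical.

```python
import math

import matplotlib
matplotlib.use("Agg")  # render straight to a file; no display needed
import matplotlib.pyplot as plt


def make_labelled_plot(path):
    """Save an example plot with labelled axes to `path`."""
    # Illustrative curve only (a sine wave), not real data.
    x = [i / 10 for i in range(101)]
    y = [math.sin(v) for v in x]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(x, y, label="sin(x)")
    # Every axis gets a label, with units where they apply.
    ax.set_xlabel("x (radians)")
    ax.set_ylabel("sin(x)")
    ax.set_title("Example of a labelled plot")
    ax.legend()
    fig.tight_layout()
    fig.savefig(path)

    labels = (ax.get_xlabel(), ax.get_ylabel())
    plt.close(fig)
    return labels
```

The descriptive caption itself normally lives in the manuscript (for instance in the LaTeX `\caption{}` of the figure) rather than in the image file.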
Motivation and Objective: you need both, and they don’t mean the same thing!
Don’t start any research project without a clear idea of both the motivation and the objective. Students who struggle are the ones most likely to confuse objective and motivation (and focus just on objective, and mistakenly believe it is the motivation). Without (at a minimum) both motivation and objective, you don’t have an interesting and novel research topic.
- A quick example of motivation and objective: Thousands of people in America die each year from influenza, and greater understanding is needed of the underlying dynamics of the spread and control of the disease (that’s the motive… the problem that needs to be solved that someone on earth actually cares about. I should also state in my motivation what background work has been done in the published literature to address the problem, and how that work is lacking in some way). We will perform a modeling analysis where we will examine the impact on morbidity and mortality of prophylactic antiviral therapy directed at high risk groups (that’s the objective… the modeling work we’re going to do. I should also state why the objective is novel and useful to the problem at hand).
- I could do a nice modeling analysis with the objective of studying the effect of prophylactic antiviral therapy for influenza in squirrels (same objective as above, really). But there is no motivation for it; squirrels aren’t having massive die-offs due to flu, and they don’t spread it to humans or other animals. No one cares about you, squirrel flu.
Read. Read. And read some more. And document your reading in an annotated bibliography
The importance of voraciously reading published literature towards your success as a researcher cannot be overstated. This paper on Ten Simple Rules for Getting Published has extensive reading of the background literature as Rule #1. Good research always starts with the background reading, rather than saving it for the last step. It is true that when you’re first starting out in graduate school you have hardly any background in your research area of interest, and may have trouble understanding parts of papers (sometimes large parts). But it will get easier in time, especially if you share those papers with your faculty mentors and ask them for help in interpreting the analyses.
You’ll notice that background reading figures prominently in the pantheon of some of the “Ten simple rules…” papers, such as “Ten Simple Rules for Effective Computational Research”, “Ten Simple Rules for Responsible Referencing”, and last but certainly not least, “Ten simple rules for developing good reading habits during graduate school and beyond”. The last of these talks about the importance of daily reading. I usually read on average one to a few papers per day in the course of my work, but when I am embarking on a new research topic I ramp it up and will read many per day, and I often will dedicate a couple of hours each evening during that period towards doing that. And while doing that reading, I constantly document what I’ve read in an annotated BibTeX bibliography.
Which leads me to…
Create an annotated bibliography as you’re doing background reading. Start reading right from the conception of the project. Read continually throughout the project and add to the bibliography. Some homework assignments will require submission of a properly annotated bibliography in BibTeX format.
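For anyone who has not seen one, an annotated BibTeX entry might look something like the sketch below. Every value is a placeholder to be replaced with the details of the paper you actually read; the notes go in `annote`, which is a standard BibTeX field that most bibliography styles simply ignore. The particular prompts inside the annotation (motivation, objective, and so on) are my own suggestion rather than a required format.

```bibtex
% Placeholder entry: replace every field with the real paper's details.
@article{lastnameYYYYkeyword,
  author  = {Lastname, Firstname and Coauthor, Firstname},
  title   = {Title of the Paper},
  journal = {Journal Name},
  year    = {YYYY},
  volume  = {VV},
  pages   = {PP--PP},
  annote  = {Motivation: ... Objective: ... Data and methods: ...
             Main result: ... Relevance to my project: ...}
}
```

Writing the annotation right after reading the paper, while it is fresh, is what makes the bibliography useful months later when you are drafting the Introduction.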
There’s more to research than just doing an analysis; equally important is publishing it!
As researchers, we need to realize that we aren’t paid just to do studies… we need to publish our work in a timely fashion! Always write your manuscript as you go: right after conceiving of the research question and doing some background reading, you should be very clear on what your motivation and objective are, and what relevant work on the subject has already been published. At this point, start a LaTeX document, and put in the section headings for the Introduction, Methods and Materials (and under that the sub-headings Data and Model), Results and Discussion, and Summary.
- You should always begin by writing your Introduction very early on! It helps you to clearly state the motivation, give an overview of the background literature, and then state the objective of your paper. You can, and should, always do this before embarking on the analysis! If you can’t do it, it means you haven’t properly formed your research question and/or you have not done sufficient background reading to know if your planned objective is even novel.
- If your analysis involves data, the Methods and Materials:Data section is easy to write. Just thoroughly describe where and/or how you got your data. Include URLs for online publicly accessible data.
- The Methods and Materials:Model section is also easy to write. From your objective, you know what you are going to model and how. Include references to related methods in the literature.
- In the Results section, put some bullet-point sentences that roughly describe the figures and tables you will be making (and theorems you might be proving) to show that you have achieved the objective of the analysis. As you go along in the analysis, start filling in the text.
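The section layout described above can be sketched as a bare LaTeX skeleton like the one below. The `article` document class, the `plain` bibliography style, and the file name `references` are assumptions made for illustration; in practice you would start from the template of the journal you are targeting.

```latex
\documentclass{article}  % assumption: swap in the target journal's template

\title{Working Title}
\author{Author Names}

\begin{document}
\maketitle

\section{Introduction}
% Motivation, overview of the background literature, then the objective.

\section{Methods and Materials}
\subsection{Data}
% Where and how the data were obtained, with URLs for public sources.
\subsection{Model}
% What is modelled and how, with references to related methods.

\section{Results and Discussion}
% Bullet-point sentences describing the planned figures, tables, and theorems.

\section{Summary}

\bibliographystyle{plain}
\bibliography{references}  % references.bib is a hypothetical file name
\end{document}
```

Starting from a skeleton like this, each section fills in as the project progresses, and the annotated bibliography plugs straight into the `\bibliography` line.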