When you preregister your research, you're simply specifying your research plan in advance of your study and submitting it to a registry.
Preregistration separates hypothesis-generating (exploratory) from hypothesis-testing (confirmatory) research. Both are important. However, the same data cannot be used to both generate and test a hypothesis. This can happen unintentionally, and it reduces the credibility of your results. Addressing this problem through planning improves the quality and transparency of your research. This helps you clearly report your study and helps others who may wish to build on it. For instructions on how to submit a preregistration on OSF, please visit our help guides.
For additional insight and context, you can read The Preregistration Revolution. (preprint)
The Global Flourishing Study (GFS) is partnering with the Center for Open Science (COS) to make study data an open access resource so researchers, journalists, policymakers, and educators worldwide can probe detailed information about what makes for a flourishing life. Anyone can access the data immediately upon release for specific analysis purposes by submitting a preregistration of the proposed research to COS.
It may be difficult to fully prespecify your model until you have had a chance to explore a real dataset. Such exploration can help you test model assumptions and make reasonable decisions about how the model should be structured. The result of that work, however, is a specific, testable hypothesis. By randomly splitting off some "real" data, you can build the model through exploration and then confirm it with the portion of the data that has not yet been analyzed. Though this process reduces the sample size available for confirmatory analysis, the benefit gained through increased credibility (not to mention an iron-clad rationale for using 1-tailed tests!) more than makes up for it.
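The random split described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `split_holdout`, the 30% exploration fraction, the seed value, and the list of participant IDs are all assumptions made for the example.

```python
import random


def split_holdout(records, explore_fraction=0.5, seed=20240101):
    """Randomly split records into an exploratory set and a held-out confirmatory set.

    The function name, default fraction, and seed are illustrative assumptions.
    """
    if not 0 < explore_fraction < 1:
        raise ValueError("explore_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)      # fixed seed so the same split can be reproduced
    shuffled = list(records)       # work on a copy; the caller's data is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * explore_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical example: 200 participant IDs, 30% used for exploration.
participant_ids = list(range(1, 201))
explore, confirm = split_holdout(participant_ids, explore_fraction=0.3)
print(len(explore), len(confirm))
```

Recording the seed and the fraction in the preregistration would let a reader verify that the confirmatory portion was fixed before any exploration took place.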
Yes. The central aim of preregistration is to distinguish confirmatory from exploratory analyses in order to retain the validity of statistical inferences. Selective reporting of planned analyses undermines that validity.
Yes. Selective interpretation of pre-planned analyses can disrupt the diagnosticity of statistical inferences. For example, imagine that you planned 100 tests in your preregistration, and then reported all 100, 5 of which achieved p < .05. It is possible (even likely) that those five significant results are false positives. If the paper then discussed just those five and ignored the others, the interpretation could be highly misleading. Planning in advance is necessary but not sufficient for preserving diagnosticity.
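The arithmetic behind the 100-test example can be checked with a short sketch. It assumes, for illustration only, that every null hypothesis is true and that the tests are independent; the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of significant results when every null hypothesis is true."""
    return n_tests * alpha


def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


print(expected_false_positives(100))          # about 5 significant results by chance
print(round(familywise_error_rate(100), 3))   # chance of at least one false positive
```

Under these assumptions, about five significant results out of 100 are expected by chance alone, and at least one false positive is almost certain, which is why discussing only the five significant tests could mislead.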
To reduce interpretation biases, confirmatory research designs often have a small number of tests focused on the key questions in the research design, or adjustments for multiple tests are included in the analysis plan. It may be that some preregistered analyses are dismissed as inappropriate or ill-conceived in retrospect, but doing that explicitly and transparently assists the reader in evaluating the rest of the confirmatory results.
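As one example of an adjustment for multiple tests that an analysis plan could name in advance, here is a minimal sketch of the Holm-Bonferroni step-down procedure. The choice of Holm-Bonferroni is an assumption made for illustration (the text above does not recommend a particular method), and the p-values are made up.

```python
def holm_reject(p_values, alpha=0.05):
    """Return, for each p-value, whether Holm-Bonferroni rejects its null hypothesis."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail as well
    return reject


# Hypothetical p-values from four preregistered tests.
p_values = [0.001, 0.04, 0.03, 0.20]
print(holm_reject(p_values))
```

With these made-up values, three tests fall below .05 before adjustment, but only the smallest p-value survives the correction.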
No. Preregistration distinguishes confirmatory and exploratory analyses (Chambers et al., 2014). Exploratory analysis is very important for discovery and hypothesis generation. At the same time, results from exploratory analyses are more tentative, p-values are less diagnostic, and additional data are required to subject an exploratory result to a confirmatory test. Making the distinction between exploratory and confirmatory analysis more transparent increases the credibility of reports and helps the reader to fairly evaluate the evidence presented (Wagenmakers et al., 2012).
Exploratory and confirmatory research are both crucial to the process of science. In exploratory work, the researcher is looking for potential relationships within a dataset, effects of a candidate drug, or differences between two groups. The researcher wants to minimize the chance of making a Type II error, or a false negative, because finding something new and unexpected could be an important new discovery.
In confirmatory work, the researcher is rigorously testing a predicted effect. The specific hypothesis is very clear, and she has specified one way to test that hypothesis. The goal of confirmatory research is to minimize the Type I error rate, or false positives.
The purpose of preregistration is to make sure the distinction between these two processes is very clear. Once a researcher begins to change, even slightly, the way the hypothesis is tested, the work should be considered exploratory.
At least one confirmatory test must be specified in each preregistration.
Perhaps. A goal of pre-analysis plans is to avoid analysis decisions that are contingent on observed results (except when those contingencies are specified in advance, see above). This is more challenging for existing data, particularly when outcomes of the data have been observed or reported. Standards for effective preregistration using existing data do not yet exist.
When you create your research plan, you will identify whether existing data is included in your planned analysis. For some circumstances, you will describe the steps that will ensure that the data or reported outcomes do not influence the analytical decisions. Below are the categories for which preregistration may still use existing data.
Split incoming data into two parts: one for exploration, where you look for unexpected trends or differences, and one that is held back. Preregister tantalizing findings from the first part, then confirm them with the data that had been held back. “Model training” and “validation” are other terms for this process. Below are three papers that describe this process in more detail:
If your preregistration on the OSF is less than 48 hours old and has not yet been confirmed by its contributors, you can cancel it (see here for details).
If changes occur in your project after the registration is finalized, you have two options:
Option 1: Create a new preregistration with the updated information. After creating that preregistration, make a note of its URL and withdraw your original preregistration. In the withdrawal process, make a short note to explain the rationale for removing this registration and include the URL for the newly registered project.
Choose option 1 if you have made a serious error in your preregistration (such as accidentally including sensitive information that should not be shared) or if you have not yet started data collection.
Option 2: Start a Transparent Changes document now. Upload this document to the OSF project from which you started your registration and refer to it when reporting the results of your preregistered work.
Choose option 2 if you have already begun the study. It is expected that most preregistered studies will have some changes, so do not feel that this diminishes your study in any way. After all, your preregistration is a plan, not a prison.
Registered Reports are a particular publication format in which the preregistered plan undergoes peer review in advance of observing the research outcomes. However, in the case of Registered Reports, that review is about the substance of the research and is overseen by journal editors. Research designs that pass peer review are offered ‘in principle acceptance’ (IPA) ensuring that the results are guaranteed to be published regardless of findings, as long as the methodology is carried out as described.
After being granted IPA by a journal, you should ensure that that research plan is preserved. The journal may have a mechanism to do that, or you may use this workflow to register your accepted plan: https://osf.io/rr
No. Confirmatory analyses are planned in advance, but they can be conditional. A pre-analysis plan might specify preconditions for certain analysis strategies and what alternative analysis will be performed if those conditions are not met. For example, if an analysis strategy requires data for a variable to be normally distributed, the analysis plan can specify evaluating normality and an alternate non-parametric test to be conducted if the normality assumption is violated.
For conditional analyses, we suggest that you define a 'decision-tree' containing logical IF-THEN rules that specify the analyses that will be used in specific situations. Here are some example decision trees. In the event that you need to conduct an unplanned analysis, preregistration does not prevent you from doing so. Preregistration simply makes clear which analyses were planned and which were not.
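A decision tree of this kind can be written down as explicit IF-THEN rules. The sketch below encodes the normality example above; the function name, the .05 threshold for the normality check, and the two specific tests (an independent-samples t-test and a Mann-Whitney U test) are illustrative assumptions, not requirements of any preregistration template.

```python
def choose_test(normality_p, alpha_normality=0.05):
    """Pick the preregistered analysis from the p-value of a normality check.

    IF the normality check does not reject normality, THEN use the parametric
    test; OTHERWISE use the non-parametric alternative named in the plan.
    The threshold and test names are assumptions made for this example.
    """
    if normality_p >= alpha_normality:
        return "independent-samples t-test"
    return "Mann-Whitney U test"


# Hypothetical outcomes of the normality check.
print(choose_test(0.30))   # normality not rejected
print(choose_test(0.01))   # normality rejected
```

Writing the rule this way before seeing the data makes it clear that the choice between the two tests was driven by the prespecified condition and not by which test produced the more favorable result.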
There are several research circumstances that present challenges to conducting preregistered research.
If the present preregistration process does not fit your research approach effectively, and you believe that there are ways to conduct preregistered research in your field, we encourage you to contact us to help develop and specify a preregistration process for your work (email@example.com).
When you have many planned studies being conducted from a single round of data collection, you need to balance two needs: 1) creating a clear and concise connection from your final paper to the preregistered plan and 2) ensuring that the complete context of the conducted study is accurately reported.
Imagine a large study with dozens of analyses, some of which will be statistically significant by chance alone. A future reader needs to be able to obtain all of the results in order to understand the complete context of the presented evidence. With foresight, some of this challenge is minimized. Parsing one large data collection effort into different component parts may reduce the need to connect one part of the work to another, if the decision to make that distinction is made ahead of time in a data-independent manner.
The easiest way to organize such a complex project on the OSF is with components. These sub-projects can contain your individual analysis plans for different aspects of your larger study.
Finally, as is true with most recommendations, transparency is key. Disclose that individual papers are part of a larger study so that the community can understand the complete context of your work.
You may embargo your preregistration plan for up to 4 years to keep the details from public view. All registrations eventually become public because that is part of the purpose of a registry: to reduce the file-drawer effect (sometimes called the grey literature). Information about embargo periods is here. It is possible to withdraw your preregistration, but a notification of the withdrawal will be public. You may also end an embargo early; see here for instructions.
Maybe, but there are several pitfalls to be aware of. The first is that a forthcoming round of data collection is likely to be highly correlated with the previous round. If an individual was notable for one characteristic last year, they are likely to still be notable on that (or a related) trait. However, there are a few ways that preregistration can still be used to perform purely confirmatory analyses on forthcoming data.
In some cases, preregistration may not be possible. If you know the cohort well, then your ability to conduct confirmatory or inferential analyses on that population may be minimal. This does not diminish the value of the work, as exploratory work is essential for making discoveries and generating new hypotheses, but such work should not be presented using the tools designed for confirmation. Preregistering future cohort studies, reserving some of the data in a hold-out confirmatory set, and encouraging direct replications are often the best answer, despite the investments required.
Preregistration is relatively new to many people, so you may get questions from reviewers or editors during the review process. Below are some possible issues you may encounter and suggested strategies.
Possible editorial or reviewer feedback: Reviewers or editors may request that you remove an experiment, study, analysis, variable, or design feature because the results are null or marginal.
The issue: All preregistered analysis plans must be reported. Selective reporting undermines diagnosticity of reported statistical inferences.
Possible response to the editor: The results of these tests are included because they stem from prespecified analyses in order to conduct a confirmatory test. Removing these results because of their non-significance would perpetuate publication bias already present in the literature (Chambers et al., 2014; Simmons et al., 2011; Wagenmakers et al., 2012).
Notes: If the reviewer or editor proposes a reason why they believe the null result could be explained by a design flaw, it can often be helpful and appropriate to leave the test in, but discuss the reviewer's concerns about the validity of that particular test or design feature in a discussion section.
Possible editorial or reviewer feedback: Why are you referring to a preregistered plan and reporting its analyses separately from other analyses?
The issue: The published article must make clear which analyses were part of the confirmatory design (usually distinguished in the results section with confirmatory and exploratory results sections), and there must be a URL to the preregistration on the OSF.
Possible response to the editor: The registration was certified prior to the start of data analysis. This defines analyses that were prespecified and confirmatory versus those which were not prespecified and therefore exploratory. Clarifying this allows readers to see that the hypotheses, analyses, and design that were prespecified have been accurately and fully reported (Jaeger & Halliday, 1998; Kerr, 1998; Thomas & Peterson, 2012).
Possible editorial feedback: Editor requests that you perform additional tests.
The issue: Additional tests are fine; they just need to be distinguished clearly from the confirmatory tests.
Possible response to the editor: Yes, these additional analyses are informative. We made sure to distinguish them from our preregistered analysis plan, which is the most robust to alpha inflation. These analyses provide additional information for learning from our data.
If your project has a single data-collection effort, and if the 3 projects do not depend on one another (i.e., they could be conducted in parallel and are not sequential), then a single preregistration might be best, as long as you note in that preregistration that the results will be reported separately. You want to avoid the impression that the first paper to come out reports only a biased subset of the analyses. If you prespecify how results will be reported, you have a clear justification for this "selective reporting," which is problematic only if it is informed by unexpected trends in the dataset.
If your data collection efforts will be distinct or separate from one another (either in time or in methodology and organization), then multiple preregistrations will likely make the most sense.
If the studies include exploratory work that is designed to inform later confirmatory studies, then definitely wait to preregister until the exploratory work is completed. Make sure not to analyze, as part of the exploratory stage, any data that will also be used for the confirmatory work. If your design requires that a single data collection effort be used for both exploration and confirmation studies, then you can randomly hold out a portion of the data and use the rest for exploration before opening up the reserved portion for confirmation (see "Hold out data-sets or split samples" above).
If you've never preregistered before, go to osf.io/prereg to get started. If you need help, please see our support pages and help guides.
Oftentimes, the authors on the registration and the final publication do not match. This is usually due to the final article containing both preregistered and unregistered experiments, which is fine as long as the two are clearly labeled. We encourage you to leave authors on the preregistration if they are contributing to that preregistered work because they deserve that credit. It's okay for the author lists to not match perfectly as long as it's clear who did what and proper attribution is given.
Check out the resources available at cos.io/prereg. We also recommend this workshop from APS, where participants were asked to identify “holes” in preregistrations and fill them in with more specific criteria: https://osf.io/4acje/. On that page, we also recommend the checklists for complete analysis plans and complete reporting, the PowerPoint slides and recordings, and the Prereg Revolution.
We encourage authors to use their registration verbatim, and to cite their preregistration for clarity and discoverability. Use of quotations or changed tense from the future to the past can address self-plagiarism concerns. We encourage you to use similar language from prereg to final article because it keeps it consistent and concise for the reader.
The level of detail should be enough for an interested reader to be able to replicate the methods of the original study. We encourage you to take the perspective of your future audience: what would you want to know about the study methodology and analyses to enable you to better replicate or extend that research? We also encourage concise language, as the longer the preregistration is, the less likely it will be read in its entirety (though some length is unavoidable).
If the variables will not be used in testing the preregistered hypotheses, then you do not need to include them in the preregistration. It can sometimes be helpful to include them if you think the variables will be used in an exploratory or data-driven way, but it is not required. At a minimum, the variables used in testing preregistered hypotheses must be defined in the preregistration, and any additional variables could be included if you believe their inclusion will add clarity to the work.
Preregistering with existing data can be a bit tricky, but it can still be useful and beneficial. With existing data, it is impossible for the reader to know how much you knew prior to creating the preregistration. If you know the data intimately and understand how they are distributed, then the preregistration is greatly diminished in its power to mitigate bias. Preregistering what you know about the data helps the reader better assess what you knew before you began the project. This is the best you can do in certain situations. It makes transparent what you knew prior to creating the registration, and then it's up to the reader and the community to assess how much, if any, bias may have crept in.
Yes! Sharing your preregistration with the reviewers allows it to be used in the review process. As for anonymity in the review process, you can submit anonymized view-only links. NOTE: Be sure that any attached files or answers to registration questions do not contain any identifying information (including file titles!). The anonymized link removes the Author section of the form, but it cannot redact any information in a file.
Updates or amendments to a preregistration are permissible (in most cases) prior to analyzing the data (up until the outcomes of the study are known). For instance, if you have preregistered an analysis plan, but learn of a better technique before you have analyzed the data, then it is still okay to update your registration since you are not aware of the results of those initial analyses. What is not okay is updating the registration after results of the initial analyses are known to shift the analyses, as it starts to enter the territory of mining for statistical significance. In this case, you are encouraged to still run those analyses, but these must be labeled as exploratory or data-driven analyses. For more information on updating a preregistration, please see this blog post: https://cos.io/blog/preregistration-plan-not-prison/
Both referencing the preregistration in the methods section and providing a link to the preregistration itself are crucial when writing up preregistered work. The preregistration was written as a means to inform your readers what had been planned in the study, so it is vital they be able to access and read it. Preregistration is a great exercise for the author, but it loses nearly all its value if it cannot be read by others.
For preregistering studies using archival or public research data, it is important to disclose your prior knowledge and exposure to the data at the time of registering. The concern is to what extent the knowledge could have influenced or biased the analytical decisions in the preregistration. Disclosure is key, but too much prior knowledge of the data can impact the usefulness of prereg from a bias mitigation perspective.
The Preregistration Challenge was an education campaign that ended in 2018 and was supported by the Laura and John Arnold Foundation. The campaign included $1000 prizes for researchers who published the results of preregistered work. More information about the Prereg Challenge is available on this resources page.