Data Fabrication and Falsification

Group Members: Will Burnham, Kelly Heffernan, Alyssa Morgan, and Evan Russell 

Truth is something that scientists and members of the scientific community should always strive for. Truth emerges at the intersection of transparency, trust, and honesty, the qualities that mark sound science and the accurate dissemination of findings. However, scientists can fall short of this mark. Instead of reporting truthful findings, some choose the evidence they desire by manipulating it in their favor (Leng and Leng 2020, 159-172). This practice is known as data fabrication and falsification: scientists manipulate existing data or create new data with no basis in experimentation, motivated by the desire to produce data that reflects a desired outcome or matches a hypothesis. It occurs in scientific writing more often than one might expect. Once a paper containing fabricated data is published, other scientists and researchers cite it, and an ongoing spiral of falsification commences; when the fraud is discovered, the original paper and the work built upon it must be retracted. This webpage examines the motivations for fabrication, the resulting implications and consequences, methods of detection, and preventative measures and ethical considerations, in an effort to stop false data's destructive path through science.

Motivation for Fabrication & Falsification 

A phrase many people use today is "I believe in science." There are t-shirts (see Figure 1), book bags, and countless Instagram posts espousing the sentiment. This faith stems from the fact that society places scientists on a pedestal: they are expected to be trustworthy, unbiased, dedicated, and uncorrupted by ulterior motives. The reality, as the real-life examples throughout this webpage show, is much more nuanced. While many scientists follow ethical practices, many others fabricate and falsify the data from which they draw their conclusions. This is a dangerous practice; in fields like medicine, falsified drug-trial results can lead to serious injury for future patients who are prescribed the drug. With such severe consequences, why would scientists fabricate data?

Figure 1. This image shows a popular t-shirt design on the online retailer redbubble.com. The text on the t-shirt reads “Trust Science.” (Redbubble 2021). 

One reason could be that some scientists do not realize they are using poor data practices. In one heavily cited meta-analysis of survey data on data fabrication (Fanelli 2009, 1), 33.7% of scientists admitted to questionable research practices they themselves had conducted, while 72% reported having seen questionable research practices in their peers' work. People are evidently far less likely to admit faults in their own data than to notice them in others' (see Figure 2). For this reason, scientists may be fabricating and falsifying data without genuinely realizing it, due to their own inherent biases.

Figure 2. This image is taken from (Fanelli 2009, 6). It shows the disparity, in a survey, between scientists admitting to their own mistakes versus reporting their peers' mistakes. Note that QRP stands for questionable research practices.

Another reason is that many scientists today feel pressured to publish as many articles as they can, as quickly as they can, believing that sheer volume of publications and citations will build their reputations. One researcher held in high esteem at Bentley University was found to have fabricated data in two of his most prominent studies (Nurunnabi and Hossain 2019, 4). This kind of behavior can be explained by the pressure on many scientists to publish or perish: it is tempting to fudge the numbers in order to reach the statistical significance threshold needed for publication. Without set ethical guidelines to follow, problems like this can quickly become prevalent. Another researcher, Joachim Boldt, fabricated data in research on the safety of blood plasma substitutes (Mayor 2013, 1). As a result, certain plasma substitutes were believed to be safe when, as a later meta-analysis discovered, they actually increased rates of death and injury (Mayor 2013, 1). Boldt's desire for influence overcame his desire to genuinely help people, and patients may have died as a result.

Figure 3. This image shows the article authored by Adeel Safdar. A retraction warning has been placed on it to show that any conclusions or results drawn in it should not be relied upon for further research (Safdar et al. 2015, 1).

Another scientist who fabricated data is Adeel Safdar of McMaster University in Ontario (Safdar et al. 2015, 1). As shown in Figure 3, his article was retracted from the journal Skeletal Muscle due to fabrication of two images that led to false conclusions. Concerns about Safdar's scientific ethics were raised after he was accused of torture and domestic abuse of his wife and family, as seen in Figure 4. As his character was called into question, the scientific community also reexamined his work and discovered instances of data fabrication. Once a scientist's character is seen as flawed, their data collection and overall experimental designs are checked thoroughly to make sure they were not also dishonest in their work. Many other papers of Safdar's are being examined for data fabrication, and as of now three have been retracted.

Figure 4. This image shows article headlines from The Hamilton Spectator, a Canadian newspaper, outlining the domestic violence charges recently brought against Adeel Safdar (Editors of Hamilton Spectator 2021).

Overall, many different motivations lead scientists, consciously or unconsciously, to fabricate their data. Whatever the reason for engaging in this kind of false science, the consequences are severe and lasting for the entire scientific community, as the next section of this webpage examines.

Implications & Consequences of Fabricating & Falsifying Data

Figure 5. This Dilbert cartoon by Scott Adams appeared in the Los Angeles Times. In it, two researchers realize that the results of their experiment were not what they expected, and one suggests that they can simply "adjust" (change) the data so that it "supports" what they had anticipated. The cartoon captures how fabricating and falsifying data takes place in the scientific community.

When caught fabricating or falsifying data in a published scientific document, the corresponding author(s) and institution(s) face severe implications and consequences, some of which are listed in Figure 6.

Figure 6. Table 1, "Potential Consequences of Publication Fraud for Authors and Their Institutes," from "Preventing Publication of Falsified and Fabricated Data: Roles of Scientists, Editors, Reviewers, and Readers" in the Journal of Cardiovascular Pharmacology. The table lists nine different punishments for fabricating and falsifying data in scientific writing.

Scientists must abide by rules governing their data, and these rules are enforced with serious punishments. Beyond the consequences in Figure 6, punishments include loss of funding, termination of employment, and even imprisonment (Resnik 2013, 1). David B. Resnik, a bioethicist, argues that such rules with harsh penalties would not exist if scientists did not believe that one goal of scientific research is to establish the truth of theories and hypotheses about the unknown. Not everyone accepts this claim, notably the constructive empiricist Bas van Fraassen. Constructive empiricists hold that the goal of science is to produce theories that are empirically adequate, that is, correct about what can be observed and established through experiment, rather than true (Resnik 2013, 1). Van Fraassen and those who share his view therefore have to explain why rules against data fabrication and falsification do not threaten their philosophy of science.

Figure 7. This article from Science magazine reports on Luk van Parijs, a former MIT researcher whose contract with MIT was terminated after he was found to have committed data fabrication in some of his published work on RNA interference.

In 2005, Luk van Parijs, an MIT researcher studying RNA interference, had his contract with MIT terminated after committing data fabrication and falsification in five grant applications, ten scientific manuscripts (seven of which were published), an unpublished book chapter, and several presentations, dating back as far as 1997 (Couzin 2005, 1). As a researcher, he was trying to use RNA interference, a method that can change gene expression, as a tool for studying normal physiology and disease (Couzin 2005, 1). Researchers at the California Institute of Technology (Caltech), where he worked before MIT, were surprised that he would fake anything, as he was regarded as an excellent scientist (Couzin 2005, 1). After confessing to falsifying data in some of his published works, he agreed not to take part in any work funded by the United States government from 2005 to 2010. He no longer participates in any scientific research and does not respond to voicemail or email interview requests from news outlets.

Figure 8. The top two maps are from ‘Ireland after NAMA’ and show the median prices for properties in different counties of Ireland in 2010 and 2012. The bottom two maps are also from ‘Ireland after NAMA’ and show the change in median price for properties from 2010 to 2012, both in actual value and percent change.

Another case took place in Ireland in October 2012, when the Property Services Regulatory Authority (PSRA), which is responsible for licensing and regulating property services providers in Ireland, launched the Residential Property Price Register (RPPR). Rob Kitchin, author of the book Data Lives: How Data Are Made and Shape Our World, and his colleague Eoghan McCarthy decided to examine the RPPR datasets of individual residential property sale prices across Ireland to see the geographic pattern of prices. They discovered errors in the data: McCarthy found more than 200 possible errors, identified by sorting the data by price value (Figure 8), and these were confirmed by a real-estate economist, Ronan Lyons (Kitchin 2021, 37). After Lyons and his colleagues compared the RPPR to their nationwide sales figures, they removed the 7.7 percent of properties that showed a significant difference between the advertised and recorded sale prices (Kitchin 2021, 37). The PSRA decided not to make any corrections to the RPPR, since the integrity of the data set was the responsibility of a third party, the Revenue Commissioners. The overarching consequence is that the errors persist for a long time, and some may never be corrected. Although this case does not come from scientific writing, it shows that the problem of fabricated and falsified data is ongoing and worldwide across fields.

Figure 9. These two graphs are from the Science magazine article "Researcher at the center of an epic fraud remains an enigma to those who exposed him" (Tide of Lies). The graph titled "Total scientific output" shows the number of scientific papers Yoshihiro Sato published throughout his career, until he died in 2016. The graph titled "Clinical trials" shows the number of patients in 33 of his clinical trials.

One other case was uncovered in March of 2017 by Alison Avenell, a clinical nutritionist at the University of Aberdeen in the United Kingdom. After reading through the papers of Yoshihiro Sato, a bone researcher at a hospital in southern Japan, with three colleagues in New Zealand, she found that Sato "had fabricated data for dozens of clinical trials published in international journals" (Kupferschmidt 2018, 1). At the time of this discovery, Sato's whereabouts were unknown, and it was unclear whether he had committed suicide, as the investigators were "aware of the culture in Japan and the dishonor something like this could bring" on a person (Kupferschmidt 2018, 1). This episode of fabrication and falsification was one of the biggest in scientific history. The worldwide ripple effect "of his fabricated reports, many of them on how to reduce the risk of bone fractures," went far and wide (Kupferschmidt 2018, 1) (Figure 10). Other researchers performed new trials of his experiments, enrolling thousands of real patients, which ultimately ended up "exposing Sato's lies and correcting the literature" (Kupferschmidt 2018, 1). One open question was whether Sato's colleagues, the co-authors of many of the falsified papers, knew what was going on. In particular, people wondered whether the other doctors at his hospital ever read his work "and whether the Japanese scientific community ever questioned how he managed to publish more than 200 papers (Figure 9), many of them ambitious studies that would have taken most researchers years to complete" (Kupferschmidt 2018, 1). Although the investigators were able to prove that fraud had been committed, they were still unsure of "the personal and cultural factors" that drove him to it (Kupferschmidt 2018, 1). In the end, "21 of Sato's 33" trials were retracted by either the journals or Sato himself (Kupferschmidt 2018, 1). As of the article's publication, he was number six on "Retraction Watch's list of researchers who have racked up the most retractions" (Kupferschmidt 2018, 1). In an interview with Avenell, Satoshi Ogawa, Sato's lawyer, described a note written by Sato: a suicide letter saying that he was sorry for what he had done and that he was going to take his own life. Sadly, there was no chance to intervene, as the note was received after his death.

Figure 10. This diagram is from the Science magazine article "Researcher at the center of an epic fraud remains an enigma to those who exposed him" (Tide of Lies). It shows the top 12 trials performed by Yoshihiro Sato that were widely cited by other researchers, scientists, and institutions. More specifically, it shows when each trial took place, the number of patients involved, the number of references made to the trial by year, and when the documents citing the trial were retracted.

 

It is evident from these examples alone that the falsification and fabrication of data does not come without consequences. Its effects can powerfully alter the course of individual lives and of science as a whole, often for the worse.

Methods of Detecting Fabrication & Falsification 

Detecting flaws in data is the first step to recognizing and retracting falsified or fabricated data and preventing its far-reaching consequences. Detection is not a simple process, however; it requires standards for measuring the accuracy of various types of data (Kitchin 2021, 41). As the data researcher Rob Kitchin puts it, "The fact that myself and my colleagues continually struggle to discover such information and have to extensively practice data wrangling has made it clear to us that there is still a long way to go to improve data quality and its reporting" (Kitchin 2021, 44). Kitchin highlights that even those well versed in analyzing data quality struggle to determine how reliable data truly is and whether a work should be retracted, that is, withdrawn from publication. In short, there is no clear-cut way to determine whether data is true or false, or simply influenced by bias. Nevertheless, data falsification can be recognized, and recognition can result in the successful retraction of unreliable works, as shown in Figure 11 (Nurunnabi and Hossain 2019, 116). Data falsification often has warning signs, which can be applied case by case in the various fields where data is relevant.

Figure 11. This chart, created by Mohammad Nurunnabi, shows the distribution of reasons why articles are retracted, with the majority of cases due to data fabrication or falsification. It exemplifies how large the problem of falsified data is, while also showing that falsification can be recognized and handled accordingly.

One major concern with data is representativeness: how well does the data actually reveal what it was attempting to measure in the first place (Kitchin 2021, 41)? Misrepresentation is not always intentional, but one's manner of interpreting and conveying information can blur the true meaning of data, and it becomes data falsification when the distortion is deliberate. Between collecting data and disseminating it, some party is always in charge of interpretation, bridging the gap between those two steps (Leng and Leng 2020, 159-172). Data cannot interpret itself, and this is often where representativeness becomes a problem (Leng and Leng 2020, 159-172). Processes such as extraction, abstraction, generalization, and sampling all involve points where bias and imprecision can creep in (Kitchin 2021, 41). The method of data collection also affects how data is interpreted and conveyed, including factors such as sample size, demographics, and the population from which the data is derived (Kitchin 2021, 41). For example, a study with a very small sample size will be much less reproducible and will have lower statistical power than the same study run on a larger sample. Any generalizations from such data must be placed in the context of the small sample size, to avoid drawing conclusions that would not hold at larger scale; the same thought process must be applied to all data. Certain deceptive aspects of data collection are not always easy to recognize, especially for the general public, but an awareness of where data can become flawed is crucial, as intentional misrepresentation is directly linked to data fabrication.
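
To make the sample-size point concrete, below is a minimal sketch, in Python with the statsmodels library, of a statistical power calculation. The effect size, significance level, and power target are illustrative assumptions, not values taken from any study cited on this page.

```python
# A minimal sketch of a statistical power calculation using statsmodels.
# The effect size, alpha, and power targets below are illustrative
# assumptions, not values from any study discussed on this page.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group needed to detect a medium effect (Cohen's d = 0.5)
# at the conventional alpha = 0.05 with 80% power: about 64 per group.
n_required = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Participants needed per group: {n_required:.0f}")

# Conversely, the power a small study of 10 per group actually achieves
# for the same effect: only about 0.18, so true effects are easily missed.
small_power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=10)
print(f"Power with 10 per group: {small_power:.2f}")
```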

Despite the importance of individual awareness of where data may be falsified, whether intentionally or through unnoticed bias, there are also international-level efforts to detect poor data (Kitchin 2021, 42). Formal measurement systems exist for classifying data as proper or poor quality. These measures, shown in Figure 12, include veracity, completeness, timeliness, coverage, accessibility, lineage, and provenance (Kitchin 2021, 42). However, not everybody follows these standards, and it can be challenging to apply such general standards to specific fields. Often, efforts are made only when flaws in the data would affect a larger crowd, making it more likely that someone notices (Kitchin 2021, 42). For these reasons, when analyzing data and research, it is crucial to consider who the researcher's intended audience is and how that could shape the researcher's motives and possible shortcomings.

Figure 12. The visual above outlines the eight measures of quality data that are used on an international scale. Each of these measures are general and can be applied to specific fields or research in an individualized manner to get a better idea of just how reliable data is. Although there is no singular way to resolutely determine if data is true or false, these measures bring researchers closer to a conclusion on the reliability of data.
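
As an illustration of how some of these measures could be checked in practice, here is a minimal Python sketch, using pandas, of automated completeness and timeliness checks. The dataset, column names, and thresholds are hypothetical, chosen only for illustration; real quality audits would be tailored to the specific dataset and field.

```python
# A minimal sketch of automated checks for two of the quality measures
# above: completeness and timeliness. The dataset, column names, and
# thresholds are hypothetical, chosen only for illustration.
import pandas as pd

# Stand-in for a real dataset such as a property price register.
df = pd.DataFrame({
    "address": ["1 Main St", "2 High St", None, "4 Mill Rd"],
    "price": [250000, None, 310000, 185000],
    "date_recorded": pd.to_datetime(
        ["2012-01-15", "2012-03-02", "2012-02-20", "2011-11-30"]),
})

# Completeness: fraction of missing values in each column.
missing_fraction = df.isna().mean()
print("Columns failing a 5% completeness threshold:")
print(missing_fraction[missing_fraction > 0.05])

# Timeliness: how stale is the most recent record?
staleness = pd.Timestamp.now() - df["date_recorded"].max()
if staleness > pd.Timedelta(days=90):
    print(f"Data may be stale: newest record is {staleness.days} days old")
```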

One field where data fabrication is prevalent and especially dangerous is clinical trials, which evaluate new treatments and medicines. Accidents in data collection and analysis can happen, which is why quality standards for researching, experimenting, and citing must be established and made known to everyone involved with research. To help identify falsification, however, clear distinctions can be drawn between accidental errors and purposely falsified data (Al-Marzouki et al. 2005, 267). In one study, the data of two clinical trials were analyzed to determine whether they were reliable or flawed in some way (Al-Marzouki et al. 2005, 267). The first was a diet trial, a controlled trial of how fruits and vegetables in the diet affect patients with coronary heart disease, whose patients were stated to have been randomly assigned to experimental and control groups (Al-Marzouki et al. 2005, 267). The other was a randomized drug trial studying how drug treatment affected patients with mild hypertension (high blood pressure) (Al-Marzouki et al. 2005, 267). The data for both trials could be analyzed quantitatively, both separately and in comparison to one another, to recognize trends that could indicate fabrication or falsification.

In an effort to recognize falsification, the data was first checked for digit preference, defined as "the habit of reporting certain end digits more often than others" (Camarda, Eilers and Gampe 2015, 895). Unlike machine-recorded data, human-recorded data often shows preference for certain numbers, typically multiples of 5 or 10, or even numbers rather than odd, and nudging measurements toward these favored digits threatens the accuracy of the data (Camarda, Eilers and Gampe 2015, 895). For example, if a study's data predominantly end in even digits, it is plausible that values were rounded to make the numbers seem more favorable. Digit preference can therefore be a strong indication of falsified or unreliable data. Although digit preference can arise by chance, the two clinical trials examined in the study should have shown similar patterns of preference, given the stated randomization of the experimental and control groups (Al-Marzouki et al. 2005, 268). Instead, the results showed that the diet trial had notable differences in standard deviations for height and cholesterol measurements, while the digit preference for all variables was very strong (Al-Marzouki et al. 2005, 268). This led the researchers to believe that the data was not entirely raw and unaltered by its experimenters.
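
The terminal-digit check described above can be made concrete with a short sketch. The following Python code, using scipy, tallies final digits and runs a chi-square goodness-of-fit test against a uniform expectation. The measurements are made-up values, and the assumption that honest final digits should be roughly uniform holds only for sufficiently fine-grained measurements.

```python
# A minimal sketch of a terminal-digit (digit preference) check. Under
# honest, fine-grained measurement, final digits are often expected to be
# roughly uniform, so a chi-square goodness-of-fit test can flag
# suspicious clustering. The measurements below are made-up values used
# only for illustration; a real check would need far more observations.
from collections import Counter
from scipy.stats import chisquare

measurements = [142, 135, 150, 128, 145, 140, 155, 130, 148, 160,
                138, 125, 144, 152, 136, 141, 129, 147, 133, 156]

# Tally how often each final digit (0-9) appears.
counts = Counter(int(str(abs(m))[-1]) for m in measurements)
observed = [counts.get(d, 0) for d in range(10)]

# Chi-square goodness-of-fit test against a uniform expectation
# (equal counts for every final digit).
stat, p_value = chisquare(observed)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Final digits deviate from uniform: possible digit preference")
```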

Another major difference between the datasets was the magnitude of the P values (Al-Marzouki et al. 2005, 268). A P value is a statistical measure of the strength of evidence: it gives the probability of obtaining results at least as extreme as those observed if chance alone were at work, with smaller P values indicating stronger evidence (Nuzzo 2014, 152). The P value is a strong motivator for scientists, since it can determine how a researcher's work is received and, therefore, whether it is cited. Analyzing P values is thus a necessary step in determining whether data was falsified to appear more significant than it truly is. There were noticeable differences in mean and variance between baseline variables in the diet trial, as shown in Figure 13, indicating that the groups were not randomly allocated as the author had claimed (Al-Marzouki et al. 2005, 268). The P values, together with the significant difference in digit preference, served as evidence that the flaws in the data did not occur by chance (Al-Marzouki et al. 2005, 268). Although one irregular pattern on its own, such as digit preference, is not a definite indication of fabrication (it could simply reflect different people recording data for each group), the combination of digit preference with differences in means and variances points toward data fabrication. As this analysis exemplifies, the tools used to evaluate data are specific to the field and the data presented: this clinical trial concerned mainly quantitative data, and the same statistical tools may not be useful for judging the validity of data in fields that rely on more qualitative analysis.

Figure 13. This table, taken directly from the study comparing the diet and drug clinical trials, shows the data examined to detect falsification and fabrication: the P values and degrees of freedom for digit preference in both trials. It highlights how the drug study yielded data consistent with randomization, while the diet study showed greater variation in standard deviations, signaling possible fabrication.
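
The logic behind comparing baseline variables can also be sketched in code. In a truly randomized trial, baseline differences between groups arise only by chance, so P values from many baseline comparisons should be roughly uniform between 0 and 1, and a pile-up of extreme P values is a warning sign. The Python sketch below, using scipy, simulates this check on made-up data; it illustrates the general idea rather than reproducing Al-Marzouki et al.'s exact methods.

```python
# A minimal sketch of a baseline-randomization check, run on simulated
# data. If patients were truly randomized, two-sample t-test P values
# across many baseline variables should be roughly uniform on [0, 1];
# systematic departures suggest the groups were not formed by chance.
import numpy as np
from scipy.stats import ttest_ind, kstest

rng = np.random.default_rng(seed=0)
n_vars, n_per_group = 20, 50

# Simulate 20 baseline variables for two genuinely randomized groups.
group_a = rng.normal(loc=0.0, scale=1.0, size=(n_vars, n_per_group))
group_b = rng.normal(loc=0.0, scale=1.0, size=(n_vars, n_per_group))

# One P value per baseline variable.
p_values = [ttest_ind(a, b).pvalue for a, b in zip(group_a, group_b)]

# Kolmogorov-Smirnov test of the P values against Uniform(0, 1).
stat, p_uniform = kstest(p_values, "uniform")
print(f"KS statistic = {stat:.3f}, p = {p_uniform:.3f}")
if p_uniform < 0.05:
    print("Baseline P values deviate from uniform: randomization suspect")
```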

It is evident through this single example just how shaky the evaluation of data fabrication can be. How can we truly be sure that data is made up or true? Can one ever be 100% certain that chance played a part in certain results? This field of recognizing and abolishing data fabrication is constantly evolving as more cases are studied in various fields, but for now, we have to work with existing evidence and intuition. The difficulty in spotting data fabrication and falsification only supports the idea that this practice is dangerous to the integrity of science.

Preventative Measures & Ethical Practices 

Preventative measures and ethical practices are vital to creating an environment and culture in the scientific community that strives to prevent falsified and fabricated data. Preventative measures are found at two places in the research timeline, the sequence of steps we take to conduct research: it begins with a question, followed by a hypothesis, testing, a conclusion, and finally publication of the results.

 

Figure 14. Progression of the scientific process, and where to stop data fraud and falsification in the timeline.

The first point of prevention in the timeline comes before falsified and fabricated data is created, in the testing stage. Many examples of fraudulent work, however, are caught only after publication. For this reason, the second point focuses on snuffing out falsified and fabricated data after it has been created but before it is published, so that the incorrect information is never shared.

For a long time, science education has been about broadening the scope of our understanding; now things may be shifting toward teaching students ethics in science (Reiss 1999). In a perfect world, science would be performed on a completely transparent and honest basis. While not all scientists can be perfect, it is still important to create a culture of honest work, and this can begin early in a student's education. Efforts to educate students about data fraud will not only "increase awareness, they will also encourage a mindset in which issues can be discussed earlier and easier" (Korte and van der Heyden 2017). Students educated about the dangers will be less likely to fabricate or falsify data, and this increased awareness should also help break down the detrimental "publish or perish" culture found in labs around the world.

But learning shouldn't stop with formal education up to and through college. It is important that scientists continue to learn about ethics and honest work during their careers. Many have explored web-based learning as a way to continue our understanding of research ethics; the ethicist Michael Pritchard found that post-baccalaureate web-based learning has the potential to create a collective responsibility for the research being done (Pritchard 2005). This increased awareness among everyone involved in the research could help prevent the use of fraudulent and falsified data, or help others notice it before publication. Learning about ethical research conduct is often subtler than a structured program, however; it can also be found in codes of ethics. A code of ethics is a guide adopted by a particular organization or scientific community that helps us distinguish right from wrong. Scientists are sometimes not even aware that they have created fraudulent data, but frequently referencing and being educated in a code of ethics helps ensure that they operate within the guidelines that lead to "good science." Many organizations draft their own codes. For example, a group of students at Worcester Polytechnic Institute drafted a code of ethics for robotics, a relatively new field in the grand scheme of science, which included the pledge "[To this end and to the best of my ability I will] not knowingly misinform, and if misinformation is spread do my best to correct it" (Ingram et al. 2010). Most codes of ethics and editorial policies include a piece with this same idea; the Association of Clinical Research Professionals, for instance, set forth a code for scholarly work stating that researchers should "Report research findings accurately and avoid misrepresenting, fabricating or falsifying results" (ACRP 2020). Ensuring that scientists adhere to their respective codes often means that they are actively avoiding data fraud and fabrication by not spreading misinformation. Proactive solutions that catch and prevent problems before they happen are better, but reactive measures are often necessary when the proactive ones fail.

Figure 15. A flow chart describing the steps of peer review.

While the scientific community tries to prevent data fraud in the first place, fraud still happens, so it is important to have measures for keeping fraudulent data from being published; this means performing checks like peer review and replica trials. Peer review lets other members of the scientific community review publications and verify the work, exposing inconsistencies and false findings: other members of one's academic field read through the work to identify pieces that don't make sense or are inconsistent with reality. There are many varieties of and techniques for peer review, but "Many believe [open review] is the best way to prevent malicious comments, stop plagiarism, prevent reviewers from following their own agenda, and encourage open, honest reviewing" (Elsevier 2021). Figure 15 shows the steps of the peer review process; it is very thorough and has multiple failsafes to catch fraudulent or falsified work. In addition to peer review, some schools may have the resources for an in-house reviewer: someone with formal training in reviewing scholarly writing for things like reproducibility, accuracy, and the presence of falsified or fraudulent data (Winchester 2018).

Sometimes peer review is not thorough enough to test the validity of a particular result from a study; this is the function of replica trials. If someone tells you it's sunny outside, you will probably still look out the window before you head out: you never know, it could be raining. When researchers publish detailed methods, other researchers can perform the exact same work and compare the results for validity. One challenge with replica trials is ensuring that a scientist's work is in fact detailed enough to be reproducible, meaning another scientist can accurately reproduce the exact experiment and compare the resulting data to check for agreement between the independent trials. Several techniques are seen as good ways to ensure replicable data: a 2016 survey of scientists across a variety of fields found that 90% of respondents deemed that experimental reproducibility could benefit from more robust experimental design, better statistics, and better mentorship (Baker 2016). Another way to prevent data fraud through better reproducibility is simply to remove the temptation, by keeping strict records of who performs which experiments and locking files after data collection to prevent manipulation. If researchers choose to leave data out, there must be a strong justification for excluding it. In the end, scrutiny and verification of others' work by either method is an effective and respectful way to prevent the publication of fraudulent data.
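
The idea of "locking files after data collection" can be implemented in minimal form by recording a cryptographic fingerprint of the raw data file when collection ends and verifying it before analysis. The Python sketch below uses SHA-256 hashes; the file names are hypothetical, and a real system would store the fingerprint somewhere the researcher cannot quietly rewrite.

```python
# A minimal sketch of "locking" a data file after collection: record a
# SHA-256 fingerprint of the raw file when data collection ends, then
# verify it before analysis or publication. Any later edit to the file
# changes the hash. File names here are hypothetical.
import hashlib

def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# At the end of data collection: store the fingerprint.
with open("trial_data.sha256", "w") as f:
    f.write(sha256_of("trial_data.csv"))

# Later, before analysis: verify the file has not been altered.
with open("trial_data.sha256") as f:
    stored_hash = f.read().strip()
if sha256_of("trial_data.csv") == stored_hash:
    print("Data file unchanged since collection")
else:
    print("WARNING: data file has been modified")
```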

Preventing data fraud is an ongoing challenge for the scientific community, and methods of prevention are constantly improving as we educate future generations of scientists and review each other's work with a more critical lens. Data falsification and fabrication is a widespread issue with the power to endanger the integrity of science and the trustworthiness of researchers. Although there is no definitive way to stop data fraud, the problem must be addressed by spreading awareness and recognizing trends. By understanding why scientists falsify and fabricate data and how it affects both individual scientists and the world of science as a whole, tools can be implemented to recognize and prevent data falsification. Science is the foundation of our understanding of how the world works, so it is our responsibility as scientists to continuously search for truth and uphold the prestige of science.

 

Bibliography

Al-Marzouki, Sanaa, Stephen Evans, Tom Marshall, and Ian Roberts. “Are These Data Real? Statistical Methods For The Detection Of Data Fabrication In Clinical Trials.” BMJ: British Medical Journal 331, no. 7511 (2005): 267-70. Accessed April 8, 2021. http://www.jstor.org/stable/25460301

Ingram, Brandon, Daniel Jones, Andrew Lewis, and Matthew Richards. “A Code of Ethics for Robotics Engineers.” March 6, 2010.

Couzin, Jennifer. “MIT Terminates Researcher over Data Fabrication.” Science 310, no. 5749 (2005): 758. Accessed April 8, 2021. http://www.jstor.org/stable/3842728.

Dyer, Clare. “Diabetologist and Former Journal Editor Faces Charges of Data Fabrication.” BMJ: British Medical Journal 356 (2017). Accessed April 12, 2021.  https://www.jstor.org/stable/26949701

Eisenach, James C. “Data Fabrication and Article Retraction: How Not to Get Lost in the Woods.” Anesthesiology 110, no. 5 (2009): 955–56. https://doi.org/10.1097/ALN.0b013e3181a06bf9.

Fanelli, Daniele. “How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data.” PsycEXTRA Dataset, 2009.  https://doi.org/10.1037/e521122012-010.

Fong, Eric A., and Allen W. Wilhite. “Authorship and Citation Manipulation in Academic Research.” PLOS ONE 12, no. 12 (2017): e0187394. https://doi.org/10.1371/journal.pone.0187394.

Fuentes, Gabriel A. “Federal Detention and ‘Wild Facts’ during the COVID-19 Pandemic.” The Journal of Criminal Law and Criminology 110, no. 3 (2020): 441–76. Accessed April 8, 2021. https://www.jstor.org/stable/48573788.

García-Pérez, Miguel Ángel. “Bayesian Estimation with Informative Priors Is Indistinguishable from Data Falsification.” The Spanish Journal of Psychology 22 (2019). https://doi.org/10.1017/sjp.2019.41.

George, Stephen L, and Marc Buyse. “Data Fraud in Clinical Trials.” Clinical Investigation 5, no. 2 (2015): 161–73. https://doi.org/10.4155/cli.14.116.

Gropp, Robert E., Scott Glisson, Stephen Gallo, and Lisa Thompson. “Peer Review: A System under Stress.” BioScience 67, no. 5 (May 1, 2017): 407–10. https://doi.org/10.1093/biosci/bix034.

Kupferschmidt, Kai. “Researcher at the Center of an Epic Fraud Remains an Enigma to Those Who Exposed Him.” Science, August 17, 2018. https://www.sciencemag.org/news/2018/08/researcher-center-epic-fraud-remains-enigma-those-who-exposed-hi.

Khaled, K. F. “Scientific Integrity in the Digital Age: Data Fabrication.” Research on Chemical Intermediates 40, no. 5 (2014): 1815–49. https://doi.org/10.1007/s11164-013-1084-5.

Kitchin, Rob. “In Data We Trust.” In Data Lives: How Data Are Made and Shape Our World, 37-44. Bristol, UK: Bristol University Press, 2021. Accessed April 8, 2021. doi:10.2307/j.ctv1c9hmnq.9.

Korte, Sanne M., and Marcel A. G. van der Heyden. “Preventing Publication of Falsified and Fabricated Data: Roles of Scientists, Editors, Reviewers, and Readers.” Journal of Cardiovascular Pharmacology 69, no. 2 (February 2017): 65–70. https://doi.org/10.1097/FJC.0000000000000443.

Leng, Gareth, and Rhodri Ivor Leng. “Where Are the Facts?” Essay. In The Matter of Facts: Skepticism, Persuasion, and Evidence in Science, 159–72. Cambridge, MA: The MIT Press, 2020. 

Lu, Zaiming, Qiyong Guo, Aizhong Shi, Feng Xie, and Qingjie Lu. “Downregulation of NIN/RPN12 Binding Protein Inhibit the Growth of Human Hepatocellular Carcinoma Cells.” Molecular Biology Reports 39, no. 1 (2011): 501–7. https://doi.org/10.1007/s11033-011-0764-8.

Mayor, Susan. “Questions Raised over Safety of Common Plasma Substitute.” BMJ: British Medical Journal 346, no. 7896 (2013): 4. Accessed April 14, 2021.  http://www.jstor.org/stable/23494149.

Baker, Monya. “Is There a Reproducibility Crisis?” Nature 533 (May 26, 2016): 452–54.

National Academies of Sciences, Engineering, and Medicine. Reproducibility and Replicability in Science. The National Academies Press, 2019. Accessed April 26, 2021. https://doi.org/10.17226/25303.

National Academy of Sciences. “Research Misconduct.” In On Being a Scientist: A Guide to Responsible Conduct in Research, 3rd ed. National Academies Press, 2009. https://www.ncbi.nlm.nih.gov/books/NBK214564/.

Nurunnabi, Mohammad, and Monirul Alam Hossain. “Data Falsification and Question on Academic Integrity.” Accountability in Research 26, no. 2 (2019): 108–22.  https://doi.org/10.1080/08989621.2018.1564664.

Nuzzo, Regina. “Statistical Errors.” Nature 506 (February 13, 2014): 150–52. https://www.nature.com/collections/prbfkwmwvz

Open Science Collaboration. “Estimating the Reproducibility of Psychological Science.” Science 349, no. 6251 (August 28, 2015): aac4716–aac4716. https://doi.org/10.1126/science.aac4716.

“Peer-Review Fraud — Hacking the Scientific Publication Process.” Accessed April 26, 2021. https://search-proquest-com.ezpxy-web-p-u01.wpi.edu/docview/1750062674?accountid=29120&pq-origsite=primo.

Pritchard, Michael S. “Teaching Research Ethics and Working Together.” Science and Engineering Ethics 11, no. 3 (September 1, 2005): 367–71. https://doi.org/10.1007/s11948-005-0005-4.

Reiss, Michael J. “Teaching Ethics in Science.” Studies in Science Education 34, no. 1 (January 1, 1999): 115–40. https://doi.org/10.1080/03057269908560151.

Resnik, David B. “Data Fabrication and Falsification and Empiricist Philosophy of Science.” Science and Engineering Ethics 20, no. 2 (2013): 423–31. https://doi.org/10.1007/s11948-013-9466-z.

The Association of Clinical Research Professionals. “Code of Ethics.” ACRP, June 17, 2020. https://acrpnet.org/about/code-of-ethics/.

Safdar, Adeel, Konstantin Khrapko, James M. Flynn, Ayesha Saleem, Michael De Lisio, Adam P. Johnston, Yevgenya Kratysberg, et al. “RETRACTED ARTICLE: Exercise-Induced Mitochondrial p53 Repairs MtDNA Mutations in Mutator Mice.” Skeletal Muscle 6, no. 1 (2015). https://doi.org/10.1186/s13395-016-0075-9.

“What Is Peer Review?” Accessed April 26, 2021. https://www.elsevier.com/reviewers/what-is-peer-review.

Winchester, Catherine. “Give Every Paper a Read for Reproducibility.” Nature 557, no. 7705 (May 2018): 281–281. https://doi.org/10.1038/d41586-018-05140-x.

Image Citations:

Adams, Scott. Dilbert. Los Angeles Times. https://enewspaper.latimes.com/infinity/article_share.aspx?guid=87c2c088-17ed-424e-afd7-a784b0e73086.

Editors of Hamilton Spectator. “Adeel Safdar Domestic Violence Trial.” The Hamilton Spectator. Accessed April 26, 2021. https://www.thespec.com/news/hamilton-region/adeel-safdar.html.

Ireland after NAMA. “The Geography of Actual Sales Prices.” October 4, 2012. Accessed April 28, 2021. https://irelandafternama.wordpress.com/2012/10/04/the-geography-of-actual-sales-prices/.

Korte, Sanne M., and Marcel A. G. Van Der Heyden. “Preventing Publication of Falsified and Fabricated Data: Roles of Scientists, Editors, Reviewers, and Readers.” Journal of Cardiovascular Pharmacology 69, no. 2 (2017): 65-70. doi:10.1097/fjc.0000000000000443.

Redbubble. “Trust Science (Black BG) Classic T-Shirt.” Accessed April 26, 2021. https://www.redbubble.com/i/t-shirt/Trust-Science-Black-BG-by-Thelittlelord/48496274.IJ6L0.XYZ.