24 February 2016, by ClinCapture Team
Billions of clinical measurements are recorded every day, generating stores of medical data beyond what anyone could have imagined just a few decades ago. These contents are known as big data, and the clinical research industry, although slow to recognize its potential, is now eager to put to use electronic records of everything from the notes doctors jot down about patients to clinical trial results.
“With enormous data come enormous opportunity,” reads the line in bold on the website of Nicholas Tatonetti, an expert in biomedical informatics, an interdisciplinary field that combines information technology and medicine.
It’s an optimism shared widely across the healthcare industry, and researchers imagine a revolution in clinical trials.
Big Data in Clinical Trials: The Hopes
“In the big-data era, the mysteries of many chronic diseases will be revealed,” said Stephen Wang, the founder of a Chinese bioscience publishing house.
Likewise, in oncology, clinicians see the possibility of using personal medical and population genomics data to improve clinical trials. For example, they can better recruit subjects likely to respond to treatments and exclude those likely to have side effects.
That could improve the rate at which drugs make it through expensive, time-consuming clinical trials, and improve outcomes for cancer patients.
The optimism also extends to finances. McKinsey & Co. analysts estimated that using big data to make more efficient decisions could generate up to $100 billion in value annually across the U.S. healthcare system.
Big Data in Clinical Trials: The Myths
With so much hype about big data, it can be difficult to separate fact from fiction.
The term itself can cause misunderstanding. Perhaps the simplest way to define big data is as a collection of data that is roughly larger than one terabyte, or too big to handle using standard software and analytical processes.
To put that into context, by one estimate, a terabyte of data would fill about 1,400 CD-ROMs, 220 DVDs, or 40 single-layer Blu-ray Discs. Now consider that by 2020, the number of digital bits we produce is expected to equal the number of stars in the universe.
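For readers who want to check those figures, here is a minimal back-of-the-envelope sketch (in Python, not from the original article) that converts one terabyte into disc counts, assuming the nominal capacities of 700 MB per CD-ROM, 4.7 GB per single-layer DVD, and 25 GB per single-layer Blu-ray. The estimates above are rounded from these results.

import math

# Nominal disc capacities in decimal units, as vendors quote them (assumed values)
CD_MB = 700            # CD-ROM: 700 MB
DVD_GB = 4.7           # single-layer DVD: 4.7 GB
BLURAY_GB = 25.0       # single-layer Blu-ray: 25 GB

TB_IN_MB = 1_000_000   # 1 terabyte = 1,000,000 MB (decimal convention)

cds = math.ceil(TB_IN_MB / CD_MB)                   # 1429, i.e. "about 1,400"
dvds = math.ceil(TB_IN_MB / (DVD_GB * 1000))        # 213, rounded up to "about 220"
blurays = math.ceil(TB_IN_MB / (BLURAY_GB * 1000))  # exactly 40

print(f"1 TB fills roughly {cds} CDs, {dvds} DVDs, or {blurays} Blu-rays")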
The question then becomes: how do you find the gold among the dross?
Artificial intelligence and machine learning are driving the development of tools that can process that amount of information. Supporting these efforts is the $200 million President Barack Obama invested in a National Big Data Research and Development Initiative.
But big data is not so big that flaws become irrelevant to an accurate analysis, especially in a clinical trial, where good information is critical to participants’ safety and the validity of results.
Even the solutions can have drawbacks. Take storage, for example. Cloud technology (using off-site servers to store data) has developed in tandem with big data as a place to keep the ever-growing volume of information being collected. In clinical trials, cloud technologies also offer a way to bring down skyrocketing costs and streamline the process. But trial investigators cite data privacy and security as major concerns: sensitive data stored in the cloud can be more vulnerable to cyber break-ins than data on offline servers, especially if staff are not trained to understand the technology’s weaknesses.
For all the buzz, even the staunchest champions understand the limitations of big data within the complex infrastructure of a clinical trial. The idea is that big data will make the process more efficient and its results better. And, as Tatonetti points out, even when trials fail, they produce valuable data ready to be mined for further analysis.
“Each one of these experiments,” he said, “is a window into the human system, creating the most comprehensive and diverse medical data set ever imagined.”
Interested in the Cloud for Clinical Trials? We invite you to sign up for free for ClinCapture, our cloud-based eClinical system, at ClinCapture.com