DIA Japan CDM Community Workshop
A2 Healthcare Corporation
Eli Lilly Japan K.K.
GlaxoSmithKline K.K.
To accommodate increasingly diversified methods of clinical data collection, clinical data management (CDM) must pursue and realize “Shinka”: a set of three Japanese homophones which mean “evolution, cultivation, and true value.” CDM is no longer a narrow, isolated function conducted by highly specialized experts or through locally unique activities. This expanding scope was reflected by the affiliations of the leaders and participants in the 27th DIA CDM Workshop in Japan, which encompassed pharmaceutical companies, contract research organizations (CROs), academic institutions, IT vendors, and the global CDM community of DIA.
Use of AI in CDM
Experiences with AI in the workplace varied greatly among participants. Still, there was consensus that:
- AI suits some tasks, while robotic process automation (RPA) may be more appropriate for others. A proper understanding of what AI can and cannot do effectively will be critical to the future shape of CDM.
- When AI is utilized, 100% validation of the process in advance is not possible. With conventional programming, which calculates exactly as it was pre-planned or programmed, the actual output can be collated against the “right” answer. The same collation is not possible with AI processes, which mold themselves as they function, based on acquired training/teaching and deep learning.
- The CDM community has conventionally requested 100% data validation. But the introduction of AI is now urging a change in this “sacred rule” and mindset.
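The collation described for conventional programming can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not something presented at the workshop: the edit check `sbp_out_of_range`, its range limits, and the table of expected answers are all hypothetical.

```python
# Hypothetical edit check: flag a systolic blood pressure value that is
# missing or outside a pre-planned range (limits are illustrative only).
def sbp_out_of_range(value_mmhg, low=60, high=250):
    """Return True when the value should raise a data query."""
    return value_mmhg is None or not (low <= value_mmhg <= high)


# Pre-planned "right" answers, written down before the check is run.
EXPECTED = [
    (120, False),
    (59, True),
    (60, False),
    (250, False),
    (251, True),
    (None, True),
]


def collate(check, expected):
    """Return every case where the actual output differs from the expected answer."""
    return [
        (value, want, check(value))
        for value, want in expected
        if check(value) != want
    ]


mismatches = collate(sbp_out_of_range, EXPECTED)
print(mismatches)  # an empty list means every case matched its expected answer
```

Because the rule is fixed in advance, a table like `EXPECTED` can cover its boundaries case by case. A process that changes with its training data offers no such fixed rule to enumerate, which is the difficulty the consensus points describe.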
The workshop presented a prototypical example of applying generative AI to healthcare: An AI program (more precisely, a large language model, or LLM) that had not been specifically trained with health-related information was fed certain disease information and instructed to suggest appropriate treatment options. Only a few cases produced erroneous results that would have been serious enough to warrant human intervention. This result suggests that if even an “unprepared” AI program can produce fairly accurate results, an AI program could be used after “preparation” for data cleaning (see below), an important element of CDM. It also suggests that as LLMs improve, generative AI based on them could perform even wider and more powerful functions.
Decentralized Clinical Trials (DCTs) and CDM
Keynote Speaker Professor Kenichi Nakamura (National Cancer Center Hospital), a leading voice for DCT implementation in Japan, emphasized the importance of CDM’s role in the digitization of clinical trials, which is evolving in the larger context of the digitization of medicine. Relevant CDM factors include standardization of the data structure and assurance of data interoperability while preserving data security and privacy; improving these factors enhances the performance of the clinical trial ecosystem. Clinical data managers with capabilities in both the clinical/scientific and technical aspects of CDM are best suited to lead such progress. By participating more intentionally in the early stages of designing DCTs, clinical data managers can potentially enhance data quality and integrity and enable smooth data processing from direct data capture to preparing trial reports.
Common Data Standards and CDM
Data from clinical studies are currently reported in accordance with multiple data standards. It is an important responsibility of CDM to understand the role of each standard and to utilize them appropriately. During discussion of a globally standardized coding dictionary, it was reported that reciprocal links between major coding dictionaries, such as MedDRA and SNOMED CT, had already been developed and were being maintained by Maintenance and Support Services Organizations (MSSOs).
Diversifying Post-Marketing Surveillance and CDM
As in pre-market clinical trials, various methods of data collection and utilization are being employed in post-marketing surveillance (PMS), and utilization of real-world data (RWD) is increasing in this field. What is CDM’s potential role on this emerging RWD front?
Data obtained from patient registries have unique characteristics that distinguish them from clinical trial data. Data from clinical trials conducted according to GCP are monitored by sponsors and their quality is controlled, whereas data from patient registries are usually anonymized, and the chances of tracing them back to their source data are often minimal (because of regulations to protect patients’ privacy). To compensate for this difficulty, registry data must be “cleaned” (by fixing or removing data that are known to be incomplete, inaccurate, or outside the scope of the research question) according to well-planned and well-documented processes. In addition, Japan’s regulatory authority generally does not specify the level of validation required for tools and systems that use patient registry data in post-market studies, leaving users to define this validation independently. This need for unique system/tool validation and process standardization means that managing RWD calls for different capabilities than managing clinical trial data. Recognizing this is a first step toward creating CDM practices tailored for RWD instead of applying CDM only to trials conducted according to GCP.
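As an illustration of cleaning “according to well-planned and well-documented processes,” the Python sketch below applies pre-defined rules to registry records and logs every exclusion with the rule that caused it. It is a minimal sketch under stated assumptions: the record fields, the rule identifiers, the age limits, and the study condition `T2DM` are all hypothetical, and the sketch only removes records rather than fixing them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    applies: Callable[[dict], bool]  # True when the record should be excluded


# Hypothetical rules, fixed in the cleaning plan before any data are processed.
RULES = [
    Rule("INCOMPLETE", "required field is missing",
         lambda rec: any(v is None for v in rec.values())),
    Rule("INACCURATE", "age outside plausible range 0-120",
         lambda rec: rec["age"] is not None and not (0 <= rec["age"] <= 120)),
    Rule("OUT_OF_SCOPE", "diagnosis is not the study condition",
         lambda rec: rec["diagnosis"] != "T2DM"),
]


def clean(records, rules):
    """Split records into those kept and an exclusion log naming the rule applied."""
    kept, log = [], []
    for rec in records:
        # The first matching rule, in plan order, decides the exclusion reason.
        hit = next((r for r in rules if r.applies(rec)), None)
        if hit is None:
            kept.append(rec)
        else:
            log.append({"record_id": rec["id"],
                        "rule_id": hit.rule_id,
                        "reason": hit.description})
    return kept, log


# Hypothetical anonymized registry records.
records = [
    {"id": "R1", "age": 54, "diagnosis": "T2DM", "diagnosis_year": 2019},
    {"id": "R2", "age": 61, "diagnosis": "T2DM", "diagnosis_year": None},
    {"id": "R3", "age": 140, "diagnosis": "T2DM", "diagnosis_year": 2020},
    {"id": "R4", "age": 47, "diagnosis": "asthma", "diagnosis_year": 2018},
]

kept, log = clean(records, RULES)
for entry in log:
    print(entry)
```

Keeping the rules as data and recording each decision in a log is one way to make the process reviewable when the source data cannot be traced back; other designs are equally possible.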
Learn more about this topic at The 28th DIA Japan Annual Workshop for Clinical Data Management, February 2025 in Tokyo.