The pharmaceutical industry has been experiencing a paradigm shift in recent years with the increasing importance of data-centric submissions and new guidance around AI and digital transformation. This, in turn, has increased the importance of data standards, which hold the promise of improving data quality, interoperability, and regulatory compliance. However, given the pressure to adopt new standards while keeping pace with rapid evolution in technology, this transition has introduced “Data Standards Fatigue,” the overwhelming feeling experienced by organizations and individuals when faced with the continuous introduction and evolution of data standards from standards organizations and regulatory bodies.
Companies find themselves allocating substantial resources to implement, manage, and govern these standards, often at the expense of core research and development activities. The constant need for adoption and fear of noncompliance penalties, combined with shifting timelines, create a sense of fatigue and frustration among industry professionals. Rapid evolution of technology and data sources exacerbates the challenge, as companies must continuously adapt to new data types and sources.
There is an urgent need to catalogue the standards relevant to the pharmaceutical industry, identify core standards and any overlaps, consider their impact on operational efficiency, innovation, finance, and compliance, and propose potential strategies to simplify and mitigate the impact on the industry.
Since the very beginnings of the ICH Common Technical Document (CTD) and the Electronic Common Technical Document (eCTD) for submissions, several standards have emerged to simplify interoperability and data sharing within and across organizational boundaries. While data standards address the needs of multiple stakeholders and provide opportunities for interoperability and integration, each poses its own challenges for organizations in understanding and implementing processes and systems.
Interoperability Is Challenged by Overlaps
Industry faces significant challenges in ensuring interoperability between data sets and systems due to the presence of multiple, often conflicting, data standards. Standards such as ISO IDMP and FDA PQ/CMC overlap. Figure 1 below shows which areas of the drug product and drug substance domains are in scope for PQ/CMC and which parts overlap with the ISO 11238 and ISO 11615 standards. As published on the FDA website,1 this is a high-level mapping/alignment and does not represent data element details. Subdomain labels in the Venn diagram are PQ/CMC-centric, illustrating the PQ/CMC categories/subdomains published in the two FDA Federal Register Notices (FRNs) in 2017 and 2022. The focus is to show which subdomains of PQ/CMC Phase 1 have overlapping semantics with ISO 11238 and 11615; it is not intended to show all the areas that are in IDMP scope. Additionally, regulatory requirements vary by region: the US FDA, EMA (Europe), and other global regulators demand different data formats, terminologies, and reporting standards.

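To make this kind of overlap concrete, the short sketch below shows one way a regulatory team might inventory which data subdomains are claimed by more than one standard. The subdomain names and their assignments are invented for illustration and do not reproduce the FDA mapping in Figure 1.

```python
# Illustrative only: hypothetical subdomains mapped to the standards that claim them.
# These assignments do not reproduce the FDA's published PQ/CMC-to-IDMP mapping.
subdomain_to_standards = {
    "substance_identification": {"PQ/CMC", "ISO 11238"},
    "product_composition": {"PQ/CMC", "ISO 11615"},
    "batch_formula": {"PQ/CMC"},
    "packaging_description": {"ISO 11615"},
}

# Subdomains claimed by more than one standard are the likeliest sources of duplicate work.
overlaps = {name: stds for name, stds in subdomain_to_standards.items() if len(stds) > 1}
print(overlaps)
```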
These challenges underscore the need for continuous investments in technology, expertise, and integration solutions to enhance interoperability and streamline regulatory processes. One promising initiative addressing these issues is the DIA RIM Reference Model, which aims to standardize data nomenclature for Regulatory Information Management. This model is particularly useful in complex scenarios like mergers, acquisitions, and data migrations; several in-process case studies seem to suggest that this will help to improve data management and interoperability across the industry.
Regulatory Requirements and Data Standards
New data standards are introduced regularly to enhance the accuracy, consistency, and interoperability of data exchanged between pharmaceutical companies, regulators, and healthcare providers. This evolution reflects the increasing need for more streamlined data management practices across areas such as clinical trials, manufacturing, regulatory, and post-market surveillance.
There have been global efforts, starting with WHO and ICH, to implement internationally harmonized specifications and standards for drug development activities and medicinal products. For example, ISO has developed a suite of five standards for uniquely identifying medicinal products (IDMP), with consistent documentation and terminologies that facilitate the exchange of product information among global regulators, manufacturers, suppliers, and distributors. While the global adoption of IDMP offers several high-impact benefits, its implementation comes with significant challenges. The Global IDMP Working Group (GIDWG) has been working to address these challenges, alongside independent groups and alliances working to implement the IDMP Ontology across groups of pharmaceutical companies.
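As a rough illustration of what “uniquely identifying medicinal products” involves in practice, the sketch below models a product record carrying the kinds of identifiers the five IDMP standards cover (medicinal product, pharmaceutical product, substance, dose form and route, and units of measurement). The field names and values are simplified assumptions, not the normative ISO data elements.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MedicinalProductRecord:
    """Illustrative IDMP-style record; field names are simplified, not the normative ISO elements."""
    mpid: str                 # Medicinal Product Identifier (ISO 11615)
    phpid: str                # Pharmaceutical Product Identifier (ISO 11616)
    substance_ids: List[str]  # Substance identifiers (ISO 11238)
    dose_form: str            # Dose form / route of administration terms (ISO 11239)
    strength_unit: str        # Units of measurement (ISO 11240)

record = MedicinalProductRecord(
    mpid="MPID-EXAMPLE-0001",
    phpid="PHPID-EXAMPLE-0001",
    substance_ids=["SUB-EXAMPLE-0001"],
    dose_form="tablet",
    strength_unit="mg",
)
print(record)
```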
Another notable development in this evolving regulatory landscape is FDA's PQ/CMC initiative, launched in 2017, which aims to identify and prioritize pharmaceutical quality (PQ) and chemistry, manufacturing, and controls (CMC) information that would benefit from a structured submission format. This structured and standardized data is intended for submission within Module 3 of the Common Technical Document (CTD) as defined by the ICH M4 guidelines. The initiative does not initially aim to structure all product quality data within the eCTD but rather focuses on specific priority elements that are well suited to structuring and that can enhance the quality review process.
PQ/CMC introduces a Body of Knowledge (BOK) that aims to create a structured foundation for CMC regulatory data, requiring extensive discussion and alignment at the International Council for Harmonisation (ICH) level. This initiative is particularly relevant for structured product quality submissions, which are essential for ensuring consistency and accuracy in regulatory processes across different markets. As the industry increasingly relies on structured submissions, the PQ/CMC BOK will play a key role in harmonizing global standards.
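For a sense of what “structured” means in this context, the snippet below sketches a single drug product specification test (of the kind reported in CTD section 3.2.P.5.1) as machine-readable fields rather than narrative text. The element names are hypothetical simplifications, not the actual PQ/CMC data elements or controlled terminology.

```python
# Hypothetical, simplified representation of one specification test as structured data.
# Real PQ/CMC elements and controlled terminology differ; this only illustrates the
# shift from narrative text to machine-readable fields.
assay_test = {
    "test_name": "Assay",
    "analytical_procedure": "HPLC",
    "acceptance_criterion": {"min": 95.0, "max": 105.0, "unit": "% of label claim"},
    "ctd_section": "3.2.P.5.1",  # drug product specification section in CTD Module 3
}

def within_spec(result: float, criterion: dict) -> bool:
    """Check a reported result against a structured acceptance criterion."""
    return criterion["min"] <= result <= criterion["max"]

print(within_spec(98.7, assay_test["acceptance_criterion"]))  # True
```

Once quality data is captured this way, checks like the one above can be automated consistently by both sponsors and reviewers, which is the benefit the structured submission format is meant to unlock.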
Beyond IDMP and PQ/CMC, other areas like supply chain, digital therapeutics, real-world evidence (RWE), clinical operations, biomarkers, and many others will benefit from similar globalization efforts.
Adhering to these diverse regulatory requirements can be a daunting task, as companies must ensure that their data is collected, coded in the right format, and reported in accordance with each health authority’s requirements. This has led to the development of numerous systems and solutions that are not necessarily interconnected. While data standards have evolved to address specific functional needs, there are also overlaps that are challenging to understand and implement effectively. In addition, a lack of data standards expertise continues to be a major obstacle, underscoring the need for skilled professionals to navigate these complex data requirements and drive global collaboration.
Addressing Data Skills, Organizational Structure, and Governance
As organizations increasingly recognize the value of enterprise data in driving business outcomes, it becomes critical for them to develop a data-driven culture. While technology serves as an essential enabler, it alone cannot address the complexity of modern data challenges. Upskilling staff with competencies in data management and evolving technologies such as artificial intelligence (AI) is fundamental to fully leveraging enterprise-wide data for strategic decision-making. Building a data-literate workforce is a critical component of any data strategy. It ensures that employees across all levels, from entry-level staff to senior leadership, have the necessary skills to access, interpret, manage, and utilize data effectively. This goes beyond technical staff, extending to nontechnical roles: subject matter experts, clinicians, scientists, regulatory professionals, and others who need to interpret data insights and leverage AI tools in their everyday tasks.
Moving beyond data skills, the functional structure of an organization can significantly impact how data is managed, governed, and utilized across departments. For example, a centralized data team may possess deep expertise but can become a bottleneck if the rest of the organization is dependent on them for the entire data management, uptake, and insights process. Conversely, a decentralized structure can lead to data silos and inconsistent practices across teams. To strike the right balance, embracing principles from two modern data strategies, data mesh and data fabric, would be a step forward.
Data mesh enables a decentralized approach to data ownership, allowing for greater agility, scalability in data processing, cross-functional collaboration (through data product APIs), and innovation (by giving domain teams autonomy over their data products).
Data fabric advocates for a unified, integrated data layer to provide a single source of truth for data. This ensures data lineage and transparency, data quality, consistency, and security while allowing different teams to access and manage data easily.
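A minimal sketch, assuming hypothetical in-house names, of how the two ideas can work together: each domain team exposes its data as a product behind a shared contract (the mesh idea), while a thin catalog gives the rest of the organization one consistent place to discover and read those products (a fabric-like access layer).

```python
from typing import Dict, List, Protocol


class DataProduct(Protocol):
    """Contract each domain-owned data product exposes (the data mesh idea)."""
    name: str
    owner: str

    def schema(self) -> Dict[str, str]: ...
    def read(self) -> List[dict]: ...


class RegulatorySubmissionsProduct:
    """Hypothetical data product owned by the regulatory domain team."""
    name = "regulatory_submissions"
    owner = "regulatory-affairs"

    def schema(self) -> Dict[str, str]:
        return {"submission_id": "str", "region": "str", "status": "str"}

    def read(self) -> List[dict]:
        return [{"submission_id": "S-001", "region": "US", "status": "submitted"}]


class DataCatalog:
    """Thin unified access layer (a fabric-like idea): one place to discover and read products."""
    def __init__(self) -> None:
        self._products: Dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        self._products[product.name] = product

    def read(self, name: str) -> List[dict]:
        return self._products[name].read()


catalog = DataCatalog()
catalog.register(RegulatorySubmissionsProduct())
print(catalog.read("regulatory_submissions"))
```

The point of the sketch is the separation of concerns: domain teams remain accountable for their own data products, while lineage, quality, and access rules can be enforced once, in the catalog layer.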
Data Governance is not just about restricting access or maintaining security but also about enabling the appropriate use of data to unlock value for the organization. The governance framework should be agile, allowing the organization to innovate with data while ensuring that ethical and legal boundaries are respected. Governance policies must evolve alongside technological advancements in AI and big data to address issues related to data ownership, quality, compliance, privacy, access, and accountability.
By investing in data skills development, implementing an organizational structure that supports data initiatives, and ensuring sufficient governance, companies can transform data from a raw resource into a strategic asset that drives innovation, enhances operational efficiency, and delivers competitive advantage.
Ongoing Strategic Initiatives
To mitigate the impact of data standards overload, industry and regulators are exploring various strategies:
- International harmonization of data standards: The goal of global organizations like ICH is to standardize the way clinical trial data, pharmacovigilance reports, and other critical data are structured, reported, and submitted across multiple jurisdictions. The use of HL7 FHIR (Fast Healthcare Interoperability Resources) in health data exchange allows for better integration and use of diverse data types while ensuring compliance with regulatory expectations (a minimal FHIR exchange sketch follows this list). These adaptable models also reduce the burden of constant rework when new data types or formats are introduced. There is potential for organizations such as DIA to facilitate harmonization, for example by streamlining IDMP and PQ/CMC standards efforts under ICH or ISO. The recent release of Achieving Excellence in Regulatory Information Management provides an overview of the DIA RIM Reference Model as a basis for defining and managing RIM data.
- Automation and AI: Using AI-driven solutions to automate data standardization and validation reduces manual effort and the risk of errors. While AI can benefit from common data standards and data quality/governance mechanisms, we believe AI can also help bootstrap the data standardization process by looking across the enterprise and arriving at data models that are reusable, masterable, and extensible. Several pharmaceutical and biotechnology companies are combining Large Language Models (LLMs), Knowledge Graphs (KGs), and traditional AI techniques such as Natural Language Processing (NLP) to create sophisticated AI-driven solutions that augment interpretation and decision-making across large volumes of data. LLMs are powerful for understanding unstructured data, such as text, documents, and reports, while Knowledge Graphs excel at structuring and connecting this data in meaningful ways. This combination allows organizations to create systems that understand context, relationships, and hierarchies between disparate data sources, significantly enhancing interoperability (see the sketch after this list).
- Collaborative Platforms: In addition to harmonization, there is a growing emphasis on fostering collaborative frameworks between regulators, industry, and other stakeholders to address the data standards overload. For example, initiatives like the TransCelerate BioPharma consortium bring together multiple pharmaceutical companies to collaborate on creating common data standards and operational efficiencies in clinical trials. FDA Digital Transformation initiatives such as precisionFDA encourage the use of digital tools and data interoperability platforms to enhance how data is collected, shared, and analyzed across different regulatory systems. These efforts are aimed at making it easier for companies to meet compliance requirements while leveraging advanced technologies like AI and blockchain to ensure data integrity and security. Another example is the EU’s use of regulatory sandboxes and pilot programs with industry to test and validate new data standards in a controlled environment before widespread implementation.
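The FHIR exchange sketch referenced in the harmonization bullet above: a deliberately simplified MedicinalProductDefinition-style payload posted to a placeholder server. The endpoint and most of the content are illustrative assumptions; real FHIR profiles require many more elements and terminology bindings.

```python
import json
import urllib.request

# Deliberately simplified payload; real FHIR profiles require many more elements and bindings.
product = {
    "resourceType": "MedicinalProductDefinition",
    "identifier": [{"system": "http://example.org/mpid", "value": "MPID-EXAMPLE-0001"}],
    "name": [{"productName": "Examplecillin 250 mg tablet"}],
}

# Placeholder endpoint; a real exchange would target an authorized FHIR server.
request = urllib.request.Request(
    "https://fhir.example.org/MedicinalProductDefinition",
    data=json.dumps(product).encode("utf-8"),
    headers={"Content-Type": "application/fhir+json"},
    method="POST",
)
# urllib.request.urlopen(request)  # left commented out: the endpoint above does not exist
```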
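And the sketch referenced in the automation bullet: an LLM or NLP step (stubbed here as a placeholder function) extracts entities and relationships from unstructured text, and a graph links them so questions can be asked across sources. The extraction function, entity names, and relation labels are all hypothetical.

```python
import networkx as nx

def extract_entities(text: str) -> list:
    """Placeholder for an LLM/NLP extraction step; returns (subject, relation, object) triples."""
    # A real implementation would call a language model or NLP pipeline here.
    return [
        ("Examplecillin", "contains_substance", "SUB-EXAMPLE-0001"),
        ("Examplecillin", "described_in", "CSR-EXAMPLE-001"),
    ]

graph = nx.DiGraph()
for subject, relation, obj in extract_entities("…unstructured regulatory document text…"):
    graph.add_edge(subject, obj, relation=relation)

# Query the connected knowledge: which product-substance links were asserted?
for source, target, data in graph.edges(data=True):
    if data["relation"] == "contains_substance":
        print(source, "->", target)
```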
Other initiatives include EMA’s Innovation Task Force (ITF), which provides a forum for early dialogue on innovative data standards and approaches, allowing for a more flexible regulatory process while maintaining patient safety and data integrity. Multiple initiatives are also underway to support cloud-based collaboration between regulators and industry stakeholders to enhance the validation and review of regulatory submissions. These platforms provide secure environments for real-time, interactive collaboration between sponsors and regulators; EMA’s Digital Application Dataset Integration (DADI) initiative, for example, supports regulatory and scientific assessments with improved efficiency, transparency, and data privacy. Several pilots are underway or in planning to enable industry to transition toward a new way of working.
For example, FDA is enabling a system for collaborative validation of regulatory submissions. This initiative allows both sponsors and health authority reviewers to use a secure cloud environment for eCTD validation and review. The platform is designed to facilitate interactive, real-time communication between sponsors and regulators through private and shared workspaces that keep data securely separated and allow relevant data to be shared only when necessary, maintaining strict data privacy and control. This structure enables more efficient, transparent, and secure sponsor-regulator interactions throughout the regulatory review process.
Other collaborative initiatives include pre-competitive collaboration among leading pharmaceutical companies working together to overcome standardization implementation challenges. As an augmentation of ISO IDMP, the IDMP Ontology provides a digital implementation standard that significantly simplifies the integration and comparability of data from companies and authorities across all core product information, such as packaging, ingredients, and substances.
These initiatives in the EU and US enable pharmaceutical companies to experiment with emerging data standards, technologies, and submission processes under the guidance of regulatory authorities, without facing the full burden of compliance.
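To illustrate what an ontology-based representation such as the IDMP Ontology adds over flat records, the snippet below expresses a few product facts as graph triples with rdflib. The namespace, class names, and properties are invented for this example and are not the published IDMP Ontology terms.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/idmp-demo#")  # invented namespace, not the real ontology IRI

g = Graph()
g.add((EX.ProductA, RDF.type, EX.MedicinalProduct))
g.add((EX.ProductA, EX.hasIngredient, EX.SubstanceX))
g.add((EX.ProductA, EX.hasPackaging, EX.Blister30))
g.add((EX.SubstanceX, EX.preferredName, Literal("Substance X")))

# Because the data is a graph, the same query works no matter which company supplied the triples.
for product in g.subjects(RDF.type, EX.MedicinalProduct):
    for ingredient in g.objects(product, EX.hasIngredient):
        print(product, "has ingredient", ingredient)
```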
Call to Action: Invest in Skills for Success
To ensure continued success in this data-driven era, pharmaceutical companies must invest in developing data skills, optimizing organizational structures, and enforcing governance frameworks. By addressing the challenges of data standards overload and leveraging emerging technologies, the industry can unlock the full potential of data to drive innovation, enhance regulatory compliance, and ultimately improve patient outcomes.
Managing this evolving complex landscape will require strategic approaches, including the harmonization of international data standards, leveraging AI and automation for standardization processes, and fostering greater collaboration between regulators and industry stakeholders. Initiatives like the FDA’s Digital Transformation efforts, EMA’s Innovation Task Force, and collaborative validation platforms for eCTD submissions demonstrate how technology can streamline regulatory processes, enhance transparency, and improve communication between sponsors and regulatory authorities.
Conclusion
The pharmaceutical industry will continue to undergo a significant transformation, driven by the increasing emphasis on data-centric submissions, digitalization, and regulatory changes. While the adoption of data standards is critical for improving data quality, interoperability, and regulatory compliance, it will also present new challenges. The overwhelming number of evolving standards, along with overlapping guidelines, has given rise to Data Standards Fatigue, where organizations are burdened by large-scale investments, frequent changes in priority, competition for scarce resources, and constant updates, combined with the fear of noncompliance.
Ultimately, by addressing these challenges with innovative solutions and strategic collaborations, the industry can ensure a more streamlined, transparent, and efficient regulatory ecosystem. This transformation is not only vital for operational success but also for accelerating the delivery of life-changing therapies to patients worldwide, fostering better health outcomes, and reinforcing trust in the global regulatory landscape.