We Are DIA
Automation in Scientific Writing: Improving Healthcare Products Communication
Joseph Cheng
Daiichi Sankyo, Inc.

Over the past few decades, advances in technology have resulted in progressive automation in every industry. This trend has also touched the scientific writing community. Programmed automation features can be found even in the world’s most popular word processor and a gold standard in the writing industry, Microsoft Word™. It is evident that progress in artificial intelligence and deep learning will result in substantial automation of how scientific documents are created.

Benefits and Limitations of Automation

When used correctly, automation is a helpful tool in a scientific writer’s arsenal that can make the writing process more efficient and more accurate. One common and particularly effective application of the technology is writing by template, in which sections of writing are generated based on content in other sections. In fact, any scientific writing that requires the collection of large amounts of data or involves highly repetitive work benefits greatly from automation.
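As a rough illustration of writing by template, the sketch below fills one routine sentence of a safety narrative from a structured record. It is a minimal example using Python's standard `string.Template`; the field names, the sentence wording, and the sample record are all hypothetical and are not drawn from any particular authoring system.

```python
from string import Template

# Hypothetical template for one routine sentence of a safety narrative.
# The placeholder names are illustrative assumptions, not a standard.
NARRATIVE = Template(
    "Subject $subject_id, a $age-year-old $sex, experienced $event "
    "on Study Day $onset_day. The event was assessed as $severity "
    "and $relatedness to study drug."
)


def render_narrative(record: dict) -> str:
    """Fill the template from one structured record.

    substitute() raises KeyError when a field is missing, so an
    incomplete record fails loudly instead of producing a sentence
    with a gap in it.
    """
    return NARRATIVE.substitute(record)


# Made-up sample record for illustration only.
sample = {
    "subject_id": "1001-004",
    "age": 57,
    "sex": "female",
    "event": "febrile neutropenia",
    "onset_day": 15,
    "severity": "Grade 3",
    "relatedness": "possibly related",
}

print(render_narrative(sample))
```

The same template can be applied to every record in a data set, which is where the efficiency gain comes from when the number of patients is high.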

Examples of the proper implementation of automation in drug development and regulatory affairs include:

  • Study protocols may be automatically populated with text from the investigator’s brochure and protocol concept documents.
  • New Drug Applications (NDAs) require hyperlinks throughout the document, which automation software can easily create.
  • Clinical Study Report (CSR) safety narratives involve the time-consuming process of writing routine text and data. Automation may be especially useful when the quantity of data or patients is high, as is commonly seen in oncology. A hybrid approach where a medical writer leads discussion and assessment of computer-populated data can be both efficient and medically useful.
  • Structured authoring allows writers to focus on the content of their writing by creating a set of rules to standardize the organization of documents.
  • Translation and localization, terms referring to writing in another language, benefit greatly from machine or advanced deep learning methods. However, the output requires review by a native speaker to fully capture linguistic and cultural nuances. This is especially true for lay summaries, such as those required by the European Medicines Agency.
  • Clinical and scientific perspectives can technically be generated by modern automation systems but must be assessed by an expert to ensure correct scientific judgment is applied.
  • Automated review tools are a good solution for the tedious process of sending out sequential versions for review and allow for simultaneous commenting and tracking by different team members.

Despite the usefulness of automation, clear limits exist in how the technology can be applied. For example, machine-written texts frequently feel repetitive or unnatural to a reader and may be interpreted as obviously computer-generated unless first assessed properly by a scientific writer. Further, artificial intelligence is currently unable to make consistently correct judgments on scientific topics that require critical thinking. A hybrid approach where a qualified professional reviews the work generated by automated systems may cover these and other shortcomings of modern automation.
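One way to picture the hybrid approach is a simple pre-review check that tells the human reviewer where to look first. The sketch below is only an illustration: the required fields, the `serious` flag, and the wording of the reasons are assumptions made for this example. Every draft would still be read by a qualified writer, and the check only prioritizes their attention.

```python
# Hypothetical fields that a computer-populated narrative record must contain.
REQUIRED_FIELDS = ("subject_id", "event", "onset_day", "severity")


def review_reasons(record: dict) -> list:
    """Return the reasons a drafted narrative needs close expert attention.

    An empty list means no automatic flag was raised; it does not mean
    the draft can skip human review.
    """
    reasons = []
    for field in REQUIRED_FIELDS:
        # Treat absent, None, and empty-string values as missing data.
        if record.get(field) in (None, ""):
            reasons.append(f"missing {field}")
    # Serious events call for medical judgment that software cannot supply.
    if record.get("serious", False):
        reasons.append("serious event: causality needs medical judgment")
    return reasons


# Made-up records for illustration only.
routine = {"subject_id": "1001-004", "event": "nausea",
           "onset_day": 3, "severity": "Grade 1"}
flagged = {"subject_id": "1001-007", "event": "sepsis",
           "onset_day": 21, "serious": True}

print(review_reasons(routine))
print(review_reasons(flagged))
```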

Industry Adoption of Automation

Uptake of automation practices in the scientific writing industry has been slow and spotty. Even Microsoft Word™, for example, has features that are underutilized despite industry’s familiarity with the program. In the past, it was not uncommon for writers to only reluctantly embrace automation due to fears that it would replace their jobs. But over time writers have accepted that new technologies can help increase their capacity to focus on critically important elements of a document by offloading less essential, more “robotic” work to computer software.

Other reasons for the slow adoption of these tools include:

  • Document technology experts are often not writers, and many writers are not sophisticated technology experts.
  • Regulatory agencies seem hesitant to accept the full automation of certain documents, perhaps due to concerns about the trustworthiness of unfamiliar methodologies; in turn, sponsor companies may be unwilling to take on unnecessary compliance risks.
  • New technologies, such as standardized authoring when it was initially pioneered, do not always support industry-standard software such as the Microsoft Suite™.
  • It is often difficult to implement new automation technologies in larger and older companies when old systems may need to be scrapped and replaced.

Emerging technologies have not and will not decrease demand for good human scientific writers. However, these technologies are changing their desired skillset, which is evolving to focus more on their ability to intellectually understand data and create a cohesive scientific narrative. Professional organizations such as the American Medical Writers Association (AMWA) and DIA often publish pieces on automation which may be helpful resources for professionals hoping to learn more.

As companies explore implementation of these tools in practice, understanding what types of data sets can benefit from automation will smooth transitions. Phase 1 studies, for instance, regularly lend themselves well to automation due to the simpler, straightforward nature of their data. Extraordinarily large data sets, such as those found in vaccine trials and phase 3/4 trials, are also well-suited. On the other hand, therapeutic areas such as inflammatory and rheumatologic diseases, which can have nonspecific symptoms and preference-based treatment paradigms, may not be appropriate for automation. Trials that study small patient populations, as seen in personalized medicine studies, are also ill-suited.

In addition, groups such as the Clinical Data Interchange Standards Consortium (CDISC) have attempted to ease the way forward by creating regulatory body-approved standards for how data are organized and presented. Similar initiatives include the ICH Medical Dictionary for Regulatory Activities (MedDRA); LOINC, the internationally recognized standard database of identifiers for lab tests, vital signs, and clinical documents; and the US National Cancer Institute’s NCI Enterprise Vocabulary Services (EVS).

The Future of Automation

A multitude of companies are exploring novel uses of software in scientific writing. Nonprofit organizations such as TransCelerate have sponsored initiatives to discover how automation can reduce redundancy in research by improving access to studies that are already complete. Real-world data generation, where creative solutions like phone apps are being used to increase the patient-centricity of studies, is another area in which industry is investing. However, the need for transparency about how all these technologies work will only grow if regulatory agencies are to routinely accept the output of these methods.

We can anticipate that future advances in automation will eventually be able to tackle some of the problems that face the entire scientific community. Generally, technological advances lead to a greater degree of transparency around complex information, and scientific data are no exception. For example, consider the possibility of applying deep learning to translate complex research into plain language that the public can clearly understand, helping to combat misinformation and misunderstandings. Automation in writing is making the process of sharing information much faster and more efficient and is leading to better decisions that may aid in curing, preventing, and avoiding disease.

The author thanks Helle Gawrylewski and Nimita Limaye, DIA Global Medical Writing Community, for their contributions to this article.