AI’s Promise for ATMPs
Industry 4.0 applications in biopharma involve the complete spectrum of data science throughout the entire product life cycle of many disparate entity types. Tools such as digitalization, modern data science, and the industrial internet of things (IIoT) exist now, and examples from other industries such as Siri and Alexa, face identification, and self-driving cars can guide their implementation. While these new technologies have been empowering the pharma industry in, for example, the application of ICH Q10 and Q12, we are seeing a transition to these technologies being required for modern implementations including advanced therapy medicinal products (ATMPs). Industry 4.0 approaches promise benefits to the entire product life cycle, from biology and medicine through product research and development, to validation, manufacturing, and post-launch surveillance.
These digital technologies and applications are in the advanced stages of refinement and are even becoming available in commercialized products and consulting services. Regulators have actively endorsed and encouraged their incorporation. The largest current obstacle to achieving the goal of incorporating digital technologies and applications is convincing all industry stakeholders—operators, departments, and corporate-level executives—of the many benefits of modern data-driven technologies. Among the Industry 4.0 technologies, artificial intelligence (AI) is the most insightful application accelerated by the rest of the digitalization enablers, no doubt because AI has the ability to emulate human cognition.
ATMPs can require increased and/or unique requirements in process design. AI can be used to ensure the quality of the data, as well as the privacy of patients when their sensitive information is used to prepare the therapy. Use of such therapies results in an increased emphasis, even unique demands, on tools ensuring patient privacy and data security throughout data collection, generation, implementation, and storage.
Of the many manufacturing digitization enablers, AI-driven applications are the most promising because of their ability to provide considerable insight into the current status and future direction of many systems supporting ATMP process design and manufacturing (Figure 1).
A note about the use of the term ATMPs in this article: Biological medicinal products supporting medicinal therapies based on genes, cell therapy, and engineered tissues are commonly referred to in the US as cellular and gene therapy (C&GT) products; in Europe, they are called ATMPs. We use the term ATMPs in this article to cover both terms.
The complexity associated with gene and cell biotechnology and the variation inherent in transforming biomaterials into personalized drugs make these processes ripe prospects for AI. Some ATMPs, such as cell and gene therapies, have characteristics that present challenges for traditional process design and manufacturing. One characteristic is patient-distal cell processing, which creates challenges in track and trace and in the repeated decontamination of incoming process materials. Other challenges include establishing and maintaining patient-related data standards, security, privacy, curation, storage, and distribution; the extreme difficulty of providing a second attempt at a therapy, which adds pressure not only to discover process deviations and product failures but also to prevent them; new and more sophisticated sensing and analysis to determine patient-specific sample and product characteristics; autologous sample handling that often requires adjusting process parameters to suit unique cell or tissue sample characteristics; the added burden of orchestrating distributed control with any centralized processing of individual autologous cells and tissues; and the fact that, because products and practices are so new and critical process parameters (CPPs) are often not well understood, the production process can evolve even after technology transfer.
In autologous therapies, each patient must be linked to a single batch that has been manufactured and controlled by means of both singularities (specific target) and commonalities (universal specifications). The expected efficacy, quality, and safety of the issued drug under the ATMP framework is even harder to achieve, batch after batch, than in small or even large biological molecules. There is just one reason for this—the uniqueness of each patient causes continuous variability in all the key production factors.
The rise of computational power and cloud-based storage, and the increasing sophistication of AI algorithms, have enabled an evolution of both AI’s capabilities and ethical implementation, taking into consideration the vulnerability associated with the data used to train automatic learning systems. AI combines advanced multivariable analysis, power computing, and algorithmic procedures to calculate insightful results in complex scenarios. As data is the fuel used to achieve successful outputs, it is part of the AI life cycle and must be considered as an asset by both management and experts who participate in the processes of ATMPs.
Industry 4.0 Tools in Pharma Manufacturing
The dream of Industry 4.0 is becoming realized in many sectors. The utility of AI in medicine includes disease identification and clinical diagnosis, therapeutics and personalized treatment, drug discovery and manufacturing, and more (Table 1). Operations computers have been incorporating higher structures of processing and becoming better connected to analytic output, instrumentation, and data lakes. While delayed, through the prescient efforts of thought leaders and such teams as the ISPE Pharma 4.0™ Special Interest Group, application of such smart factory or Industry 4.0 power in pharma is now becoming a reality.
Table 1: AI applications in pharma.
- Processing biomedical and clinical data
- Enabling personalized/rare disease medicine
- Identifying clinical trial candidates
- Predicting treatment results
- Materials and drug supply chain
- Demand forecasting
- Product development
- Regulatory affairs
- Predictive maintenance
- Intelligent process automation
- Tech transfer
- Production/purification/finish
- Continued verification
- Protecting the supply chain
- Pharmacovigilance
- Risk management plans
- Phase 4 monitoring
- Drug adherence and dosage
One example of higher processing promoting Industry 4.0 goals is the real-time application of recurrent neural networks (RNNs). An RNN is a type of neural network that carries an internal state from one step of a sequence to the next, loosely analogous to memory in the human brain, which lets it make predictions from sequential information that many other algorithms handle poorly.
This makes them powerful in the digital conversion of manufacturing operations by enabling such tasks as handwriting and speech recognition online.
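The way a recurrent network carries information along a sequence can be sketched with a single recurrent unit. This is a minimal illustration, not a trained model: the function name `rnn_forward` and the weights are assumptions chosen only to show that the hidden state depends on the order of the inputs.

```python
import math

def rnn_forward(inputs, w_in, w_rec, bias, h0=0.0):
    """Run a one-unit Elman-style recurrent cell over a sequence.

    Each step mixes the current input with the previous hidden state,
    so the final state depends on the order of the inputs.
    """
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h + bias)
        states.append(h)
    return states

# Illustrative weights only (assumed, not trained values).
W_IN, W_REC, BIAS = 0.8, 0.5, 0.0

# The same three readings presented in opposite orders.
rising = rnn_forward([0.1, 0.5, 0.9], W_IN, W_REC, BIAS)
falling = rnn_forward([0.9, 0.5, 0.1], W_IN, W_REC, BIAS)

# The final hidden states differ because the cell remembers earlier steps.
print(rising[-1], falling[-1])
```

A feed-forward model given the same three values as an unordered set could not tell the two sequences apart; the recurrent state is what makes the order visible.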
The Importance of Data
The manual processes in historical data management created data entry issues, incomplete signoff, missing details, and troublesome access. Although AI’s abilities have evolved, the data needed to feed the algorithms often remain locked in archaic media or siloed IT systems throughout the organization. Once accessible, the onerous process of standardizing and normalizing large datasets through manually directed analysis can prevent an organization from moving forward with AI initiatives.
A very significant need in the pharma industry is leadership in incorporating automated cloud-based software and establishing the standards required for rapidly accessible data structures. This would drive the implementation of powerful goals such as continued process verification (CPV), which requires real-time multivariable analysis to identify manufacturing patterns so that batches can be adjusted at any time. ICH guidelines Q8–Q12, which promote innovation and continual improvement and strengthen QA, are now greatly enabled by such existing data science tools.
Data Types
Both product development and manufacturing are becoming increasingly dependent on massive volumes of disparate data. Many ATMPs require the protection of the patient health data involved in preparing such therapies, while also protecting the privacy of patients. Especially in the development phase, laboratory observations and even operating parameter measurements can exist in degrees of inconsistent labeling and structure. Even when working with more mature processes, because of the newness of many therapies and their dependence upon clinical information, important observations may come from disparate sources with inconsistent labeling or structure.
Unstructured data is one of the three main sources of valuable information in product development. The other two sources are structured and hybrid data. Structured data is the easiest to process, as it exists in defined tabular format. Unstructured data is the most difficult to process, because while it may represent accurate measurements or complex pretreatments, it is not organized in a regular, consistent manner. Examples of semi-structured or hybrid data are partially condensed laboratory results from disparate analytics or instrumentation or completed forms from hospitals. The rows, columns, semicolons, periods, and dashes may be clear to a reader, but they are difficult for a computer program to organize. Advancements in AI-enabled intelligent data extraction can reduce the burden on developers by automatically preparing such data for interpretation and analysis.
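To make the gap between semi-structured and structured data concrete, the sketch below turns free-form laboratory notes into uniform records. It is a simple rule-based baseline using regular expressions, not the AI-enabled extraction described above; the note layout, sample IDs, and field names are all hypothetical.

```python
import re

# Hypothetical free-form lab notes; the field names and layout are invented.
RAW_NOTES = [
    "Sample P-001; viability: 92.5 %; cell count = 1.2e6",
    "P-002 - Viability 88%, Cell count: 9.5e5",
    "sample P-003 / cell count 2.0e6 / viability=95.1 %",
]

SAMPLE_ID = re.compile(r"\b(P-\d+)\b")
VIABILITY = re.compile(r"viability\s*[:=]?\s*([\d.]+)\s*%", re.IGNORECASE)
CELL_COUNT = re.compile(r"cell count\s*[:=]?\s*([\d.]+(?:e\d+)?)", re.IGNORECASE)

def extract_record(note):
    """Turn one free-form note into a structured record (None where a field is missing)."""
    def first(pattern, cast):
        match = pattern.search(note)
        return cast(match.group(1)) if match else None

    return {
        "sample_id": first(SAMPLE_ID, str),
        "viability_pct": first(VIABILITY, float),
        "cell_count": first(CELL_COUNT, float),
    }

# Three differently punctuated notes become three records with the same keys.
records = [extract_record(note) for note in RAW_NOTES]
for record in records:
    print(record)
```

Hand-written patterns like these break as soon as a new site formats its notes differently, which is the maintenance burden that learned extraction tools aim to reduce.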
Related issues include data labeling and annotation. In some projects, the extraction and formatting of semi-structured and inconsistently labeled data take up the majority of the project time. Artificial and augmented intelligence is greatly accelerating the solution to such data handling projects by providing significant advances in establishing both data interoperability and relevance through AI-empowered observability analyzers. The analysis and interpretation of the type and amount of data available today demand a collaboration among senior management, engineers, scientists, and computer systems professionals.
Artificial Intelligence
The timely analysis of the massive amount and types of data being generated in product research and process development is not only beyond human capability, it is also beyond the power of classic computerized statistical and mathematical functions. AI-based solutions in medicinal product design and manufacturing have become a disruptive technology in implementing Industry 4.0 innovations.
AI is becoming even more powerful through advances in its component technologies: neural networks (mimicking human decision-making abilities), genetic algorithms (mimicking the biological evolutionary process), and fuzzy logic (mimicking the human ability to draw conclusions with incomplete or imprecise information).
To complement a clustering-based classification of batches, one may apply an unsupervised anomaly detection algorithm, such as isolation forest, and check its concordance with the clustering analysis for all categorizations. Applying two independent and complementary classification algorithms can support the creation of a model capable of detecting good and suboptimal batches based on data from monitored in-process variables. Furthermore, this application may uncover subtle relationships between various process variables that would not be obvious to even subject matter experts. For example, a Bayesian network model can point to relationships between particular variables and suboptimal batches.
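The isolation forest idea can be sketched in plain Python: points that random splits separate from the rest after only a few cuts receive a high anomaly score. This is a compact teaching version under stated assumptions, not a production implementation; the batch data (temperature and pH pairs) are invented, and the concordance check against a clustering result is left out.

```python
import math
import random

def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # harmonic number approximation
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    low = min(p[feature] for p in points)
    high = max(p[feature] for p in points)
    if low == high:
        return {"size": len(points)}
    split = rng.uniform(low, high)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }

def path_length(point, node, depth=0):
    """Number of splits needed to reach the leaf holding this point (plus leaf correction)."""
    if "size" in node:
        return depth + c_factor(node["size"])
    branch = "left" if point[node["feature"]] < node["split"] else "right"
    return path_length(point, node[branch], depth + 1)

def anomaly_scores(data, n_trees=100, sample_size=256, seed=0):
    """Score each point in (0, 1); short average paths give scores closer to 1.

    Assumes at least two data points.
    """
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for point in data:
        mean_path = sum(path_length(point, tree) for tree in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores

# Hypothetical in-process data: (temperature in deg C, pH) for 20 typical batches...
BATCHES = [(37.0 + 0.02 * (i % 5 - 2), 7.0 + 0.01 * (i % 4 - 1.5)) for i in range(20)]
# ...plus one batch that ran hot and acidic.
BATCHES.append((38.5, 6.6))

scores = anomaly_scores(BATCHES)
suspect = scores.index(max(scores))
print("most anomalous batch index:", suspect)
```

Because no labels are supplied, the algorithm only ranks batches by how unusual they are; deciding which score counts as "suboptimal" still needs process knowledge or agreement with a second method such as clustering.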
Machine Learning
Machine learning (ML) is a branch of AI in which models learn automatically from their earlier experience. As the program gains experience, the model learns from that exposure and adjusts to provide better performance. First, a model is trained on a portion of the initial available data set, and then the model is validated on the remaining data. This “trained” model is used to analyze new data.
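The train-then-validate workflow described above can be sketched with the simplest possible model, a straight line fitted by least squares. The data (a feed rate against a titer), the 70/30 split, and the function names are assumptions made for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and observed values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: feed rate (x) versus titer (y), with a small fixed wobble.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + (0.1 if i % 2 else -0.1) for i, x in enumerate(xs)]

# Shuffle once with a fixed seed, then hold out 30% of the points.
indices = list(range(len(xs)))
random.Random(0).shuffle(indices)
cut = len(indices) * 7 // 10
train_idx, valid_idx = indices[:cut], indices[cut:]

# Train on one portion of the data...
model = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
# ...and validate on data the model has never seen.
validation_mae = mean_abs_error(
    model, [xs[i] for i in valid_idx], [ys[i] for i in valid_idx]
)
print("slope, intercept:", model, "validation MAE:", validation_mae)
```

The validation error is the number to watch: a model that fits the training points well but does poorly on the held-out points has memorized rather than learned.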
A machine learning model is an in-silico representation of the patterns and relationships within a set of data. Training a machine learning model involves discovery of governing hierarchies, relationships, and structure within that data. The relationship of the data is then formalized into rules that guide the response to data describing new situations.
Three types of machine learning can be used to improve operations: unsupervised, supervised, and reinforcement. In unsupervised or descriptive learning, the program identifies patterns and categories in the data without feedback from this learning. Because higher or formal labeling of the data is not required, it is generally applied to find unknown clusters, associations, or patterns through similarities and differences between data points. In supervised learning, the machine learning algorithm produces an inferred (mapping) function from labeled sets of input/output pairs in provided training data. It is called supervised learning because it learns from the training dataset, like a teacher supervising (and correcting) the learning. The majority of machine learning applications use supervised learning. In reinforcement learning, the program’s learning is reinforced from the gain or loss in the output results. This enables the program to learn in an interactive environment by trial and error, using feedback from previous actions. Reinforcement learning has been described as using “rewards and punishment” as signals for positive and negative behavior, where the goal is to build a model that will maximize the total cumulative success of the operation. It is now widely used in drug discovery and development, providing valuable results on the use of existing drugs for diseases other than those for which they were originally intended.
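Of the three types, reinforcement learning is the least intuitive, so a small sketch may help. The example below is an epsilon-greedy "multi-armed bandit", one of the simplest trial-and-error learners; the three candidate actions and their hidden average rewards are invented, and the function name `run_bandit` is an assumption.

```python
import random

def run_bandit(true_rewards, steps=500, epsilon=0.1, noise=0.05, seed=0):
    """Learn which action pays best purely from noisy reward feedback."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_rewards)   # running average reward per action
    counts = [0] * len(true_rewards)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(true_rewards))
        else:
            # Exploit: otherwise use the action that looks best so far.
            action = max(range(len(estimates)), key=estimates.__getitem__)
        reward = true_rewards[action] + rng.gauss(0.0, noise)
        counts[action] += 1
        # Incremental update of the average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical average rewards for three candidate setpoints (unknown to the learner).
HIDDEN_REWARDS = (0.2, 0.9, 0.5)

estimates, counts = run_bandit(HIDDEN_REWARDS)
best_action = estimates.index(max(estimates))
print("learned estimates:", estimates, "preferred action:", best_action)
```

The learner is never told which action is correct; it only sees the reward that follows each choice, which is the "rewards and punishment" signal described above.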
Augmented Reality
Augmented reality improves the delivery of sensory input experience by manipulating images, sounds, or context to either enhance defined aspects, or add entirely new elements. It has immediate use in both the design and operation of manufacturing processes, including guiding operators to the right procedures, thus minimizing the risk of unnecessarily manipulating dangerous or critical assets. It also includes ergonomic and efficient process train design, data management, real-time monitoring, machine-to-human communication, and operator training.
Other AI Enablers
Enhanced Industrial Sensors
The type, number, and economy of industrial sensors are advancing dramatically in their capabilities and practical applications—with many businesses, such as electronics and automobile manufacturing, taking advantage of them. It is anticipated that as the critical quality attributes (CQA) of existing ATMPs are better understood, the growing ability to inexpensively measure them in real time will greatly enable advanced control, prediction, and CPV.
Robotics and Autonomation
Some closed, self-contained devices such as microbioreactor systems are supporting savings in facility space, capital, labor, media, and consumables. They now provide integrated metabolite analysis; variable sparging, head gassing, feeding, and temperature control; and even clone selection software. Increased analytics, sensor, and AI applications are enabling the development of a new category of robotically administered processes and even automated, closed cell processing stations with dynamic controls that maintain tight control of multiple culture parameters and adjust to the disparities of autologous patient samples. Automation of many aspects of the process (including manipulation of samples and in-process work) is being advanced by AI support of cobots and mobile robots.
AI and ATMPS
Pharmaceutical entity design and manufacturing are evolving in many ways, and AI is key in most of them. ATMPs are different from medicines based on chemical or even biological origin. AI is already proving valuable in supporting the manufacturing of classical pharmaceuticals and biopharmaceuticals. It is apparent that the power of AI in ATMP and precision medicine activities (the customization of healthcare to a subgroup or individual) will increase in the near future.
Many ATMP therapies have been proposed to treat heterogeneous disease, which is causing different biological entities, such as genes, proteins, mRNAs, miRNAs, and metabolites, to be examined on a global scale. The human body contains almost 20,000 proteins and protein-coding genes, 30,000 mRNAs, 2,000 miRNAs, and over 100,000 metabolites. Not only is analysis of such large numbers of bioentities becoming possible, but a comprehensive understanding of their interplay is as well. We are seeing the possibility of using in-silico modeling to create profiles for identified subpopulations or even individuals. To get an idea of how much information must be managed in the life sciences, note that digitizing a single human genome requires around 3 GB of data to be stored and processed.
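The 3 GB figure can be checked with a rough calculation. The sketch below assumes a haploid genome of about 3.1 billion base pairs (an approximate, commonly quoted value) and ignores quality scores, read redundancy, and file-format overhead, all of which make raw sequencing output far larger.

```python
# Approximate haploid human genome length in base pairs (assumed round figure).
BASE_PAIRS = 3.1e9

def genome_size_gb(bits_per_base):
    """Storage needed for one genome copy at a given encoding density, in gigabytes."""
    return BASE_PAIRS * bits_per_base / 8 / 1e9

one_byte = genome_size_gb(8)   # plain text, one character (A, C, G, or T) per base
two_bit = genome_size_gb(2)    # packed encoding, four bases per byte

print(f"text encoding:  {one_byte:.2f} GB")
print(f"2-bit encoding: {two_bit:.2f} GB")
```

Stored as plain text, one genome lands at roughly 3 GB, which matches the figure above; a packed two-bit encoding needs about a quarter of that.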
Analysis of any single type of omics data (such as genomics, proteomics, and transcriptomics) provides knowledge of some reactive processes. Because biological processes are so interrelated, understanding any single system in the context of the others is important. For example, regulation of protein expression is influenced by more than upregulated mRNA transcription, as this may or may not enhance its target protein expression. This is because of factors such as the protein’s expression being influenced by associated metabolites and miRNA’s ability to silence or degrade mRNAs. A more holistic approach, elucidating the interconnectivity and interdependence of many omics, requires a far more complete and comprehensive integration of multi-omic data. The necessity of managing complex relationships between such networks of multiple dimensions to understand the total reality appears in this context, as it does in, e.g., the earth’s ecosystem.
Many online databases are being created to accommodate such multi-omics data. Integrated multi-omics analysis on large populations provides a path of information flow from one omics data type to another, thereby providing power for analysis supporting therapeutic development.
Comprehensive integration of such multi-omic systems and non-omic data is challenging because of the size, heterogeneity, and complexity of the relationship between such datasets. AI is uniquely qualified to generate any number of models supporting a systems approach to such analysis because of its capabilities to interpret and generate knowledge about complex systems such as multi-omic frameworks.
ATMP production calls for integrated data acquisition throughout the manufacturing chain. Autologous cell therapies are one example, and some proposed gene therapies will be even more dependent on newer data science technologies. Therapeutic oncolytic viruses and RNA vaccines based on the profile of a specific patient’s tumor are an example of patient information being integral to drug manufacturing. An ATMP manufacturer could use an AI-powered model to intervene directly in the product’s nascent application in a patient’s therapy. We can now see that data science not only supports drug manufacturing but also ensures its success in therapeutic application.
AI is supporting such manufacturing operations in many ways. As described above in the section on augmented reality, machine vision is already being used effectively to give operations technicians synthetic training experiences in new procedures. In supporting process automation and autonomation, machine vision relieves operators of the drudgery of dirty, dull, and dangerous activities currently done manually. An AI-empowered digital twin of any number of processes or equipment can support activities such as determining the performance of an asset based upon past operation, predicting impending process completion or deviation, or performing a process virtually prior to actual operation in, for example, process development or material QC.
The future of autologous cell therapies appears to be closed, automated “factory-in-a-box” type cell processing. Such systems can reduce contamination, inconsistent sample handling, and human error. AI-empowered automated systems will provide increased reproducibility by ensuring more rapid and robust adjustment of CPPs. They can reduce development time because many distinct processes can be studied in the same research space with minimal concerns for cross-contamination. When increased capacity is required, scaling out can be easily accomplished by multiplying the number of units deployed. It takes significant monitoring, data curation, and dynamic control to coordinate the manipulation of the many unique cell phenotypes throughout the process steps.
Biological Systems Approaches
One goal of systems biology is to model complex biological interactions by comprehensively compiling information from interdisciplinary fields. It has been observed that 20th-century-style reductionism provides much understanding and labeling of a system’s parts, but it cannot complete our understanding of the whole system or interpret components or subprocesses that are currently unstructured. Beyond multi-omic analysis, systems (or network) biology is a new, integrative approach to understanding the complex interactions of the molecules of life in systems that express synergistic or emergent behavior.
AI-empowered network biology supports the mapping of the molecular relationships in normal and abnormal phenotypes. It promises to result in a more explicit and deterministic model, providing more predictive, preventive, and personalized medicine. It is impossible to functionally integrate the volume and diversity of multi-omics data by classical methods to produce this more holistic understanding.
Control Systems
Not too long ago, bioreactor process control elements such as tank heaters were automatically PID (proportional–integral–derivative) controlled through output variables such as temperature, while some control variables including glucose or glutamine levels were manually maintained. Progress has been made for years in the development of control strategies, optimization algorithms, and software frameworks for control systems. Current systems often employ supervisory control and data acquisition (SCADA) requiring a human machine interface (HMI).
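A discrete PID loop of the kind used for a tank heater can be sketched as follows. The thermal model, gains, and limits are invented for illustration and are not tuned for any real bioreactor; the anti-windup rule (integrate only while the output is not saturated) is one common choice among several.

```python
class PID:
    """Discrete PID controller with a clamped output and conditional integration."""

    def __init__(self, kp, ki, kd, setpoint, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        # Derivative term is zero on the first call, when no history exists.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        candidate_integral = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate_integral + self.kd * derivative
        output = min(self.out_max, max(self.out_min, raw))
        # Anti-windup: keep the new integral only if the output was not clamped.
        if output == raw:
            self.integral = candidate_integral
        self.prev_error = error
        return output

# Hypothetical first-order tank model (all constants are assumptions).
AMBIENT = 20.0     # deg C
LOSS = 0.05        # fraction of the temperature gap lost per second
HEAT_GAIN = 2.0    # deg C per second at full heater power

def simulate_tank(controller, steps=200, dt=1.0, start=20.0):
    """Run the heater loop and return the temperature after every step."""
    temperature, history = start, []
    for _ in range(steps):
        power = controller.update(temperature, dt)   # heater duty, 0..1
        temperature += dt * (-LOSS * (temperature - AMBIENT) + HEAT_GAIN * power)
        history.append(temperature)
    return history

history = simulate_tank(PID(kp=0.3, ki=0.05, kd=0.05, setpoint=37.0))
print(f"final temperature: {history[-1]:.2f} deg C")
```

The proportional term reacts to the current error, the integral term removes the steady offset caused by heat loss, and the derivative term damps the approach to the setpoint.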
Newer closed-loop bioreactor control for many variables is now being accomplished with adaptive model-based controllers. They provide a significant and flexible benefit in essentially two ways: (a) they provide for optimized constraints of bioreactor operation and for a constraint of the control signal itself to within optimal ranges; and (b) they modify their action to the result of their control activity and to other changes in the systems in real time. Their power is that they fully recalculate the optimal next step in each monitoring cycle of their operation. Much progress is being accomplished in communication standards for applications such as coordination of distributed control.
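The two properties listed above, a constrained control signal and a full recalculation in every cycle, can be sketched with a small receding-horizon controller for glucose feeding. This is a simplified model-predictive example, not an adaptive controller: the internal model is fixed, the glucose dynamics and all constants are invented, and the real uptake is deliberately set slightly higher than the modeled value to show feedback correcting for model error.

```python
from itertools import product

TARGET = 4.0                                  # desired glucose level, g/L (assumed)
FEED_OPTIONS = (0.0, 0.25, 0.5, 0.75, 1.0)    # allowed feed rates: the control constraint
HORIZON = 3                                   # cycles to look ahead
MODEL_UPTAKE = 0.30                           # modeled consumption per cycle, g/L
FEED_GAIN = 0.8                               # g/L added per cycle at full feed

def predict(glucose, feed):
    """Controller's internal model of the next glucose level."""
    return glucose - MODEL_UPTAKE + FEED_GAIN * feed

def plan_next_feed(glucose):
    """Search every feed plan over the horizon and return only its first move."""
    best_cost, best_first = float("inf"), FEED_OPTIONS[0]
    for plan in product(FEED_OPTIONS, repeat=HORIZON):
        level, cost = glucose, 0.0
        for feed in plan:
            level = predict(level, feed)
            cost += (level - TARGET) ** 2
        if cost < best_cost:
            best_cost, best_first = cost, plan[0]
    return best_first

def run(steps=30, start=2.0, true_uptake=0.35):
    """Simulate the loop; the real culture consumes more than the model assumes."""
    glucose, log = start, []
    for _ in range(steps):
        feed = plan_next_feed(glucose)        # recalculated from the new measurement
        glucose = glucose - true_uptake + FEED_GAIN * feed
        log.append((feed, glucose))
    return log

log = run()
print("last feed and glucose level:", log[-1])
```

Because the plan is rebuilt from a fresh measurement each cycle, the controller keeps glucose near the target even though its model is slightly wrong; a truly adaptive version would also update `MODEL_UPTAKE` from the observed error.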
While the math for some of these systems has existed for some time, a number of developments in independent fields have actually enabled them. Hardware speed and algorithm diversity now support both linear systems and nonlinear systems in such iterative activities. More accurate and dynamic AI-driven base models are being developed to exploit the more statistically valuable data being supplied by additional monitoring capability, as well as the increasing cell culture systems knowledge. However, parameter estimation can still be based on either a model-free or model-based algorithm.
It is anticipated that AI applications will be powerful in controlling two sources of variability in ATMP: (a) due to variation in autologous sample condition, dynamic assessment control is required, and (b) due to the lack of parameters currently monitored in many of the small-scale processes, unrevealed variability exists in critical process conditions.
Sample Reception and Tracking
For autologous therapies, the reception, cataloging, and governance of hundreds of unique shipments of patient cells require dedicated operations. The clinical activities of blood sampling and cell preparation can be made more robust and safer through improved GxP-documented processing, cataloging, and transport conditions as well as by ensuring the patient’s information remains secure. ATMP processes must ensure product safety and quality, as well as protect the integrity and confidentiality of patient information. The procedures established for organ transplants involve a central database and transport control, but these are too expensive for ATMP therapies. The procedures used in centralized cytogenetic therapy supporting the track and trace of amniocytes for cytogenetic analysis laboratories showcase many of the functions of a cell therapy facility. They must receive, track, and sterilely process thousands of human cell samples per year in a highly regulated enterprise. AI is being used in such systems to not only alert to existing excursions but also anticipate impending errors and prescribe actions to avoid them.
Culture Expansion
Autologous cell therapies use significantly different suites and equipment from the master and working cell stocks, single-use bioreactor trains, and individual but related styles of process monitoring and control used in allogeneic cell therapies. The individual origin of the patient-specific cells and the very small-scale cultures (<20 L) are the most significant differences, while the scale-up problem presents challenges of a similar nature. Finally, the now-common use of commercially distributed semi-automated culture instrumentation is unique to cell therapy. The individual performance of such cultures in defined processes, and their genotype-determined response to input variables, create a need for the heightened dynamic control afforded by AI.
Analytics
The incoming material QC processes are generally similar to those of protein biologicals. However, in-process monitoring, final product QC, and release testing for a living cell product can be unique, with assays ranging from those used for small molecules to those used for protein biologicals, and can involve such values as cell number, viability, cluster of differentiation (CD) markers, and chimeric antigen receptor (CAR) expression. Quantitation and analysis of viruses and nucleic acids are required for many platforms, and often employ new applications of standard equipment. Furthermore, the number and types of values demanded for process control are growing. AI-driven models can examine such results in real time, providing immediate advice or action on their curation and transmission as, for instance, in discovering new indications for existing medicines.
Cryopreservation and Shipping
The transfer of most cell therapy products requires cryopreservation. For autologous process trains, this involves the highly controlled dispensing and labeling of the samples in cryobags. For allogeneics, facilities for the semi-automated aliquoting of cell-based products on a large scale have been established. It can be difficult to automate the filling procedure for cryobags of different doses or other characteristics from a single batch. AI has the power to comprehensively monitor the cryopreservation process, compare progress to golden batches, detect impending excursions early enough to fix issues, and predict rates and endpoints to ensure success. For example, when trucks transporting vaccines requiring special cold conditions are monitored and analyzed by AI, systems can recommend the best routes to ensure the needed power supply during the transport.
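Comparing a run against a golden batch and projecting its endpoint can be sketched with two small functions. The cooling profiles, the tolerance, and the linear extrapolation are assumptions for illustration; a real monitoring system would use a validated reference profile and a richer predictive model.

```python
import math

def first_excursion(observed, golden, tolerance):
    """Index of the first reading that strays from the golden profile, or None."""
    for index, (obs, ref) in enumerate(zip(observed, golden)):
        if abs(obs - ref) > tolerance:
            return index
    return None

def predict_steps_to_target(observed, target, window=3):
    """Extrapolate the recent cooling rate to estimate steps left to reach target."""
    recent = observed[-window:]
    rate = (recent[-1] - recent[0]) / (len(recent) - 1)   # deg C per step
    if rate >= 0:
        return None                                       # not cooling at all
    remaining = target - observed[-1]
    if remaining >= 0:
        return 0                                          # already at or below target
    return math.ceil(remaining / rate)

# Hypothetical golden profile: steady cooling of 1 deg C per minute from 4 deg C.
GOLDEN = [4.0 - 1.0 * minute for minute in range(10)]
# Hypothetical live run whose cooling is gradually stalling.
OBSERVED = [4.0, 3.0, 2.1, 1.4, 0.9, 0.5]

excursion_at = first_excursion(OBSERVED, GOLDEN, tolerance=1.0)
steps_left = predict_steps_to_target(OBSERVED, target=-5.0)
print("first excursion at minute:", excursion_at)
print("projected minutes to reach -5 deg C:", steps_left)
```

In this example the run leaves the tolerance band at minute 5, and at its current rate it would need 13 more minutes to reach the target that the golden batch reaches at minute 9, which is the kind of early signal that allows intervention before the batch is lost.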
Establishment of AI
It is imperative for the management of the pharmaceutical industry to become fluent in AI’s capabilities, general industrial potential, specific applications, and sources of applied products. While extremely powerful, simulating human intelligence and responses is difficult. It requires considerable amounts of data, computational capability, and time to train the algorithms. To begin, it is very helpful for an individual, group, or organization to understand its current state in each area or category of potential growth. This also guides progress in the capability to apply it to the manufacturing process. Models have been developed to assist in the assessment of the current state of AI understanding and capability in many specific categories of application, as well as to guide the next steps of development.
For example, the AIO team at Xavier Health has developed an AI Maturity Level Characterization Model that enables the assessment and measurement of (a) the functional AI capability of an organization in defined operations or categories and (b) the current capability to improve in these areas over time. With a few exceptions, the organization is able to assess the practical ability of the human resources, departments, and culture to understand, implement, and operate applied AI tools or instrumentation. Rather than assessing the capability to create or maintain integral AI algorithms, the maturity model qualitatively measures the functional capabilities of an organization to work with AI-empowered tools, processes, and structures.
Managing Evolution of AI in ATMPS
AI application requires multiple technological disciplines and different regulatory angles, including best practices in manufacturing, security, privacy, and ethics. For example, Maryann Conway, cofounder, Kaisura, said, “Conventional layered security defense alone will not solve the unique and distinct vulnerabilities and attack vectors associated with AI and ML data exploits. It takes the concurrent implementation of AI and ML-centric data governance, risk, and compliance protocols and the expertise and guidance of experienced AI security-focused experts.”
An AI application can work only when all of the required elements are harmonized to operate in synchrony under higher control. When AI is applied to ATMPs, the result affects specific patients, integrating multiple potential effects into ATMPs for which AI models have been specifically designed. AI can work in digital twins representing human cell cultures, organs, bodies, health systems, or groups of people with similar disease attributes. The singular goal of this entire process is to save lives and maintain a healthy society.
Managers and C-level executives in the pharmaceutical industry are responsible for understanding the challenge of deploying AI for ATMPs and its implications, as outlined in Figure 2. The current digital framework makes it possible for data to lead biopharmaceuticals to a new industrial evolution by means of augmented capabilities based on mathematics. AI can identify root causes of problems whose many dimensions make them difficult to understand from a human perspective. AI also provides recommendations and predictions, or simply recognizes patterns, from the intricate relationships of multivariable realities and the continuous variability that static production recipes have traditionally ignored.
Diverse perspectives that will be governed by this new era of biopharma management include (a) the intrinsic value of every single byte of data used to train AI models, (b) systems to guarantee integrity in the data chain, (c) the required investment in personnel and money, (d) understanding the shifted skills involved in implementing robust data policies, and (e) the mechanisms necessary to establish the full life cycle of Industry 4.0 elements involved in ATMP production.
When talking about Industry 4.0 technologies, the evolution of business methods is better understood as a revolution. This is magnified in pharmaceutical manufacturing for processes as complex as those for ATMPs. This evolution can also be contemplated in managing wars on diseases of the future, where therapeutic and prophylactic entities are enlisted to defeat a multitude of diseases and ensure a next step in societal evolution.
Acknowledgments
The authors gratefully acknowledge the Xavier AIO Team for fruitful discussions and Dr. Sundar Selvatharasu for his expert critical reading of the manuscript.