Steve Tyrrell, Regulatory Compliance Associates Inc., and Alex Reid, Proactive Cyber Security
10.16.18
The quest is on for machine learning (ML) to turn raw data into useful medical devices that improve outcomes and reduce the burden on the healthcare system. Supporting, and someday emulating, human thought processes enables ML devices to improve the decision-making process for patients and clinicians. Software designed to continually learn and improve poses both challenges and opportunities. For example, in the not-so-distant future, patients may experience devices such as an intelligent insulin pump that more effectively manages insulin needs in anticipation of a dessert about to be consumed. As ML matures, the possibilities to improve patient care are endless while the challenges are many.
Addressing Challenges of ML Device Development
As device developers seek to harness ML for next-generation products, it’s important to address the challenges unique to ML in the pre-commercial stage, including:
- The research phase, where selecting the ML algorithm drives subsequent mitigation considerations
- The building phase, which addresses verification, validation, and risk management concerns specific to ML
- Regulatory processes and the nuances of working with regulators in novel areas of technology
ML typically uses supervised or unsupervised algorithms to discover patterns in data and generate actions. In supervised learning, the developer guides the teaching process of the algorithm. This requires a known data set with inputs and outputs to train the machine to make predictions. The developer corrects the machine’s predictions in this learning cycle, and the system learns from the corrections. Natural language processing is an example of this: the developer enters a sentence, asks the machine what it means, and over time, the machine learns the pattern and consequently produces better outputs.
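The correction cycle described above can be illustrated with a perceptron, one of the simplest supervised learners. This is a minimal sketch rather than a method the authors prescribe; the data set, the function names, and the learning rate are all made up for illustration.

```python
# Hypothetical toy example: a perceptron that changes only when a
# prediction is corrected against a known label.

def predict(weights, bias, features):
    """Return 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train(samples, epochs=20, learning_rate=1.0):
    """Fit weights from labeled (features, label) pairs.

    Each wrong prediction is "corrected" by nudging the weights toward
    the known label; correct predictions leave the model unchanged.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = label - predict(weights, bias, features)  # -1, 0, or +1
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
    return weights, bias


# Made-up, linearly separable data: the label is 1 only when both inputs are 1.
TRAINING_DATA = [
    ((0.0, 0.0), 0),
    ((0.0, 1.0), 0),
    ((1.0, 0.0), 0),
    ((1.0, 1.0), 1),
]

weights, bias = train(TRAINING_DATA)
predictions = [predict(weights, bias, features) for features, _ in TRAINING_DATA]
```

After training, `predictions` matches the known labels. Real device data is rarely this cleanly separable, which is one reason validation matters.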
The other type is unsupervised learning, where the developer does not provide teaching guidance along the way. Instead, the machine extracts general rules from the data using mathematical optimization and other techniques. An example involves peritonitis, an inflammation of the peritoneum (the lining of the abdominal cavity). The machine takes pictures of the patient’s peritoneal cavity and determines whether infection is suggested based on its analysis of prior data.
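As a contrast with supervised learning, the sketch below groups unlabeled numbers with a one-dimensional k-means loop, a common unsupervised technique based on iterative optimization. It is an illustration only: the readings are invented, and nothing here is specific to the peritonitis example.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group 1-D values into k clusters (k >= 2) without any labels."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    step = (len(ordered) - 1) / (k - 1)
    centers = [ordered[round(i * step)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Update step: each center moves to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    # The returned clusters reflect the assignment made in the final iteration.
    return centers, clusters


# Hypothetical unlabeled readings that happen to fall into two groups.
READINGS = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centers, clusters = kmeans_1d(READINGS, k=2)
```

No one told the algorithm which readings belong together; the grouping emerges from the data alone.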
Choosing between a supervised and an unsupervised ML algorithm typically depends on the structure and volume of the data and the use case at hand. The developer can introduce errors into the model if the underlying assumptions are untrue. For example, a machine could learn to visually differentiate between a criminal and a businessperson if given a set of photographs. However, the resulting algorithm would be incorrect when applied to future photos because appearance doesn’t predict criminal behavior.
Managing Validation and Verification in ML
Besides choosing the best algorithm at the outset of new product development, the R&D professional needs to choose the right amount of data to validate the model. Mislabeled data, too little data, and too much data all introduce risk into the machine. The specific risks depend on the type of ML algorithm used.
In supervised learning, a decision tree or statistical method is used to teach the machine. The model is validated using fault tree analysis paired with the decision tree to determine whether the machine takes the wrong path for a given input. The challenge in validation is mathematically proving that the error margin falls within the tolerance originally specified. The math requires a data set large enough that data points can be allocated between the learning and validation samples.
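One way to picture the allocation and the error-margin check is the sketch below. It splits a data set into learning and validation samples, then compares an upper confidence bound on the observed error rate against a tolerance. The Wilson score interval is our choice for illustration, not something the article specifies, and the function names and the 95 percent z-value are assumptions.

```python
import math
import random


def split_data(samples, validation_fraction=0.3, seed=0):
    """Shuffle and allocate samples between learning and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    cut = len(shuffled) - n_validation
    return shuffled[:cut], shuffled[cut:]


def error_upper_bound(errors, n, z=1.96):
    """Upper end of the Wilson score interval for an observed error rate.

    Assumes n > 0; z=1.96 corresponds to roughly 95 percent confidence.
    """
    p = errors / n
    center = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center + spread) / (1 + z * z / n)


def within_tolerance(errors, n, tolerance, z=1.96):
    """True if even the pessimistic error estimate meets the tolerance."""
    return error_upper_bound(errors, n, z) <= tolerance


learning_set, validation_set = split_data(range(10))
```

With the same observed 1 percent error rate, `within_tolerance(10, 1000, 0.02)` passes while `within_tolerance(1, 100, 0.02)` does not, which shows why an adequately sized validation sample matters.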
ML can make it difficult to determine appropriate data sizes due to the lack of standards and potential introduction of creative approaches. A developer, for example, might compare previous clinical studies to suggest sample sizes. In-vitro diagnostics validation might require some 450,000 patient data sets for algorithm development and validation to ensure there are ample sample sizes in the fault tree analysis.
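In the absence of an ML-specific standard, a classical starting point is the textbook sample-size formula for estimating a proportion within a given margin of error. The sketch below applies that formula. It is a rough planning aid under simple random sampling assumptions, not a substitute for a statistician’s review, and the function name and defaults are hypothetical.

```python
import math


def sample_size_for_margin(margin, expected_rate=0.5, z=1.96):
    """Samples needed to estimate a rate within +/- margin.

    Uses n = z^2 * p * (1 - p) / margin^2. expected_rate=0.5 is the
    most conservative guess; z=1.96 is roughly 95 percent confidence.
    """
    variance = expected_rate * (1 - expected_rate)
    return math.ceil(z * z * variance / (margin * margin))


n_five_percent = sample_size_for_margin(0.05)
n_one_percent = sample_size_for_margin(0.01)
```

Here `n_five_percent` is 385. Tightening the margin from 5 percent to 1 percent raises the requirement roughly twenty-five-fold, since the sample size grows with the inverse square of the margin.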
In another example, IBM Watson allows developers to choose among various AI algorithms. A developer searching for cancer tumors in a biopsy might choose a neural network, which can be difficult to understand and challenging to develop and validate. The neural network is trained using sets of data, such as a list of blood test results that indicate the patient has a certain number of cancer cells. Alternatively, the algorithm can be trained by supplying it with images of healthy cells and of cells afflicted with cancer. In this example, the algorithm can be validated by comparing the training data set to a reasonable clinical study that compares blood test results to correct diagnoses; on that basis, the developer asserts that the algorithm has been adequately trained.
Algorithms that are built on a pre-developed AI model can be validated by leveraging the original creator’s recommendation on the amount of test data needed to meet the desired margin of error.
Another way to determine sample size involves leveraging domain experts such as clinicians, who understand the frequency of all paths in the decision tree based on their knowledge of each tree node and its associated risks.
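As a rough illustration of the expert-driven approach, suppose clinicians estimate how often each decision tree path occurs. The total sample size can then be driven by the rarest path. The path names, frequencies, and minimum count below are hypothetical, and the calculation assumes samples arrive in proportion to those frequencies.

```python
import math


def total_samples_needed(path_frequencies, min_per_path):
    """Total samples so the rarest path is expected to appear min_per_path times."""
    rarest = min(path_frequencies.values())
    return math.ceil(min_per_path / rarest)


# Hypothetical clinician estimates of how often each path is taken.
PATH_FREQUENCIES = {
    "common_path": 0.90,
    "uncommon_path": 0.09,
    "rare_path": 0.01,
}

required = total_samples_needed(PATH_FREQUENCIES, min_per_path=30)
```

With these made-up numbers, `required` is 3,000. A path the experts consider both rare and high-risk could justify a larger minimum count for that path.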
Minimizing Cybersecurity Risks in ML: Security and Privacy
Developers understand the need for security and privacy in healthcare applications. In ML, a new security risk involves the malicious introduction of bad data into the machine, which can lead to invalid and harmful outputs. Use of ethical hackers, however, can help mitigate the risk of bad data in supervised learning. These hackers specialize in simulating malicious acts that lead to limitations or boundaries on system learning, which ultimately protect against bad data.
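One simple form such a boundary can take is a plausibility check applied before new data is allowed to influence learning. The sketch below is an assumption-laden illustration: the field names and numeric limits are invented, and real limits would come from clinical experts. Range checks alone will not stop every poisoning attempt.

```python
# Hypothetical plausibility limits for incoming training samples.
PLAUSIBLE_RANGES = {
    "glucose_mg_dl": (20.0, 600.0),
    "heart_rate_bpm": (20.0, 250.0),
}


def is_plausible(sample):
    """True only if every required field is present and within its limits."""
    for field_name, (low, high) in PLAUSIBLE_RANGES.items():
        value = sample.get(field_name)
        if value is None or not (low <= value <= high):
            return False
    return True


def filter_training_batch(batch):
    """Separate samples that may be learned from those held for review."""
    accepted = [s for s in batch if is_plausible(s)]
    rejected = [s for s in batch if not is_plausible(s)]
    return accepted, rejected


# Made-up batch: the second sample carries an implausible glucose value.
BATCH = [
    {"glucose_mg_dl": 110.0, "heart_rate_bpm": 72.0},
    {"glucose_mg_dl": 9000.0, "heart_rate_bpm": 70.0},
    {"heart_rate_bpm": 65.0},
]
accepted, rejected = filter_training_batch(BATCH)
```

Rejected samples are kept rather than discarded so that a person can investigate whether they reflect a sensor fault, a data-entry error, or an attack.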
The risk of bad data in unsupervised ML can be reduced by buying an established algorithm with embedded mitigation tactics (mathematical, programmatic, etc.). However, cybersecurity specialists who understand both medical devices and unsupervised machine learning algorithms must thoroughly review those mitigations.
Developers have long been wary of privacy issues related to protected health information in cloud applications. Since many ML platforms leverage cloud storage and therefore introduce new risks to the process, it’s important for ML developers to understand how their data is collated with other data sets. Malicious entities could combine this shared data about a patient’s condition to violate privacy through a technique called inference. Inference combines different innocuous, non-sensitive data to derive sensitive information. Consider the aggregated data for an automobile accident patient. An attorney might slice the data and discover information about the victim’s diabetes, then blame the patient for the mishap by citing a potential diabetic coma. The use of polyinstantiation can mitigate these types of risks by slicing the data into sets for collation and designing data silos so that only the developer knows which piece goes into the algorithm, thereby preventing the disclosure of the entire patient database.
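The siloing idea can be sketched as follows. This is a simplified illustration of splitting a record across silos with unrelated keys, not a full polyinstantiation scheme, and the record, field groups, and function names are hypothetical. The key map stands in for the knowledge that only the developer holds.

```python
import secrets


def split_into_silos(record, field_groups):
    """Split one record into per-silo fragments keyed by unrelated random tokens."""
    key_map = {}    # held only by the developer
    fragments = {}  # what each silo actually stores
    for silo, fields in field_groups.items():
        token = secrets.token_hex(8)
        key_map[silo] = token
        fragments[silo] = {token: {name: record[name] for name in fields}}
    return key_map, fragments


def reassemble(key_map, fragments):
    """Rebuild the full record; possible only with the key map."""
    record = {}
    for silo, token in key_map.items():
        record.update(fragments[silo][token])
    return record


# Made-up record and grouping for illustration.
RECORD = {"name": "Patient A", "accident_date": "2018-01-01", "diagnosis": "diabetes"}
FIELD_GROUPS = {
    "identity": ["name", "accident_date"],
    "clinical": ["diagnosis"],
}

key_map, fragments = split_into_silos(RECORD, FIELD_GROUPS)
restored = reassemble(key_map, fragments)
```

Someone who obtains only the clinical silo sees a diagnosis attached to a random token, with nothing in that silo linking it to a name or an accident.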
Working with Regulators in ML Technologies
Experienced device developers understand the well-established process for working with regulators and developing submissions. The challenge in ML is the lack of precedent. Regulators are used to working with established frameworks where a consistent set of inputs generates a reliable set of outputs, but in ML, the outputs are continuously evolving. Thus, device developers must help regulatory agencies establish ways to assess the safety and effectiveness of products. Some suggested tactics include:
- Build a regulatory affairs team with experience in ML and multidisciplinary functions.
- Conduct early and frequent meetings with regulatory authorities so both sides can learn from each other.
- Find clinical and regulatory information throughout the world that is supportive of the desired goal. If negative information is uncovered, address it rather than ignore it.
- Do not submit a “black box.” Develop ways to communicate how and why a particular result occurred.
- Seek related credible sources, publications, guidance documents and experts, reference them, and utilize them.
- Recognize that regulators are used to understanding the device’s Mechanism of Action. In ML and other novel technologies, it is difficult to describe how the device works, so seek alternatives such as Safety Assurance Cases to help effectively communicate risks and risk management activities.
When developing new medical device technologies, regulation compliance, risk identification, and risk management are all equally important. Safety assurance cases are an effective way of helping demonstrate device safety.
Assurance cases have been used successfully by other industries such as avionics to efficiently minimize product risk and expedite government reviews. The assurance case helps reviewers better understand risk management in a regulatory submission and recognize how the sponsor both mitigates risks and reduces the likelihood of a device harming end users.
Safety assurance cases can streamline processes for U.S. Food and Drug Administration (FDA) reviewers by improving their understanding of claims and supporting information, and elucidating the evidence supporting product safety and efficacy. This system is markedly different than the traditional method, which entails presenting FDA reviewers with supporting evidence sans guidance and rationale. Such an approach, however, can be problematic for regulators dealing with new technologies because there may not yet be any applicable review standards in place. The safety assurance case process enables reviewers to follow a structured map that focuses on specific evidence of safety claims, possibly resulting in faster submission evaluations.
Safety assurance cases are similar to legal cases in that they argue for product safety and serve as the logical glue for the various parts of the regulatory submission. A safety assurance case is an overarching document that:
- Presents all claims that can be easily linked with supporting evidence to demonstrate the validity of safety claims
- Is a formal method used to demonstrate the validity of a claim. It is presented as a clear, understandable argument supported by scientific evidence
- Contains arguments that are based on statistical measurements of the system’s reliability and grounded in risk-based and scientific methods to help discuss and draw conclusions
For FDA reviewers, a safety assurance case:
- Helps connect the dots in a structured way
- Helps them see both claims and supporting evidence
- Helps them understand the “big picture”
For device companies, safety assurance cases:
- Align medical device product development with FDA expectations
- Help gain faster regulatory approvals. Medical device companies that move toward best practices by leveraging safety assurance case principles can clearly demonstrate product safety in a single document, making it easier for the FDA to review
A safety assurance case is built from claims, evidence, and arguments:
- The claim is a statement about a property of the system, typically contained in and/or driven by a requirements specification
- The evidence should provide information demonstrating the validity of the claim. This evidence may include verification and/or validation results including, but not limited to, test data, experiment results, and analysis. The evidence should also address its relevance to the claim, whether it directly supports the claim, and whether it provides sufficient coverage of the claim
- Arguments should link evidence to the claim and provide a detailed description of what is being proven. The arguments also should identify the specific evidence that supports the claim
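The relationship among claims, arguments, and evidence can be pictured as a small data structure. The sketch below is one possible representation for tracking whether every claim is backed by evidence. The class names, fields, and example text are hypothetical and do not reflect any FDA-defined format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    description: str                # e.g. a test report or analysis
    directly_supports_claim: bool   # relevance noted by the author


@dataclass
class Argument:
    reasoning: str                  # why the evidence proves the claim
    evidence: List[Evidence] = field(default_factory=list)


@dataclass
class Claim:
    statement: str                  # a property of the system
    arguments: List[Argument] = field(default_factory=list)


def unsupported_claims(claims):
    """Return claims that have no argument citing at least one piece of evidence."""
    return [c for c in claims if not any(a.evidence for a in c.arguments)]


# Made-up example content.
supported = Claim(
    statement="The model's error rate is within the specified tolerance.",
    arguments=[Argument(
        reasoning="Hold-out validation bounds the error rate.",
        evidence=[Evidence("Validation report (hypothetical)", True)],
    )],
)
unsupported = Claim(statement="Training data cannot be tampered with.")

gaps = unsupported_claims([supported, unsupported])
```

Running a check like `unsupported_claims` before submission highlights claims that still lack linked evidence, which is the kind of gap a reviewer following the structured map would otherwise find.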
Managing Risks—Lessons Learned
Companies considering integrating ML into their products can boost their probability of success by developing a complete ML strategy (as opposed to a piecemeal or single-product approach). Working with experienced ML professionals helps, along with a multidisciplinary ML mindset incorporating R&D, regulatory, software, and cybersecurity expertise. It also is wise to take a studied approach in selecting the ML algorithm, and develop a robust risk management plan that addresses the unique challenges in ML validation and cybersecurity. Most of all, companies should prepare themselves for the challenges surrounding safety and efficacy in the regulatory filing process and be open to novel approaches such as safety assurance cases.
Incorporating ML into medical devices offers life sciences companies unparalleled opportunities to impact health and create sustainable differentiation from competitors. As with any emerging technology, there are risks along the product lifecycle during the R&D, regulatory, and validation processes. A prudent approach involves careful planning combined with a solid risk management strategy that brings in seasoned experts to augment internal capabilities.
Steve Tyrrell is senior director of business development at Regulatory Compliance Associates Inc. He has over 20 years of biomedical leadership experience including hardware, software, chemistry, and data analytics with IVD and medical device companies. Steve has managed development of 10 commercial products from concept to market launch, including three original FDA-cleared products. He has been awarded 14 patents and two NCI SBIR awards for novel technologies, has served as an NIH principal investigator and grant reviewer, and has authored or co-authored several publications.
Alex Reid is managing director of Proactive Cyber Security. Alex has managed the development and commercialization of several medical devices with accompanying regulated software suites, regulated and non-regulated mobile apps, next-gen telehealth solutions, and serverless cloud-based healthcare information systems, and has developed deep learning features to augment the capabilities of several healthcare products. He has led and conducted numerous information security assessments, penetration tests, application security tests, identity and access management implementations, system hardening, and enterprise security architecture and governance initiatives for many healthcare and financial institutions. Alex has been awarded three patents for software algorithms, and is a Certified Information Systems Security Professional (CISSP), an AWS Cloud Solution Architect Associate, and a Project Management Professional.