K Number: DEN220047
Manufacturer:
Date Cleared: 2023-07-19 (362 days)
Product Code:
Regulation Number: 874.4460
Type: Direct
Panel: EN
Reference & Predicate Devices: N/A
Predicate For: N/A
Intended Use

The Galen ES is intended to assist in the movement of the specified Integra Jako Micro Laryngeal Alligator Forceps to a surgeon-selected position within a surgical field and remain in place until moved elsewhere by the operator. The Galen ES is indicated to be used in rigid microlaryngeal procedures where the specified Jako Micro Laryngeal Alligator Forceps would be utilized by an experienced trained otolaryngologist performing microlaryngeal surgery in an operating room environment in patients at least 18 years old.

Device Description

The Galen ES (see Figure 1 and Figure 2) is a cooperatively-controlled surgical assistant platform (i.e., designed to be controlled by direct user interaction with the surgical instrument), intended to assist in the manipulation of the Integra Jako Micro Laryngeal Alligator Forceps (Integra Part# 3731042 serrated 9.0mm, straight 9.25 inch (235 mm)) (see Figure 3).

NOTE: The Integra Jako Micro Laryngeal Alligator Forceps will not be provided with the system.

The Galen ES is designed to work with the surgeon by sensing the physical forces and movements the surgeon applies to the Integra Jako Micro Laryngeal Alligator Forceps connected to the Galen ES, and then moving the instrument in a manner consistent with the forces applied.
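This force-following behavior is characteristic of admittance-style cooperative control, in which the commanded tool velocity tracks the force the operator applies at the instrument. The summary does not disclose the Galen ES control law, so the following is only a minimal illustrative sketch in Python; every gain, limit, and function name here is hypothetical and not a Galen ES value.

```python
import numpy as np

# Illustrative admittance-control loop: the commanded tool velocity is
# proportional to the force the surgeon applies to the instrument handle.
# All gains and limits are hypothetical, chosen only for this sketch.

ADMITTANCE_GAIN = 0.002  # m/s per newton (hypothetical)
MAX_SPEED = 0.01         # m/s speed limit (hypothetical safety bound)
DEADBAND = 0.5           # N; ignore forces below this to hold position

def commanded_velocity(sensed_force_n: np.ndarray) -> np.ndarray:
    """Map a sensed 3-axis handle force to a tool-tip velocity command."""
    magnitude = np.linalg.norm(sensed_force_n)
    if magnitude < DEADBAND:
        return np.zeros(3)  # passive instrument holding: no force, no motion
    velocity = ADMITTANCE_GAIN * sensed_force_n
    speed = np.linalg.norm(velocity)
    if speed > MAX_SPEED:
        velocity *= MAX_SPEED / speed  # clamp to the speed limit
    return velocity

# Example: a gentle 2 N push along x yields a slow x-direction motion.
print(commanded_velocity(np.array([2.0, 0.0, 0.0])))  # ~[0.004, 0., 0.] m/s
```

The deadband models the "remain in place until moved elsewhere" behavior in the intended use statement: below a small force threshold, the platform holds the instrument stationary.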

AI/ML Overview

Here is a detailed breakdown of the acceptance criteria and the study demonstrating that the device meets them, based on the provided text:

Acceptance Criteria and Device Performance

1. Table of Acceptance Criteria and Reported Device Performance

Simulated Use Testing
  Acceptance criteria:
    • Performs as intended for the indicated surgery.
    • Tested in a simulated hospital environment with an anatomically relevant model.
    • Compatibility testing of all indicated instruments.
    • Human factors/usability evaluation.
    • Validation of device use by surgeons, including:
      • User interface and controller(s)
      • Compatibility with surgeon-applied forces, torques, speeds, and motion
      • Time required for emergency removal of the device and associated instruments in the event of power loss or device failure.
  Reported performance (Human Factors Engineering study, cadaver):
    • "Human factors testing demonstrated that all critical tasks were completed, and acceptance criteria were met by all users."
    • Testing was performed in a simulated operating room environment.
    • The device is intended to be used with the Integra Jako Micro Laryngeal Alligator Forceps (Integra Part# 3731042); compatibility is implicitly demonstrated by the use of this specific instrument in the study.
    • A usability validation study was performed.
    • Validation of surgeon use demonstrated:
      • Surgeons performed tasks including selecting instruments on the UI, working ergonomically, completing the workflow, and recognizing critical information.
      • The device is designed to work with the surgeon by sensing the physical forces and movements applied.
      • "On average, surgeon participants (n=15) in the follow-up usability study required 14 seconds to remove the Jako forceps from the airway... This is considered clinically acceptable because these time frames would be consistent with removal of a manual surgical forceps tool during manual surgery."

Non-clinical Performance Testing
  Acceptance criteria:
    • Hardware and system verification.
    • Verification of critical parameters (including minimum and maximum forces, torques, speeds, and range of motion).
  Reported performance:
    • "The following bench tests were performed to mitigate the risks... Force sensor accuracy, Torque sensor accuracy, System performance verification: Degrees of freedom, Motion actuation, Limit sensing, Passive instrument holding, Device immobilization, Recovery from unintended power loss."
    • "All bench testing met acceptance criteria." (A hedged sketch of this kind of sensor-accuracy check appears after this table.)

Software Verification, Validation, and Hazard Analysis
  Acceptance criteria:
    • Performed for all software components.
  Reported performance:
    • "The Galen ES Surgical System software was developed in accordance with the following FDA guidance documents and standards... All testing and results met the acceptance criteria and/or standards." (References: FDA guidance documents on Software Validation and Content of Premarket Submissions; IEC 62304; ISO 14971.)

Performance Testing (EMC, Electrical, Thermal)
  Acceptance criteria:
    • Demonstrates electromagnetic compatibility (EMC).
    • Demonstrates electrical safety.
    • Demonstrates thermal safety of the device.
  Reported performance:
    • "Electromagnetic compatibility (EMC) and electrical safety of the device was evaluated... The following Electrical/Mechanical/Thermal Safety and EMC testing has been performed: IEC 60601-1:2005 Ed.3+A1, IEC 60601-1-2:2014 Ed.4, IEC 60601-1-6:2010 Ed.3+A1, IEC 80601-2-77:2019 Ed.1. All testing and results met the acceptance criteria of the above standards."

Sterile Components
  Acceptance criteria:
    • All parts or components entering the sterile field must be demonstrated to be sterile.
  Reported performance:
    • "The device's RSA cleaning instructions were validated to achieve a Sterility Assurance Level (SAL) of 10⁻⁶. The ISA and surgical drapes are single-use accessories, which are individually packaged and sterilized via Ethylene Oxide (EO) to a SAL of 10⁻⁶."
    • "All testing and results met the acceptance criteria and/or standards."

Shelf Life
  Acceptance criteria:
    • Performance testing supports the shelf life of sterile components by demonstrating continued sterility, package integrity, and device functionality.
  Reported performance:
    • "The purpose of the sterility, reprocessing, packaging, and shelf-life evaluations were to mitigate the risk of infection for the patient... The packaging was validated via a protocol compliant with the requirements of FDA-recognized consensus standards. All testing and results met the acceptance criteria and/or standards."

Biocompatibility
  Acceptance criteria:
    • All patient-contacting components of the device must be demonstrated to be biocompatible.
  Reported performance:
    • "Biocompatibility testing was not performed because the device is not intended to have either direct or indirect contact with the body." (This implies no patient-contacting parts, so the criterion is met without testing.)

Labeling
  Acceptance criteria:
    • Identification of compatible instruments.
    • Statement about needed training.
    • Summary of relevant performance testing.
    • Reprocessing instructions for reusable components.
  Reported performance:
    • "Labeling for the Galen ES device includes the following: Identification of compatible surgical instruments, Recommended training for the safe use of the device, Summary of all relevant performance testing, including simulated use testing, Reprocessing instructions for reusable device components." (The document states the labeling includes these items, implying compliance.)
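The summary states that force and torque sensor accuracy were bench-tested and that "all bench testing met acceptance criteria," but it does not give the tolerances. As a hedged illustration only, a minimal pass/fail check of sensor readings against calibrated reference loads might look like the sketch below; the tolerance value and the sample data are invented for this example.

```python
# Minimal sketch of a sensor-accuracy bench check: compare sensor readings
# against calibrated reference loads and flag any error outside a tolerance.
# The tolerance and data are hypothetical; the summary does not disclose
# the actual Galen ES acceptance limits.

TOLERANCE_N = 0.25  # hypothetical allowable force-sensor error, in newtons

def sensor_accuracy_pass(readings_n, references_n, tolerance_n=TOLERANCE_N):
    """Return (pass/fail, worst-case error) over all applied loads."""
    errors = [abs(r - ref) for r, ref in zip(readings_n, references_n)]
    return all(e <= tolerance_n for e in errors), max(errors)

passed, worst = sensor_accuracy_pass(
    readings_n=[0.98, 2.05, 5.10],    # sensor output at each applied load
    references_n=[1.00, 2.00, 5.00],  # calibrated reference loads
)
print(f"pass={passed}, worst-case error={worst:.2f} N")
```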

2. Sample Size Used for the Test Set and Data Provenance

The primary test set for demonstrating clinical usability and safety was the Human Factors Engineering (HFE) study:

  • Sample Size: A total of 47 individuals participated:
    • 16 setup operators
    • 31 operating surgeons (split into an initial usability study with 16 surgeons and a confirmatory usability study with 15 surgeons).
  • Data Provenance: The study was conducted in a simulated operating room environment using cadaveric tissue. This indicates prospective data collection for the purpose of the study. The location or country of origin for the study itself is not specified but is implied to be within the regulatory jurisdiction relevant to the De Novo request (likely US).

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

The concept of "ground truth" for this device is primarily derived from successful task completion and expert judgment in the usability study, rather than a diagnostic accuracy assessment against a specific pathology.

  • Number of Experts:
    • 31 operating surgeons participated in the HFE study. Their successful completion of tasks was central to establishing that the device could be used safely and effectively.
    • In effect, these surgeons serve as the experts: their performance when using the device constitutes the "truth" of its usability.
  • Qualifications of Experts:
    • Setup operators' clinical experience ranged from 0.5 to 41 years.
    • Operating surgeons' clinical experience ranged from 1 to 35 years. This implies a range from relatively less experienced to highly experienced otolaryngologists.

4. Adjudication Method for the Test Set

The document does not explicitly describe an adjudication method in the sense of multiple independent reviewers agreeing on a diagnostic label. Instead, the "evaluation" section states:

  • "Successful performance of a task was defined as follows: Actions performed by the participant were in agreement with the details listed in the expected results column of the study protocol. Said actions were completed on the first attempt performed by the participant."

This suggests a pre-defined protocol-driven assessment of task completion, likely observed and recorded by study proctors or researchers, rather than an expert consensus/adjudication process post-hoc on the "ground truth" of performance. The "ground truth" here is the expected correct action for each task.
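The protocol's success definition is a simple conjunction: the observed action matched the expected result in the study protocol and was completed on the first attempt. A minimal sketch of that rule follows; the record layout and field names are hypothetical, since the summary describes only the rule itself.

```python
from dataclasses import dataclass

# Sketch of the protocol's success rule for a critical task: the observed
# action must agree with the protocol's expected-results column AND be
# completed on the first attempt.

@dataclass
class TaskObservation:
    matches_expected_result: bool  # action agreed with the expected-results column
    attempt_number: int            # 1 = first attempt

def task_successful(obs: TaskObservation) -> bool:
    return obs.matches_expected_result and obs.attempt_number == 1

print(task_successful(TaskObservation(True, 1)))  # True: success per protocol
print(task_successful(TaskObservation(True, 2)))  # False: not first attempt
```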

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done and, If So, the Effect Size of Human Reader Improvement with vs. Without AI Assistance

  • No MRMC study evaluating human diagnostic reading improvement with AI assistance was performed.
  • This device is a cooperative surgical assistant robot, not an AI-powered diagnostic imaging tool. Its primary function is to assist in instrument manipulation during surgery, not to interpret medical images or data.
  • The study design focused on human factors, usability, and the device's ability to maintain instrument positioning and stability, which are performance characteristics for a surgical tool, not diagnostic accuracy.
  • The document mentions "reducing user fatigue and tremor as demonstrated in bench testing" as a benefit, but this is not quantified in the Human Factors study in terms of comparative effectiveness with and without the device in a clinical setting with human subjects. The HFE study focused on safe and effective use of the device.

6. If a Standalone (i.e., Algorithm-Only, Without Human-in-the-Loop) Performance Study Was Done

  • No standalone "algorithm only" performance study was done in the context of diagnostic performance or decision-making.
  • The device is explicitly a "cooperative surgical assistant platform" where the "device works in conjunction with the surgeon's movements." It is fundamentally designed for human-in-the-loop operation.
  • However, extensive "Non-clinical Performance Testing - Bench" was conducted to characterize hardware and system performance (e.g., force/torque sensor accuracy, degrees of freedom, motion actuation, limit sensing). These can be considered standalone technical performance evaluations of the device's physical and computational capabilities without a human actively operating it in a surgical context, but they are not "algorithm-only" performance in the sense of an AI model making independent decisions.

7. The Type of Ground Truth Used

For the Human Factors Engineering (HFE) study, the "ground truth" was established by pre-defined protocol-driven expectations for task completion.

  • It was not expert consensus on an existing medical condition or pathological finding.
  • It was based on expected safe and effective actions for device setup, diagnostic testing, procedure execution, device shutdown, and emergency procedures.
  • For critical tasks, "successful performance of a task was defined as: Actions performed by the participant were in agreement with the details listed in the expected results column of the study protocol AND Said actions were completed on the first attempt performed by the participant."

8. The Sample Size for the Training Set

The document describes training for the users of the device, not a training set for a machine learning algorithm.

  • The users (surgeons and setup operators) underwent a "training program that mirrored the intended user training that will be provided to customers." This involved a didactic module and a hands-on training module.
  • No specific sample size for a machine learning training set is mentioned, as the device is not described as an AI/ML-driven diagnostic or decision-making system in the traditional sense. It's a robotic assist platform for physical manipulation guided by the surgeon's forces.

9. How the Ground Truth for the Training Set Was Established

As noted above, there is no mention of a machine learning "training set" with ground truth in the context of a diagnostic AI. The "training" described is for human users.

  • The "ground truth" for ensuring human users were trained properly was through the training program's curriculum and practice sessions. Participants had to practice tasks to gain familiarity before the study, emphasizing critical tasks. The effectiveness of this training was then assessed by their performance in the validation study against the predefined successful task completion criteria.
