PeekMed web is a system designed to help healthcare professionals carry out pre-operative planning for several surgical procedures, based on their patients' imported imaging studies. Usage experience and clinical assessment are necessary for proper use of the system in reviewing and approving the planning output.
The multi-platform system works with a database of digital representations related to surgical materials supplied by their manufacturers.
This medical device is a decision support tool that allows qualified healthcare professionals to quickly and efficiently perform pre-operative planning for several surgical procedures using medical imaging, with the additional capability of planning in a 2D or 3D environment. The system is designed for the medical specialties within surgery; no specific use environment is mandatory, although the typical use environment is a room with a computer. The patient target group is adult patients with a previously diagnosed injury or disability. There are no other considerations for the intended patient population.
PeekMed web can represent medical images in a 2D or 3D environment, perform relevant measurements on those images, and add templates, which allows it to provide a complete overview of the planned surgery. Being software, it does not interact with any part of the body of the user or patient.
Below is a breakdown of the acceptance criteria, and of the study demonstrating that the device meets them, based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
The document provides the acceptance criteria but does not directly state the reported device performance metrics from the external validation. It only states that the efficacy results "met the acceptance criteria."
| ML Model | Acceptance Criteria | Reported Device Performance (values not explicitly stated in document) |
|---|---|---|
| Segmentation | DICE is no less than 90%; HD-95 is no more than 8; STD DICE is within +/- 10%; Precision is more than 85%; Recall is more than 90% | Met acceptance criteria |
| Landmarking | MRE is no more than 7 mm; STD MRE is within +/- 5 mm | Met acceptance criteria |
| Classification | Accuracy is no less than 90%; Precision is no less than 85%; Recall is no less than 90%; F1 score is no less than 90% | Met acceptance criteria |
| Detection | MAP is no less than 90%; Precision is no less than 85%; Recall is more than 90% | Met acceptance criteria |
2. Sample Sizes Used for the Test Set and Data Provenance
- Test Set (External Validation Dataset) Sample Sizes:
  - Segmentation ML model: 375
  - Landmarking ML model: 345
  - Classification ML model: 347
  - Detection ML model: 198
- Data Provenance: The document states "multiple sites." It does not specify the country of origin. The external validation dataset was collected "independently of the development data to prevent bias" and was a "fully independent dataset." It is not explicitly stated whether the data was retrospective or prospective, but the phrasing "collected independently" for external validation often implies existing, retrospectively collected data used for this specific purpose.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
The document states that the external validation dataset was "labeled by a separate team" to establish ground truth. It does not provide the number of experts or their specific qualifications (e.g., "radiologist with 10 years of experience").
4. Adjudication Method for the Test Set
The document does not explicitly describe an adjudication method (e.g., 2+1, 3+1). It only states that the ground truth was "labeled by a separate team."
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done, If So, What was the Effect Size of How Much Human Readers Improve with AI vs without AI Assistance
No MRMC comparative effectiveness study involving human readers with and without AI assistance is described in the provided text. The study focuses solely on the standalone performance of the ML models against a predefined ground truth. The device is a "decision support tool" requiring "clinical assessment" and "revision and approval of the output of the planning" by healthcare professionals, implying a human-in-the-loop workflow, but no human performance study is detailed.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was Done
Yes, a standalone performance evaluation of the ML models was done. The acceptance criteria and the evaluation against a "predefined ground truth" for Segmentation, Landmarking, Classification, and Detection ML models indicate standalone algorithm performance. The document states that the "efficacy results... met the acceptance criteria for ML model performance."
7. The Type of Ground Truth Used
The ground truth was "predefined ground truth" established by a "separate team" for the external validation dataset. While not explicitly stated as "expert consensus," this typically implies human expert review and labeling given the context of medical imaging and planning. It is not stated as pathology or outcomes data.
8. The Sample Size for the Training Set
The ML models were developed with a total of 2852 CR datasets and 1903 CT scans.
- Training Set: 80% of these datasets were used for training.
- 0.80 * (2852 + 1903) = 0.80 * 4755 = 3804 datasets
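As a quick check of this arithmetic, here is a minimal sketch assuming the 80% split applies to the combined CR and CT development pool; the document does not break the split down by modality.

```python
# Hypothetical check of the stated training-set size; the counts below are taken
# from the document, but the per-modality breakdown of the split is not specified.
cr_datasets = 2852   # CR (conventional radiography) datasets
ct_scans = 1903      # CT scans
total = cr_datasets + ct_scans      # 4755 development datasets
training = int(0.80 * total)        # 3804 datasets used for training
print(f"total={total}, training={training}")
```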
9. How the Ground Truth for the Training Set Was Established
The document states that the ML models were "developed with datasets from multiple sites." While it mentions that "External validation datasets were collected independently of the development data... labeled by a separate team," it does not explicitly describe the methodology for establishing ground truth for the training dataset. However, it's generally inferred in such contexts that training data also requires labeled ground truth, likely established by human annotators or experts, but the specifics are not provided in this document.
§ 892.2050 Medical image management and processing system.
(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.
(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).