K Number
K251096
Device Name
PeekMed web
Manufacturer
Date Cleared
2025-07-14

(95 days)

Product Code
Regulation Number
892.2050
Panel
RA
Reference & Predicate Devices
Intended Use

PeekMed web is a system designed to help healthcare professionals carry out pre-operative planning for several surgical procedures, based on imported patient imaging studies. Experience in usage and a clinical assessment are necessary for the proper use of the system in reviewing and approving the planning output. The multi-platform system works with a database of digital representations related to surgical materials supplied by their manufacturers.

This medical device is a decision support tool that enables qualified healthcare professionals to quickly and efficiently perform pre-operative planning for several surgical procedures using medical imaging, with the additional capability of planning in a 2D or 3D environment. The system is designed for the medical specialties within surgery, and no specific use environment is mandatory, though the typical use environment is a room with a computer. The patient target group is adult patients with a previously diagnosed injury or disability. There are no other considerations for the intended patient population.

Device Description

PeekMed web is a system designed to help healthcare professionals carry out pre-operative planning for several surgical procedures, based on imported patient imaging studies. Experience in usage and a clinical assessment are necessary for the proper use of the system in reviewing and approving the planning output.

The multi-platform system works with a database of digital representations related to surgical materials supplied by their manufacturers.

Because PeekMed web can represent medical images in a 2D or 3D environment, perform relevant measurements on those images, and add templates, it can provide a total overview of the surgery. As software, it does not interact with any part of the user's or patient's body.

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study proving the device's performance, based on the provided FDA 510(k) clearance letter:

1. Table of Acceptance Criteria and Reported Device Performance

The document does not explicitly state the "reported device performance" against each acceptance criterion. It only states that the comparison of efficacy results met the acceptance criteria. Thus, the "Reported Device Performance" column reflects this qualitative statement.

| ML Model | Acceptance Criteria | Reported Device Performance |
| --- | --- | --- |
| Segmentation | DICE no less than 90%; HD-95 no more than 8; DICE STD within ±10%; precision more than 85%; recall more than 90% | Efficacy results on the testing and external validation datasets, compared against the predefined ground truth, met the acceptance criteria, demonstrating substantial equivalence |
| Landmarking | MRE no more than 7 mm; MRE STD within ±5 mm | Met the acceptance criteria (as above) |
| Classification | Accuracy no less than 90%; precision no less than 85%; recall no less than 90%; F1 score no less than 90% | Met the acceptance criteria (as above) |
| Detection | mAP no less than 90%; precision no less than 85%; recall no less than 90% | Met the acceptance criteria (as above) |

Note: The document only confirms that the performance met the criteria, not the exact values achieved.
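
For concreteness, here is a minimal sketch of how the segmentation and landmarking metrics named above (DICE, HD-95, MRE) are conventionally computed, assuming numpy/scipy, boolean mask arrays, and paired landmark coordinates. This is not the submission's evaluation code, and all function names are illustrative.

```python
# Illustrative metric definitions only; not the 510(k)'s actual evaluation code.
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """DICE = 2|A∩B| / (|A|+|B|) for boolean segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return float(2.0 * (pred & truth).sum() / denom) if denom else 1.0

def _surface(mask: np.ndarray) -> np.ndarray:
    """Boundary voxels of a boolean mask (mask minus its erosion)."""
    mask = mask.astype(bool)
    return mask & ~binary_erosion(mask)

def hd95(pred: np.ndarray, truth: np.ndarray) -> float:
    """95th-percentile symmetric surface distance, in voxel units."""
    sp, st = _surface(pred), _surface(truth)
    d_to_truth = distance_transform_edt(~st)  # each voxel's distance to the truth surface
    d_to_pred = distance_transform_edt(~sp)   # each voxel's distance to the pred surface
    return float(np.percentile(np.concatenate([d_to_truth[sp], d_to_pred[st]]), 95))

def mre(pred_pts: np.ndarray, true_pts: np.ndarray) -> float:
    """Mean radial error: average Euclidean distance between paired landmarks."""
    return float(np.linalg.norm(pred_pts - true_pts, axis=-1).mean())

# Acceptance check in the spirit of the Segmentation row above (the HD-95
# threshold's spatial units are assumed to match the mask's voxel spacing):
# dice(p, t) >= 0.90 and hd95(p, t) <= 8.0
```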

2. Sample Sizes and Data Provenance

  • Test Set Sample Sizes (External Validation Data):
    • Segmentation ML model: 402 unique datasets
    • Landmarking ML model: 367 unique datasets
    • Classification ML model: 347 unique datasets
    • Detection ML model: 198 unique datasets
  • Data Provenance: The document states that ML models were developed with datasets from "multiple sites." It doesn't specify the country of origin but mentions that the development, training, and testing data, as well as the external validation data, were designed to cover the intended use population while ensuring variety and diverse patient characteristics. It implies the data is retrospective as it refers to "datasets" collected for model development and validation.

3. Number of Experts and Qualifications for Ground Truth

The document does not specify the number of experts or their qualifications for establishing the ground truth for the test set. It only mentions that the "External validation...was collected independently of the development data to prevent bias, ensuring the reliability of the results. For the external validation, a fully independent dataset, labeled by a separate team, was employed..." The qualifications of this "separate team" are not detailed.

4. Adjudication Method for the Test Set

The document does not explicitly describe an adjudication method (e.g., 2+1, 3+1). It states that the external validation dataset was "labeled by a separate team." This suggests a single labeling event by that team, rather than an explicit multi-reader adjudication process.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No, an MRMC comparative effectiveness study comparing human readers with AI assistance versus without AI assistance was not reported. The study focuses purely on the standalone performance of the ML models against established ground truth.

6. Standalone (Algorithm Only) Performance

Yes, a standalone performance evaluation of the algorithm (ML models) was done. The performance metrics (DICE, HD-95, MRE, Accuracy, Precision, Recall, F1 score, MAP) and acceptance criteria are applied directly to the output of the ML models.
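
As a sketch of what a standalone evaluation of the classification metrics looks like (accuracy, precision, recall, F1, checked against the thresholds from section 1), assuming binary labels; the mAP computation for detection is more involved and omitted here. All names are illustrative, not the submission's tooling.

```python
# Sketch of a standalone (algorithm-only) classification evaluation.
from collections import Counter

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from paired binary labels."""
    c = Counter(zip(y_true, y_pred))
    tp, fp = c[(1, 1)], c[(0, 1)]
    fn, tn = c[(1, 0)], c[(0, 0)]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

m = classification_metrics([1, 0, 1, 1, 0, 1], [1, 0, 1, 1, 0, 1])
# Thresholds mirror the Classification row of the table in section 1.
meets_criteria = (m["accuracy"] >= 0.90 and m["precision"] >= 0.85
                  and m["recall"] >= 0.90 and m["f1"] >= 0.90)
print(m, meets_criteria)
```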

7. Type of Ground Truth Used

The type of ground truth used is referred to as "predefined ground truth" which was established through a "truthing process" and labeled by a "separate team." While it doesn't explicitly state "expert consensus" or "pathology," for image segmentation and landmarking in medical imaging, ground truth is typically established by trained human experts (e.g., radiologists, orthopedic surgeons, or technicians with specific training) through manual annotation or expert review, which often involves some form of consensus. For classification and detection tasks, ground truth similarly relies on definitive labels provided by experts or established from patient records/outcomes, though the document does not elaborate on the specific method for each ML model type.

8. Sample Size for the Training Set

The training set comprised 80% of the total datasets available for ML model development, which included:

  • Total X-rays: 2852
  • Total CT scans: 1903
  • Total MRIs: 151

Therefore, the approximate sample sizes for the training set are:

  • X-rays: 0.80 * 2852 = 2281.6 (approx. 2282)
  • CT scans: 0.80 * 1903 = 1522.4 (approx. 1522)
  • MRIs: 0.80 * 151 = 120.8 (approx. 121)
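
A short sketch reproducing the arithmetic above; the 80/10/10 percentages and modality totals are from the summary, while rounding to whole studies is an assumption, since the document reports only the split fractions.

```python
# Reproduces the approximate 80/10/10 split counts; rounding is an assumption.
totals = {"X-rays": 2852, "CT scans": 1903, "MRIs": 151}
for modality, n in totals.items():
    train, dev, test = round(0.80 * n), round(0.10 * n), round(0.10 * n)
    print(f"{modality}: train ≈ {train}, development ≈ {dev}, test ≈ {test}")
# X-rays: train ≈ 2282, development ≈ 285, test ≈ 285
```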

9. How the Ground Truth for the Training Set Was Established

The document states, "ML models were developed with datasets from multiple sites... We trained the ML models with 80% of the datasets, developed with 10%, and tested with the remaining 10%." It also mentions that "the validation dataset...has never been used for the algorithm training or for tuning the algorithm, and leakage between development and validation data sets did not occur."

While the process for the training set's ground truth is not explicitly detailed in the same way as the external validation "labeled by a separate team," it is implicitly established through the "development" process. Typically, for ML models of this nature, ground truth for training data would also be established through manual annotation by qualified personnel (e.g., clinicians, trained annotators) following established protocols. The document's emphasis on data independence for external validation suggests that the development/training data was also accurately labeled for its purpose.
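
The quoted no-leakage requirement is commonly enforced by splitting at the patient level rather than the image level, so that no patient's studies land in more than one partition. A minimal sketch, assuming scikit-learn and a per-study patient identifier (both assumptions; the submission does not describe its tooling):

```python
# Leakage-free 80/10/10 split, grouped by patient; illustrative only.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Toy data: 100 hypothetical patients, 3 imaging studies each.
patient_ids = np.repeat(np.arange(100), 3)
studies = np.arange(len(patient_ids))  # stand-ins for the imaging studies

# 80% of patients go to training.
outer = GroupShuffleSplit(n_splits=1, train_size=0.80, random_state=0)
train_idx, rest_idx = next(outer.split(studies, groups=patient_ids))

# The remaining 20% is halved into development and test partitions.
inner = GroupShuffleSplit(n_splits=1, train_size=0.50, random_state=0)
dev_rel, test_rel = next(inner.split(rest_idx, groups=patient_ids[rest_idx]))
dev_idx, test_idx = rest_idx[dev_rel], rest_idx[test_rel]

# No patient may appear in more than one partition (no leakage).
parts = [set(patient_ids[i]) for i in (train_idx, dev_idx, test_idx)]
assert not (parts[0] & parts[1])
assert not (parts[0] & parts[2])
assert not (parts[1] & parts[2])
```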

§ 892.2050 Medical image management and processing system.

(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.

(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).