Search Results
Found 8 results
510(k) Data Aggregation
(120 days)
SubtleHD is an image processing software that can be used for image enhancement of MRI images of all body parts. It can be used for noise reduction and for increasing image sharpness.
SubtleHD is Software as a Medical Device (SaMD) consisting of a software algorithm that enhances images taken by MRI scanners. As it only processes images, the device has no user interface. It is intended to be used by radiologists and technologists in an imaging center, clinic, or hospital. The SubtleHD software accepts as input MR images acquired as part of standard-of-care and accelerated MRI exams. The outputs are the corresponding images with enhanced image quality. Original DICOM images are passed to the SubtleHD software as an input argument, and the enhanced images are saved in the designated location prescribed when running the SubtleHD software. The functionality of SubtleHD (noise reduction and sharpness enhancement) is identified from the DICOM series description and/or through configuration specified in configuration files and OS environment variables.
SubtleHD software implements an image enhancement algorithm using convolutional-network-based filtering. Original images are enhanced by passing through a cascade of filter banks, where thresholding and scaling operations are applied. A single neural network is trained for adaptive noise reduction and sharpness enhancement; the parameters within the neural network were obtained through an image-guided optimization process. Additional non-local-means denoising and unsharp-masking sharpening filters are applied to the deep-learning-processed image.
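The classical post-filters named above (non-local-means denoising and unsharp masking) are standard image-processing operations. As a rough illustration only (not SubtleHD's actual implementation, which is not disclosed), such a post-processing stage could look like the following scikit-image sketch; all parameter values are illustrative assumptions:

```python
# Hypothetical sketch of a classical post-filtering stage: non-local-means
# denoising followed by unsharp masking. Parameters are illustrative only,
# not SubtleHD's actual settings.
import numpy as np
from skimage.restoration import denoise_nl_means, estimate_sigma
from skimage.filters import unsharp_mask

def postprocess(dl_output: np.ndarray) -> np.ndarray:
    """Denoise and sharpen a 2D slice with intensities scaled to [0, 1]."""
    sigma = float(np.mean(estimate_sigma(dl_output)))       # noise estimate
    denoised = denoise_nl_means(dl_output, h=0.8 * sigma,   # filter strength
                                patch_size=5, patch_distance=6, fast_mode=True)
    # Unsharp masking: add back a scaled high-pass component of the image.
    return unsharp_mask(denoised, radius=1.0, amount=0.7)
```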
The software operates on DICOM files, enhances the images, and sends the enhanced images to any desired destination with an AE Title (e.g., PACS, MR device, workstation, and more). Enhanced images coexist with the original images.
The provided text describes the acceptance criteria and study proving the device, SubtleHD (1.x), meets these criteria.
Here's the breakdown of the requested information:
1. Table of Acceptance Criteria and Reported Device Performance
The provided text includes two distinct sets of acceptance criteria and performance results: one for "Performance Validation" and one for a "Reader Study."
Performance Validation Summary:
| Endpoint | Acceptance Criteria | SubtleHD Mode | Result | Conclusion |
|---|---|---|---|---|
| Denoising (SNR) - Primary | SNR shall improve by at least 40% in homogeneous ROI regions for at least 90% of the dataset. | Default | PASS | SubtleHD performs denoising of MRI images, in terms of improved SNR. |
| | SNR shall improve by at least 40% in homogeneous ROI regions for at least 95% of the dataset. | High Denoising | PASS | |
| Sharpness (Image Intensity Change) - Primary | Slope in a line ROI is increased for at least 90% of the dataset. | Default | PASS | SubtleHD sharpens MRI images, in terms of improved visibility of the edge at a tissue interface by an image intensity slope measure. |
| | Slope in a line ROI is increased for at least 95% of the dataset. | High Sharpening | PASS | |
| Sharpness (Image Intensity Change for Brains) - Secondary | Thickness, in terms of FWHM in a line ROI, is reduced for at least 90% of the dataset. | Default | PASS | SubtleHD sharpens MRI images, in terms of improved visibility of an anatomical structure by an image intensity FWHM measure. |
| | Thickness, in terms of FWHM in a line ROI, is reduced for at least 95% of the dataset. | High Sharpening | PASS | |
| Sharpness and Over-Smoothing (Gradient Entropy) - Primary | At least 90% of cases demonstrate a lower gradient entropy value after SubtleHD processing. | Default | PASS | SubtleHD does not produce over-smoothed images, in terms of improved gradient entropy. |
| | At least 95% of cases demonstrate a lower gradient entropy value after SubtleHD processing. | High Sharpening | PASS | |
| | There is a statistically significant improvement in gradient entropy when comparing the original and SubtleHD-enhanced images across the performance dataset, per a two-sided paired t-test. | Default and High Sharpening | PASS | |
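As a concrete reading of the primary denoising endpoint: SNR is computed in a homogeneous ROI as mean intensity over intensity standard deviation, the relative improvement is taken per case, and the pass condition is that at least 90% (or 95%) of cases improve by at least 40%. A minimal sketch under those assumptions (ROI handling and data loading are hypothetical):

```python
import numpy as np

def roi_snr(image: np.ndarray, roi) -> float:
    """SNR of a homogeneous ROI: mean intensity over intensity std-dev."""
    patch = image[roi]              # roi, e.g. (slice(y0, y1), slice(x0, x1))
    return float(patch.mean() / patch.std())

def denoising_endpoint_passes(pairs, rois, min_gain=0.40, min_fraction=0.90):
    """pairs: (original, enhanced) image tuples; rois: matching ROI slices."""
    gains = [roi_snr(enh, r) / roi_snr(orig, r) - 1.0
             for (orig, enh), r in zip(pairs, rois)]
    return float(np.mean([g >= min_gain for g in gains])) >= min_fraction
```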
Reader Study Summary:
| Endpoint | Endpoint Description | Acceptance Criteria | Result | Conclusion |
|---|---|---|---|---|
| Denoising (Primary) | Signal-to-Noise Ratio | Statistically significantly better with p-value < 0.05, or not statistically significantly different, in a Wilcoxon signed-rank test | PASS | SubtleHD performs denoising of MRI images, in terms of improved SNR. |
| | Overall Image Quality / Diagnostic Confidence | Statistically significantly better with p-value < 0.05, or not statistically significantly different, in a Wilcoxon signed-rank test | PASS | |
| Sharpness (Primary) | Visibility of Small Structures | Statistically significantly better with p-value < 0.05, or not statistically significantly different, in a Wilcoxon signed-rank test | PASS | SubtleHD performs sharpening and does not over-smooth images, in terms of improved visibility of small structures, for MRI images. |
| Artifacts (Secondary) | Artifact Introduction | Either a) SubtleHD-enhanced images do not contain artifacts that could impact diagnosis, or b) both input and SubtleHD-enhanced images are deemed to contain artifacts that could impact diagnosis. | PASS | SubtleHD does not introduce artifacts into MRI images. |
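The statistical criterion used throughout the reader study (statistically significantly better, or not significantly different, per a Wilcoxon signed-rank test on paired ratings) reduces to a paired nonparametric test. A minimal SciPy sketch with hypothetical placeholder Likert ratings:

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical paired Likert ratings (1-5) for the same cases.
scores_input    = np.array([3, 4, 2, 3, 4, 3, 2, 4, 3, 3])
scores_enhanced = np.array([5, 4, 3, 4, 5, 4, 3, 4, 4, 5])

# Two-sided test on paired differences; zero differences are dropped.
stat, p = wilcoxon(scores_enhanced, scores_input, zero_method="wilcox")
better = scores_enhanced.mean() > scores_input.mean()  # crude direction check
passes = (p < 0.05 and better) or (p >= 0.05)          # "better or not different"
print(f"W={stat:.1f}, p={p:.4f} -> {'PASS' if passes else 'FAIL'}")
```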
2. Sample sizes used for the test set and the data provenance
Standalone Image Quality Metric Testing:
- Paired Test Set (Unaligned SOC as Reference): 97 samples
- Paired Test Set (Unaligned SubtleHD-enhanced SOC as the Reference): 97 samples
- Aligned Test Set: 471 samples
Data Provenance for Standalone Image Quality Metric Testing:
- Countries of Origin: The aligned test set included data from both the US and OUS (Outside US). The paired test set's provenance isn't explicitly stated beyond "More than 50% of data is from sites in the United States."
- Retrospective or Prospective: The text states "retrospective clinical data" for the performance validation and reader study, and describes the data selection for the standalone metrics as using "a subset of our performance validation set," implying a retrospective nature for these as well.
Performance Validation & Reader Study Dataset Characteristics:
- Number of Image Series: 410 image series (205 input and 205 SubtleHD enhanced) for the Reader Study. The Performance Validation used the "SubtleHD performance validation test dataset," which is implied to be distinct from, but shares characteristics with, the reader study dataset.
- Retrospective or Prospective: Retrospective clinical data.
- Countries of Origin: The majority of performance data comes from sources in the United States (65%).
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
Performance Validation:
- Ground Truth Establishment: Three Regions of Interest (ROIs) were drawn on each image by a Subtle Medical employee with an MD and/or PhD in a clinically-relevant field.
- Quality Review: A board-certified radiologist reviewed the ROI for acceptability. The exact number of radiologists is not specified, only "a board-certified radiologist."
Reader Study:
- Readers: The study involved "board-certified radiologists." The exact number of radiologists involved in the reading is not specified. Readers were blind to the image processing method.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set
- Performance Validation: Not applicable in the traditional sense, as ROIs were drawn by one expert and reviewed for acceptability by another. It's not a consensus reading model for diagnostic outcomes.
- Reader Study: The text does not specify an adjudication method for the reader study. It states that "Both the input images and the SubtleHD enhanced images were ranked by board-certified radiologists," implying individual readings without explicit mention of multiple readers reaching consensus or a tie-breaking mechanism.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
- MRMC Study: A "Reader Study" was conducted, which involved multiple readers (board-certified radiologists) and multiple cases (410 image series). This qualifies as a multi-reader multi-case study.
- Comparative Effectiveness / Effect Size: The study assessed if SubtleHD-enhanced images statistically significantly improved perceived SNR, Overall Image Quality / Diagnostic Confidence, and Small Structure Visibility, and did not introduce artifacts. The results indicate a "PASS" for all these criteria based on a p-value < 0.05 from a Wilcoxon signed rank test, meaning there was a statistically significant improvement or no statistically significant difference for the positive criteria, and no artifact introduction. While it indicates an improvement in perceived image quality, the document does not report specific effect sizes (e.g., AUC uplift, specific improvement percentage in diagnostic accuracy) for how much human readers improve with AI assistance versus without. It focuses on the qualitative assessment of improvement in image characteristics and diagnostic confidence.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
Yes, a standalone performance evaluation was done through:
- Standalone Image Quality Metric Testing: This section reports quantitative metrics (L1 loss, SSIM, PSNR) for the algorithm's output compared to reference images. This is purely algorithm-based performance.
- Performance Validation: This also appears to be a standalone measurement of the algorithm's performance on a dataset, using quantitative metrics like SNR improvement, slope of image intensity change, FWHM reduction, and gradient entropy, without human readers directly influencing the primary outcome measurements.
Both of these sections describe the algorithm's objective performance without a human in the loop to modify or interact with the output for the purpose of the reported metrics.
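For reference, the quantitative metrics named in the standalone testing (L1 loss, SSIM, PSNR) are standard full-reference image-quality measures and can be computed as in this sketch (assuming images normalized to [0, 1]; the submission's exact normalization is not stated):

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def quality_metrics(output: np.ndarray, reference: np.ndarray) -> dict:
    """Full-reference metrics between an enhanced image and its reference."""
    return {
        "l1":   float(np.mean(np.abs(output - reference))),
        "ssim": float(structural_similarity(reference, output, data_range=1.0)),
        "psnr": float(peak_signal_noise_ratio(reference, output, data_range=1.0)),
    }
```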
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)
- Standalone Image Quality Metric Testing:
- For the Paired Test Set (Unaligned SOC as Reference), the "standard-of-care (SOC) images" were used as reference (ground truth). Also, "SubtleHD-enhanced SOC" was used as another reference.
- For the Aligned Test Set, "high-quality ground truth images were reconstructed using the vendor's commercially available deep learning pipeline and represent standard-of-care image quality."
- Performance Validation: The ground truth for quantitative metrics (SNR, slope, FWHM, gradient entropy) was derived from characteristics within the images themselves, with ROIs drawn by a clinically-qualified employee and reviewed by a board-certified radiologist for acceptability. This is a form of expert-defined quantitative ground truth.
- Reader Study: The ground truth for this study was the perception of board-certified radiologists using a Likert scale for SNR, Overall Image Quality / Diagnostic Confidence, Small Structure Visibility, and Imaging Artifacts. This can be considered a form of expert consensus/opinion-based ground truth or perceived improvement, although not explicitly stated as a consensus amongst multiple readers for a single case.
8. The sample size for the training set
The document explicitly states that the test sets for all evaluations (Standalone Image Quality Metric Testing, Performance Validation, and Reader Study) were "Not used for algorithm training" and that "data was selected from sources not included in the training dataset (36.50% of the dataset is from non-training sources)." However, the actual sample size of the training set is not provided in the given text.
9. How the ground truth for the training set was established
The document states that a "single neural network is trained for adaptive noise reduction and sharpness increase" and "The parameters within the neural network were obtained through an image-guided optimization process." It also implies "pre- and post-processing is applied to configure desired perceived image quality." However, the specific method for how ground truth was established for the training set is not detailed in the provided text. It is generally understood that for deep learning models like this, the "ground truth" for training purposes would be pairs of original (possibly noisy/blurry) images and corresponding "ideal" or "high-quality" images that the model learns to transform inputs into. The text indicates that some of the performance data came from training sources.
(202 days)
AiMIFY is an image processing software that can be used for image enhancement in MRI images. It can be used to increase the contrast-to-noise ratio (CNR), contrast enhancement percentage (CEP), and lesion-to-brain ratio (LBR) of enhancing tissue in brain MRI images acquired with a gadolinium-based contrast agent. It is intended to enhance MRI images acquired using standard approved dosage per the contrast agent's instructions for use.
The AiMIFY device is a software as a medical device consisting of a machine learning software algorithm that enhances images taken by MRI scanners. AiMIFY consists of a software algorithm that improves contrast-to-noise ratio (CNR), contrast enhancement (CEP), and lesion-to-brain ratio (LBR) of Gadolinium-Based Contrast Agent (GBCA) enhanced T1-weighted images while maintaining diagnostic performance, using deep learning technology. It is a post-processing software that does not directly interact with the MR scanner and does not have a graphical user interface. It is intended to be used by radiologists in an imaging center, clinic, or hospital. The AiMIFY software uses T1 pre and post-contrast MR images acquired as part of standard of care contrast-enhanced MRI exams as the software input. The outputs are the corresponding images with enhanced contrast presence. AiMIFY enhances DICOM images.
AiMIFY image processing software uses a convolutional-network-based algorithm to generate the enhanced-contrast AiMIFY images from pre-contrast and standard-dose post-contrast images. The image processing can be performed on MRI images with predefined or specific acquisition protocol settings as follows: gradient echo (pre- and post-contrast), 3D BRAVO (pre- and post-contrast), 3D MPRAGE (pre- and post-contrast), 2D T1 spin echo (pre- and post-contrast), T1 FLAIR/inversion recovery spin echo (pre- and post-contrast).
The AiMIFY image is created by AiMIFY and sent back to the picture archiving and communication system (PACS) or other DICOM node by the compatible MDDS for clinical review.
Because the software runs in the background, it has no user interface. It is intended to be used by radiologists in an imaging center, clinic, or hospital.
Note, depending on the functionality of the compatible MDDS, AiMIFY can be used within the facility's network or remotely. The AiMIFY device itself is not networked and therefore does not increase the cybersecurity risk of its users. Users are provided cybersecurity recommendations in labeling.
Here's an analysis of the acceptance criteria and the study proving the device meets those criteria, based on the provided text.
Device: AiMIFY (1.x)
Indications for Use: Image processing software for enhancement of MRI images (increase CNR, CEP, LBR of enhancing tissue in brain MRI images acquired with gadolinium-based contrast agent).
1. Acceptance Criteria and Reported Device Performance
Table of Acceptance Criteria and Reported Device Performance:
| Metric | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Quantitative Assessment | | |
| CNR (Contrast-to-Noise Ratio) Improvement | On average, improved by >= 50% after AiMIFY enhancement compared to traditionally acquired contrast images. | Achieved: 559.94% across all 95 cases; 831.70% for 57 lesion-only cases. Significantly higher than standard post-contrast images (Wilcoxon signed-rank test, p < 0.0001). |
| LBR (Lesion-to-Brain Ratio) Improvement | On average, improved by >= 50% after AiMIFY enhancement compared to traditionally acquired contrast images. (Inferred from the primary endpoint definition encompassing CNR, LBR, and CEP.) | Achieved: 62.07% across all 95 cases; 58.80% for 57 lesion-only cases. Significantly better than standard post-contrast images (Wilcoxon signed-rank test, p < 0.0001). |
| CEP (Contrast Enhancement Percentage) Improvement | On average, improved by >= 50% after AiMIFY enhancement compared to traditionally acquired contrast images. (Inferred from the primary endpoint definition encompassing CNR, LBR, and CEP.) | Achieved: 133.29% across all 95 cases; 101.80% for 57 lesion-only cases. Significantly better than standard post-contrast images (Wilcoxon signed-rank test, p < 0.0001). |
| Qualitative Assessment (Reader Study) | | |
| Perceived Visibility of Lesion Features (Lesion Contrast Enhancement, Border Delineation, Internal Morphology) | Statistically significantly better for AiMIFY-processed images per the Wilcoxon signed-rank test at p < 0.05. | Achieved: Significantly better than standard post-contrast (p < 0.0001) for all three features. |
| Perceived Image Quality and Artifact Presence and Impact on Clinical Diagnosis | NOT statistically significantly worse than standard post-contrast images per the Wilcoxon signed-rank test at p < 0.05. | Achieved: Shown not worse than standard post-contrast (p < 0.0001). Two of three readers rated Perceived Image Quality better than standard post-contrast (p < 0.0001). |
| Radiomics Analysis | | |
| CCC for Lesion Tissue (7 feature classes) | >= 0.65 | Achieved: Ranged from 0.68 to 0.89 for lesion tissue. |
| CCC for Parenchyma Tissue (7 feature classes) | >= 0.8 | Achieved: Ranged from 0.82 to 0.92 for parenchyma tissue. |
| SubtleMR Denoising Module Performance | | |
| Visibility of Small Structures | Average score difference between original and SubtleMR-enhanced images <= 0.5 Likert scale points. | Achieved: Average score difference was 0.05 points. |
| Perceived SNR, Image Quality, Artifacts | Average score difference between original and SubtleMR-enhanced images <= 0.5 Likert scale points. (Measured for Septum Pellucidum, Cranial Nerves, Cerebellar Folia.) | Achieved: SNR differences: 0.05 (Septum Pellucidum), 0.08 (Cranial Nerves), 0.07 (Cerebellar Folia). Image quality/diagnostic confidence differences: 0.11 (Septum Pellucidum), 0.04 (Cranial Nerves), -0.05 (Cerebellar Folia). Imaging artifact differences: 0.11 (Septum Pellucidum), 0.14 (Cranial Nerves), 0.05 (Cerebellar Folia). |
| SNR Improvement from SubtleMR | >= 5% (acceptance criterion established in the SubtleMR validation, K223623). | Achieved: Average SNR improvement was 14%. |
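The quantitative endpoints above follow common ROI-based definitions (the submission does not spell out its exact formulas, so the ones below are conventional assumptions): CNR as the lesion-parenchyma mean difference over noise standard deviation, LBR as the ratio of mean lesion to mean brain intensity, and CEP as the percent signal increase from pre- to post-contrast. A minimal sketch:

```python
import numpy as np

# Conventional ROI-based definitions; the submission's exact formulas are
# not given in the text, so these are assumptions for illustration.

def cnr(lesion: np.ndarray, parenchyma: np.ndarray, noise: np.ndarray) -> float:
    """Contrast-to-noise ratio between lesion and normal parenchyma ROIs."""
    return float((lesion.mean() - parenchyma.mean()) / noise.std())

def lbr(lesion: np.ndarray, brain: np.ndarray) -> float:
    """Lesion-to-brain ratio of mean ROI intensities."""
    return float(lesion.mean() / brain.mean())

def cep(post: np.ndarray, pre: np.ndarray) -> float:
    """Contrast enhancement percentage of a lesion ROI, post vs. pre contrast."""
    return float(100.0 * (post.mean() - pre.mean()) / pre.mean())

def pct_improvement(enhanced: float, standard: float) -> float:
    """Relative improvement compared against the >= 50% acceptance threshold."""
    return 100.0 * (enhanced - standard) / standard
```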
2. Sample Size Used for the Test Set and Data Provenance
- Test Set Sample Size: 95 T1 brain cases.
- Of these, 57 cases had identified lesions and were used for lesion-specific analyses (e.g., LBR, lesion-specific CNR).
- Data Provenance: Retrospective, acquired from clinical sites or hospitals.
- Country of Origin: USA (California, New York, Nationwide), Beijing, China.
- Acquisition details: Variety of T1 input protocols (BRAVO, MPRAGE+, FLAIR, FSE), orientations (axial, sagittal, coronal), acquisition types (2D, 3D), field strengths (0.3T, 1.5T, 3.0T), and MR scanner vendors (GE, Philips, Siemens, Hitachi).
- Patient Demographics: Age (7 to 86, relatively even distribution), Sex (relatively even distribution of females and males), Pathologies (Cerebritis, Glioma, Meningioma, Metastases, Multiple Sclerosis, Neuritis, Inflammation, Other tumor related, other abnormalities).
3. Number of Experts Used to Establish Ground Truth for the Test Set and Their Qualifications
- Quantitative Assessment (ROI drawing): One board-certified radiologist.
- Qualitative Assessment (Reader Study): Three board-certified neuro-radiologists.
- Specific years of experience are not mentioned, but "board-certified" implies a certain level of qualification and experience within their specialty.
4. Adjudication Method for the Test Set
- Quantitative Assessment: ROIs were drawn by a single board-certified radiologist. No explicit mention of adjudication or multiple expert consensus for the initial ROI placement. The statistical analysis (Wilcoxon signed-rank test) focuses on the comparison of metrics derived from these ROIs.
- Qualitative Assessment (Reader Study): The readers individually rated images on Likert scales. The results are presented as aggregated statistics (e.g., "significantly better/not worse by p<0.0001"). There is no mention of an adjudication process (e.g., 2+1, 3+1) to arrive at a single consensus ground truth or final rating for each case from the multiple radiologists.
- For exploratory endpoints, such as false lesion analysis, it's mentioned that "100% of cases received scores from all readers that the Standard-of-Care image was sufficient to identify the false lesion(s)," indicating agreement, but this is not a formal adjudication process for establishing ground truth from disagreements.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
- Yes, a MRMC study was performed. The "Qualitative Assessment (Reader Study)" involved three board-certified neuro-radiologists evaluating cases.
- Effect Size of Human Reader Improvement with AI vs. Without AI Assistance:
- The study design presented is a comparison of standard post-contrast images vs. AiMIFY-enhanced images, evaluated by human readers. It assesses if AiMIFY improves perceived image quality and lesion features.
- The results show improvement in features like "Lesion Contrast Enhancement, Border Delineation, and Internal Morphology" (p < 0.0001 compared to standard post-contrast). Perceived Image Quality was "not worse" and even "better" for two of three readers (p < 0.0001).
- This study directly demonstrates the improvement in image characteristics for human readers when viewing AiMIFY-enhanced images. It does not, however, describe a comparative effectiveness study showing how much human readers' diagnostic accuracy or confidence improves when assisted by AI vs. not assisted. The study focuses on the image enhancement characteristics as perceived by readers rather than a change in diagnostic outcome or reader performance statistics.
6. If a Standalone (Algorithm Only Without Human-in-the-Loop Performance) Was Done
- Yes, a standalone assessment was performed. The "Quantitative Assessment (Bench Test)" evaluated the algorithm's performance directly by comparing calculated metrics (CNR, LBR, CEP) from AiMIFY-processed images against standard post-contrast images. This assessment did not involve human readers' diagnostic interpretation of the images but rather quantifiable improvements generated by the algorithm itself.
7. The Type of Ground Truth Used
- Quantitative Assessment: The ground truth for calculating CNR, LBR, and CEP was based on ROIs drawn by a single board-certified radiologist, identifying enhancing lesions and brain parenchyma. This can be considered a form of expert-defined ground truth based on anatomical and radiological characteristics. The lesions themselves were "identified" in the test datasets, suggesting a pre-existing clinical determination of their presence.
- Qualitative Assessment: The ground truth for "lesion presence" in the Qualitative Assessment was presumably based on cases identified to "have lesions" in the initial test dataset (57 out of 95 cases). The evaluation itself was subjective (Likert scale ratings of perceived visibility, quality, etc.), with readers comparing the standard and AiMIFY images. This relies on the subjective judgment of multiple experts rather than an independent "true" ground truth like pathology.
8. The Sample Size for the Training Set
- The document does not explicitly state the sample size of the training set.
- It mentions that the training and validation datasets were compared for CNR increase, and that the training data compared low-dose to regular-dose post-contrast images, but provides no numerical size for the training set itself.
9. How the Ground Truth for the Training Set Was Established
- The document does not explicitly describe how the ground truth for the training set was established.
- It implies that the training data involved "low-dose to regular-dose post-contrast images," suggesting that perhaps the ground truth for training the enhancement model was the "regular-dose" image, or that the model was trained to transform low-signal images into higher-signal enhanced images. However, specifics on how the "true" enhanced state or lesion characteristics within the training data were determined are not provided.
(62 days)
SubtleSYNTH is a software as a medical device consisting of a software machine learning algorithm that synthesizes a SynthSTIR contrast image of a case from T1-weighted and T2-weighted spine MR images.
The SubtleSYNTH device is a software as a medical device consisting of a machine learning software algorithm that synthesizes a SynthSTIR contrast image of a case from T1-weighted and T2-weighted MR images. It is a post-processing software that does not directly interact with the MR scanner. Once an MR scan is acquired, a technologist sends the study from the scanner to a compatible medical device data system (MDDS) via the DICOM protocol. The compatible MDDS then makes the images available to SubtleSYNTH for processing.
SubtleSYNTH uses a convolutional network-based algorithm to synthesize an image with desired contrast weighting from other, previously obtained sequences such as T1- and T2-weighted images. The image processing can be performed on MRI images with predefined or specific acquisition protocol settings.
The SynthSTIR image is created by SubtleSYNTH and sent back to the picture archiving and communication system (PACS) or other DICOM node by the compatible MDDS for clinical review.
Because the software runs in the background, it has no user interface. It is intended to be used by radiologists in an imaging center, clinic, or hospital.
Note, depending on the functionality of the compatible MDDS, SubtleSYNTH can be used within the facility's network or remotely. The SubtleSYNTH device itself is not networked and therefore does not increase the cybersecurity risk of its users. Users are provided cybersecurity recommendations in labeling.
Acceptance Criteria and Device Performance Study for SubtleSYNTH (1.x)
This summary outlines the acceptance criteria and the study demonstrating that the SubtleSYNTH (1.x) device meets these criteria, based on the provided 510(k) summary.
1. Table of Acceptance Criteria and Reported Device Performance
| Acceptance Criteria Category | Specific Acceptance Criteria | Reported Device Performance | Study |
|---|---|---|---|
| Quantitative Image Fidelity | Root Mean Square Error (RMSE) between reference STIR and SynthSTIR < a defined threshold. | RMSE = 0.39 (normalized intensity units) | Bench Study |
| | Cosine similarity matrix elements between reference STIR and SynthSTIR > a defined threshold. | All elements > 0.9 | Bench Study |
| | Bland-Altman analysis for bias (mean intensity difference between SynthSTIR and acquired STIR) for key tissues (Bone, Disc, CSF, Spinal Cord, Fat) showing no significant bias. | Bias samples randomly distributed near the zero lines, with no trend; 99% CI analysis implies no significant bias. | Bench Study |
| Interchangeability (Primary Endpoint) | Interchangeability between acquired STIR and SynthSTIR images not significantly greater than 10%. | Study A: Interchangeability = 2.12% (95% CI [-1.31%, 5.88%]). Study B: Interchangeability = 0.63% (95% CI [-4.19%, 5.9%]). | Interchangeability Studies (A & B) |
| Interchangeability Sub-analyses | Primary endpoint met for all Primary Categories (Degenerative, Infection, Trauma, Cord Lesion, Non-cord Lesion, Vascular, Hemorrhage, Normal) and all scanner vendors. | Primary endpoint met for all Primary Categories and all scanner vendors in both Study A and Study B. | Interchangeability Studies (A & B) |
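The image-fidelity metrics in the bench study are simple vectorized computations; a minimal sketch (the submission's intensity normalization and any ROI binning are not described, so none are assumed here):

```python
import numpy as np

def rmse(synth: np.ndarray, reference: np.ndarray) -> float:
    """Root mean square error between SynthSTIR and the acquired STIR."""
    return float(np.sqrt(np.mean((synth - reference) ** 2)))

def cosine_similarity(synth: np.ndarray, reference: np.ndarray) -> float:
    """Cosine similarity between flattened intensity vectors."""
    a, b = synth.ravel().astype(float), reference.ravel().astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```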
2. Sample Size for Test Set and Data Provenance
- Quantitative Bench Study Test Set:
- Sample Size: 80 acquired studies.
- Data Provenance: Retrospective, sourced from clinical sites/hospitals in California, USA, and New York, USA.
- MRI Scanners: GE, Fonar, Philips, Siemens, Toshiba. Field strengths: 0.3T, 0.6T, 1.0T, 1.5T (42 series), 3T (35 series).
- Patient Demographics (of the 80 studies): Ages 16-89 years, 40 females, 36 males, 4 unknown sex. Ethnicity unknown due to anonymization.
- Clinical Categories: 8 cord lesions, 20 degenerative diseases, 10 infections, 15 non-cord lesions, 17 trauma, 10 normal series.
- Interchangeability Studies (A & B) Test Set:
- Sample Size: 104 cases (common to both studies).
- Data Provenance: Selected from 269 gathered cases, sourced from populations in California, USA, and New York, USA. Retrospective.
- MRI Scanners: GE, Hitachi, Philips, Siemens, Toshiba. Field strengths: 0.3T (6 cases), 1.5T (56 cases), 3T (42 cases).
- Patient Demographics (of the 104 cases): Ages 1-89 years, 51 females, 53 males. Ethnicity unknown due to anonymization.
- Clinical Categories: 12 cord lesions, 12 degenerative disease, 12 hemorrhage, 12 infection, 12 non-cord lesions, 12 normal, 12 vascular, 20 trauma.
3. Number of Experts for Ground Truth and Qualifications
- Quantitative Bench Study:
- Number of Experts: Not explicitly stated for ROI labeling. However, "an in-house radiologist" assigned clinical categories to collected images.
- Qualifications: "in-house radiologist" (specific experience level not provided in the document).
- Interchangeability Studies (A & B):
- Number of Experts: Not explicitly stated for ground truth establishment. "an in-house radiologist" assigned clinical categories to the 104 cases for case selection.
- Qualifications: "in-house radiologist" (specific experience level not provided). The studies involved human readers for image interpretation, but their classifications were compared against each other and the "acquired STIR" and "SynthSTIR" images, not against a definitive expert-established ground truth of pathology for reader performance evaluation.
4. Adjudication Method for the Test Set
The document does not explicitly describe an adjudication method (like 2+1, 3+1) for establishing a definitive ground truth for the test set before the readers evaluated the images in the interchangeability studies.
In the interchangeability studies, readers themselves were making classifications, and their consistency across image types (acquired STIR vs. SynthSTIR) was evaluated. The primary endpoint focused on the interchangeability (disagreement rate) between interpretations derived from SynthSTIR versus acquired STIR, rather than comparing reader interpretations to a gold standard ground truth of disease presence/absence.
For the assignment of primary and secondary categories by readers, they were provided recommendations on how to prioritize conditions when more than one was present. There is no mention of an adjudication process if readers disagreed on these classifications when making their assessments.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
No explicit MRMC comparative effectiveness study (comparing human readers with AI assistance vs. without AI assistance) was described.
The "interchangeability studies" (Study A and Study B) involved multiple readers evaluating two different image modalities (SynthSTIR vs. acquired STIR). These studies assessed if readers could make similar classifications using the synthesized images as they could with the acquired images. This is a form of reader study, but it is not framed as an AI-assisted vs. unassisted reader study to determine an effect size of improvement by AI assistance. Instead, it aims to demonstrate that the AI-generated images can be used interchangeably with gold-standard images for diagnostic classification.
6. Standalone Performance Study
Yes, a standalone performance study was done.
The quantitative bench testing section describes an "algorithm only" or "standalone" performance evaluation of SubtleSYNTH. This involved:
- Comparing the SynthSTIR output directly against the acquired STIR for RMSE, cosine similarity, and Bland-Altman analysis.
- This evaluation focused on the intrinsic image quality and fidelity of the synthesized images produced by the algorithm.
7. Type of Ground Truth Used
- Quantitative Bench Study: The "ground truth" for the quantitative assessment was the acquired STIR image from the MRI scanner. The SynthSTIR image was compared against this acquired STIR image (which is considered the clinical standard for STIR imaging).
- Interchangeability Studies (A & B): The "ground truth" for evaluating interchangeability was not a definitive disease diagnosis (e.g., pathology report). Instead, the studies assessed the agreement or interchangeability of classifications made by radiologists when viewing SynthSTIR images compared to when viewing acquired STIR images. An in-house radiologist assigned initial clinical categories to cases for selection, but this wasn't detailed as definitive ground truth for individual lesion diagnosis.
8. Sample Size for the Training Set
- Training Set Sample Size: 424 cases.
9. How the Ground Truth for the Training Set Was Established
The document states that the training dataset consists of "Sag T1w, Sag T2w, and Sag STIR images." It implies that the Sag STIR images serve as the ground truth or target for the SubtleSYNTH algorithm to learn how to synthesize a SynthSTIR image from Sag T1w and Sag T2w inputs.
The training data was collected from a variety of sources:
- MRI Scanners: GE, Hitachi, Philips, and Siemens. Field strengths: 0.7T, 1.2T, 1T, 1.5T (254 cases), 2T, 3T (160 cases).
- Patient Demographics: Ages 14-89 years, 193 females, 176 males, 55 unknown sex. Ethnicity unknown.
- Provenance: Sourced from populations throughout the USA.
The process of "establishing ground truth" for the training set in this context largely means having a reliably acquired and labeled set of "gold standard" STIR images that the model learns to replicate or generate from other input sequences. There is no mention of expert labeling or pathology for individual findings within the training set, rather the acquired STIR image itself acts as the reference truth for the synthesis task.
(96 days)
SubtlePET is an image processing software intended for use by radiologists and nuclear medicine physicians for transfer, storage, and noise reduction of fluorodeoxyglucose (FDG), amyloid, 18F-DOPA, 18F-DCFPyL, Ga-68 Dotatate, and Ga-68 PSMA radiotracer PET images.
The SubtlePET image processing software reduces noise to increase image quality using a deep neural network-based algorithm.
The software employs a convolutional network-based method in a pixel's neighborhood to generate the value for each pixel. Using a residual learning approach, the software predicts the noise components and structural components. The software separates these components, which enhances the structure while simultaneously reducing the noise.
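Residual learning for denoising is commonly implemented DnCNN-style: the network predicts the noise map, and the denoised image is the input minus that prediction. A minimal PyTorch sketch of the general idea (SubtlePET's actual architecture is not disclosed; depth and width here are arbitrary):

```python
import torch
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    """DnCNN-style sketch: predict the noise component, then subtract it.
    Illustrative only; not SubtlePET's actual network."""

    def __init__(self, channels: int = 1, features: int = 64, depth: int = 5):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1),
                       nn.BatchNorm2d(features), nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(features, channels, 3, padding=1))
        self.noise_net = nn.Sequential(*layers)

    def forward(self, noisy: torch.Tensor) -> torch.Tensor:
        predicted_noise = self.noise_net(noisy)  # separate the noise component
        return noisy - predicted_noise           # enhanced (denoised) image
```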
The workflow of the product can be easily adapted to existing radiology departmental workflow. The product acts as a DICOM node that receives DICOM 3.0 digital medical image data from the modality or another DICOM source, processes the data and then forwards the enhanced study to the selected destination. This destination can be any DICOM node, typically either the PACS system or a specific workstation.
Here's a breakdown of the acceptance criteria and study information for SubtlePET, based on the provided document:
Acceptance Criteria and Device Performance
| Acceptance Criteria Objective | Reported Device Performance |
|---|---|
| Noise reduction to increase image quality in PET scans. | Significant average increase in quantitative metrics for all cases, demonstrating that the software reduced noise in PET scans. |
Study Information
2. Sample size used for the test set and data provenance:
- The document states that the noise reduction bench test utilized "representative cases of human data." However, it does not specify the exact sample size used for this test set.
- The data provenance is described as "human data already gathered under the auspices of IRB-approved clinical protocols." This indicates the data is retrospective and was collected according to ethical guidelines. The country of origin is not explicitly stated.
3. Number of experts used to establish the ground truth for the test set and qualifications of those experts:
- The document does not provide information regarding the number of experts used or their qualifications for establishing ground truth specifically for the test set.
4. Adjudication method for the test set:
- The document does not specify an adjudication method used for the test set.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and the effect size of how much human readers improve with AI vs without AI assistance:
- The document does not mention a multi-reader multi-case (MRMC) comparative effectiveness study or any effect size related to human reader improvement with/without AI assistance. The performance data focuses on quantitative metrics of noise reduction.
6. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:
- Yes, a standalone performance assessment was conducted. The "Noise reduction bench test utilizing representative cases of human data" and the reported "significant average increase in quantitative metrics" describe the algorithm's performance independent of a human reader in a diagnostic workflow.
7. The type of ground truth used:
- The document implies a "reference standard" or "gold standard" for noise reduction based on the quantitative metrics. However, it does not explicitly state what this ground truth was (e.g., a "true" noise-free image, or a statistically derived reference). It focuses on the algorithm's ability to reduce noise relative to the input image, rather than diagnosing a condition against a pathology report.
8. The sample size for the training set:
- The document does not specify the sample size used for the training set of the deep neural network.
9. How the ground truth for the training set was established:
- The document does not explicitly describe how ground truth was established for the training set. It mentions the software uses a "deep neural network-based algorithm" that employs a "convolutional network-based method" and a "residual learning approach" to predict noise and structural components. This suggests the training would involve pairs of noisy and "cleaner" or target images, but the exact method for generating or establishing the "cleaner" ground truth is not detailed.
(122 days)
SubtleMR is an image processing software that can be used for image enhancement in MRI images. It can be used to reduce image noise for head, spine, pelvis, prostate, breast, and musculoskeletal MRI, or to increase image sharpness for head MRI.
SubtleMR is Software as a Medical Device (SaMD) consisting of a software algorithm that enhances images taken by MRI scanners. As it only processes images for the end user, the device has no user interface. It is intended to be used by radiologists in an imaging center, clinic, or hospital. The software can be used with MR images acquired as part of MRI exams on 1.2 Tesla, 1.5 Tesla, or 3 Tesla scanners. The device's inputs are standard of care MRI images. The outputs are images with enhanced image quality.
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
| Performance Test | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Noise Reduction | (i) Signal-to-noise ratio (SNR) of a selected region of interest (ROI) in each test dataset is on average improved by greater than or equal to 5% after SubtleMR enhancement compared to the original images. (ii) The difference in visibility of small structures in the test datasets before and after SubtleMR is on average less than or equal to 0.5 Likert scale points (implying minimal visual difference in small structures). | This test passed. |
| Sharpness Enhancement | The thickness of anatomic structure and the sharpness of structure boundaries are improved after SubtleMR enhancement in at least 90% of the test datasets. | This test passed. |
2. Sample Size Used for the Test Set and Data Provenance
The document states that the study "utilized retrospective clinical data." However, it does not explicitly state the sample size for the test set (number of images or patients) or the country of origin of the data.
3. Number of Experts Used and Qualifications of Experts
The document does not explicitly state the number of experts used or their specific qualifications (e.g., "radiologist with 10 years of experience"). It mentions "visibility of small structures" and "thickness of anatomic structure and the sharpness of structure boundaries" were evaluated, implying expert review, but the details are missing.
4. Adjudication Method for the Test Set
The document does not describe any specific adjudication method (e.g., 2+1, 3+1) for establishing the ground truth or evaluating the image quality metrics. It simply states that the tests "passed."
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
The document does not describe a multi-reader multi-case (MRMC) comparative effectiveness study involving human readers with and without AI assistance. The performance tests described focus on objective metrics (SNR) and subjective evaluation of image quality changes by the device, not on reader performance improvement.
6. Standalone (Algorithm Only) Performance
Yes, the performance data presented appears to be a standalone (algorithm only) performance evaluation. The metrics (SNR improvement, visibility of small structures, sharpness of structure boundaries) are directly related to the algorithm's output on images rather than evaluating human reader performance with or without the algorithm.
7. Type of Ground Truth Used
The ground truth used appears to be a combination:
- Objective Measurement: For noise reduction, the "signal-to-noise ratio (SNR) of a selected region of interest (ROI)" was objectively measured.
- Expert Consensus/Subjective Evaluation: For "visibility of small structures" and "thickness of anatomic structure and the sharpness of structure boundaries," a subjective evaluation was conducted using a Likert scale for noise reduction, and a percentage of datasets showing improvement for sharpness enhancement. While not explicitly stated as "expert consensus," these evaluations would typically require trained medical professionals (e.g., radiologists) to perform.
8. Sample Size for the Training Set
The document does not provide the sample size for the training set. It mentions the algorithm uses a "convolutional network-based algorithm" and that "parameters of the filters were obtained through an image-guided optimization process," implying a training phase, but the size is not specified.
9. How the Ground Truth for the Training Set Was Established
The document does not explicitly state how the ground truth for the training set was established. It mentions "image-guided optimization process" to obtain the parameters of the filters, which implies that the training data had some form of "ground truth" to guide the optimization, but the nature of this ground truth (e.g., perfectly noise-free images, perfectly sharp images) and how it was derived is not detailed.
(84 days)
SubtleMR is an image processing software that can be used for image enhancement in MRI images. It can be used to reduce image noise for head, spine, neck and knee MRI, or increase image sharpness for non-contrast enhanced head MRI.
SubtleMR is Software as a Medical Device (SaMD) consisting of a software algorithm that enhances images taken by MRI scanners. As it only processes images for the end user, the device has no user interface. It is intended to be used by radiologists in an imaging center, clinic, or hospital. The software can be used with MR images acquired as part of MRI exams on 1.2 Tesla, 1.5 Tesla or 3 Tesla scanners. The device's inputs are standard of care MRI images. The outputs are images with enhanced image quality.
Here's a breakdown of the acceptance criteria and study details for SubtleMR, based on the provided FDA 510(k) summary:
1. Acceptance Criteria and Reported Device Performance
The acceptance criteria are divided into two main performance tests: noise reduction and sharpness increase.
| Performance Metric | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Noise Reduction Test | | |
| Signal-to-Noise Ratio (SNR) Improvement | SNR of a selected Region of Interest (ROI) in each test dataset is on average improved by ≥ 5% after SubtleMR enhancement compared to the original images. | The study passed this criterion. (Specific average improvement percentage is not detailed in the provided text, just that it passed). |
| Visibility of Small Structures | The visibility of small structures in the test datasets before and after SubtleMR is on average ≤ 0.5 Likert scale points (implying minimal or no degradation, or slight improvement in perception). | The study passed this criterion. (Specific average Likert scale change is not detailed in the provided text, just that it passed). |
| Sharpness Increase Test | | |
| Anatomical Structure Thickness & Boundary Sharpness Improvement | The thickness of anatomic structure and the sharpness of structure boundaries are improved after SubtleMR enhancement in at least 90% of the test datasets. | The study passed this criterion. (Specific percentage of datasets improved is not detailed, just that it passed and met the "at least 90%" threshold). |
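The two sharpness measures used here and in related submissions (edge intensity slope and structure thickness via FWHM) can both be computed from a 1D intensity profile along a line ROI. A minimal sketch (profile extraction and structure selection are assumed to happen upstream):

```python
import numpy as np

def max_edge_slope(profile: np.ndarray) -> float:
    """Steepest intensity change along a line ROI crossing a tissue interface."""
    return float(np.max(np.abs(np.diff(profile))))

def fwhm(profile: np.ndarray) -> float:
    """Full width at half maximum (in samples) of a peak-shaped structure,
    with linear interpolation of the half-maximum crossings."""
    base = float(profile.min())
    half = base + (float(profile.max()) - base) / 2.0
    above = np.where(profile >= half)[0]
    left, right = int(above[0]), int(above[-1])
    l = (left - (profile[left] - half) / (profile[left] - profile[left - 1])
         if left > 0 else float(left))
    r = (right + (profile[right] - half) / (profile[right] - profile[right + 1])
         if right < len(profile) - 1 else float(right))
    return float(r - l)
```

A sharpened image should show a larger maximum slope and a smaller FWHM for the same structure.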
2. Sample Size Used for the Test Set and Data Provenance
The exact sample size for the test set is not explicitly stated in the provided document. It refers to "each test dataset" for the noise reduction test and "at least 90% of the test datasets" for the sharpness increase test, indicating multiple datasets were used.
The data provenance is stated as retrospective clinical data. The country of origin is not specified.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications of Experts
The document does not specify the number of experts used or their qualifications for establishing the ground truth for the test set.
4. Adjudication Method for the Test Set
The document does not specify an adjudication method (e.g., 2+1, 3+1) for the test set. The evaluation seems to have been based on quantitative metrics (SNR) and a Likert scale assessment, but the process of aggregation or reconciliation if multiple readers were involved is not described.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
The document does not mention a multi-reader multi-case (MRMC) comparative effectiveness study to assess how much human readers improve with AI vs. without AI assistance. The performance tests described focus on quantitative image quality metrics (SNR, sharpness) and a perceptual assessment of small structures, not a reader study of diagnostic accuracy or efficiency.
6. If a Standalone (Algorithm Only Without Human-in-the-Loop Performance) Was Done
Yes, the described performance tests appear to be standalone (algorithm only) evaluations. The metrics (SNR, Likert scale for structure visibility, and sharpness/thickness improvement percentages) directly assess the output of the algorithm on the images, rather than measuring reader performance with and without the algorithm. The device itself is described as having "no user interface," further suggesting a standalone processing function.
7. The Type of Ground Truth Used
The ground truth for the noise reduction test appears to be derived from a quantitative measurement (SNR) and a perceptual assessment (Likert scale for small structures). For the sharpness increase test, it was based on assessing the improvement in thickness of anatomic structures and sharpness of structure boundaries. These are essentially expert-defined metrics or assessments applied to the processed images, rather than external pathology or outcomes data.
8. The Sample Size for the Training Set
The document does not provide the sample size for the training set. It mentions that the algorithm uses a "convolutional network-based algorithm" whose "parameters... were obtained through an image-guided optimization process," implying a training phase, but the details of the training data are not included in this summary.
9. How the Ground Truth for the Training Set Was Established
The document does not explain how the ground truth for the training set was established. It only states that the "parameters of the filters were obtained through an image-guided optimization process," which is vague regarding the ground truth data used for this optimization. For image enhancement tasks, ground truth often involves pairs of original and "ideal" or "target" enhanced images, or noise-free versions of images, but this is not detailed here.
(94 days)
SubtlePET is an image processing software intended for use by radiologists and nuclear medicine physicians for transfer, storage, and noise reduction of fluorodeoxyglucose (FDG) and amyloid PET images (including PET/CT and PET/MRI).
The SubtlePET image processing software reduces noise to increase image quality using a deep neural network-based algorithm.
The software employs a convolutional neural network-based method in a pixel's neighborhood to generate the value for each pixel. Using a residual learning approach, the software predicts the noise components and structural components. The software separates these components, which enhances the structure while simultaneously reducing the noise.
The workflow of the product can be easily adapted to existing radiology departmental workflow. The product acts as a DICOM node that receives DICOM 3.0 digital medical image data from the modality or another DICOM source, processes the data and then forwards the enhanced study to the selected destination. This destination can be any DICOM node, typically either the PACS system or a specific workstation.
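The store-and-forward DICOM workflow described here is a standard pattern; a minimal sketch with pynetdicom is shown below. The denoise() stub and the destination address are hypothetical placeholders, not SubtlePET's API:

```python
from pynetdicom import AE, evt, AllStoragePresentationContexts

def denoise(pixels):
    """Placeholder for the neural-network denoiser (identity here)."""
    return pixels

def forward(ds, host="pacs.example.org", port=104, ae_title="PACS"):
    """Send the processed dataset to the destination DICOM node."""
    ae = AE()
    ae.requested_contexts = AllStoragePresentationContexts
    assoc = ae.associate(host, port, ae_title=ae_title)
    if assoc.is_established:
        assoc.send_c_store(ds)
        assoc.release()

def handle_store(event):
    """Receive a DICOM object, enhance its pixels, and forward it."""
    ds = event.dataset
    ds.file_meta = event.file_meta
    ds.PixelData = denoise(ds.pixel_array).tobytes()
    forward(ds)
    return 0x0000  # C-STORE success status

ae = AE(ae_title="DENOISER")
ae.supported_contexts = AllStoragePresentationContexts
ae.start_server(("0.0.0.0", 11112), evt_handlers=[(evt.EVT_C_STORE, handle_store)])
```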
Here's a breakdown of the acceptance criteria and study information for SubtlePET based on the provided FDA 510(k) summary:
1. Table of Acceptance Criteria and Reported Device Performance
The document doesn't explicitly state quantitative acceptance criteria or a direct comparison to a specified performance target in a table format. However, it implicitly states the objective is to demonstrate noise reduction and substantial equivalence.
Based on the provided text, the primary stated performance outcome is:
| Acceptance Criteria (Implicit) | Reported Device Performance |
|---|---|
| Noise Reduction: Improve image quality by reducing noise in PET scans. | "The study showed a significant average increase in quantitative metrics for all cases demonstrating that the software reduced noise in PET scans." |
| Substantial Equivalence: Demonstrate safety and effectiveness comparable to the predicate device. | "Based upon the results of this testing, it was determined the SubtlePET performance was substantially equivalent to the predicate device." |
2. Sample Size and Data Provenance for the Test Set
- Sample Size for Test Set: The document mentions "representative cases of human data." It does not specify the exact number of cases or scans used in the noise reduction bench test.
- Data Provenance: "human data already gathered under the auspices of IRB-approved clinical protocols." This implies the data were retrospective and obtained from human subjects under ethical review. The country of origin is not specified.
3. Number of Experts and Qualifications for Ground Truth
The document does not specify the number of experts used to establish ground truth or their qualifications for the test set.
4. Adjudication Method for the Test Set
The document does not specify any adjudication method (e.g., 2+1, 3+1, none) for the test set. The noise reduction bench test appears to be based on quantitative metrics.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
The document does not indicate that a multi-reader multi-case (MRMC) comparative effectiveness study was done. There is no mention of human readers evaluating images with and without AI assistance, nor any effect size reported.
6. Standalone (Algorithm Only) Performance
Yes, a standalone (algorithm only) performance assessment was done. The "Noise reduction bench test" evaluating "quantitative metrics" is an example of an algorithm-only performance assessment, as it focuses on the software's ability to reduce noise based on these metrics, independent of human interpretation.
7. Type of Ground Truth Used
The type of ground truth used for the noise reduction bench test appears to be quantitative metrics for noise reduction, rather than expert consensus, pathology, or outcomes data. The document states "significant average increase in quantitative metrics."
8. Sample Size for the Training Set
The document does not specify the sample size for the training set. It only describes the algorithm's methodology (deep neural network, convolutional neural network) which implies a training phase, but no details about the data used for training.
9. How Ground Truth for the Training Set Was Established
The document does not specify how ground truth for the training set was established. Given the description of the algorithm (predicting noise and structural components), the ground truth for training would likely involve pairs of noisy and "clean" or reference images, or labels indicating noise characteristics, but this is not detailed in the provided text.