510(k) Data Aggregation
Neurophet AQUA AD Plus is intended for automatic labeling, visualization, and volumetric quantification of segmentable brain structures and lesions, as well as SUVR quantification from a set of MR and PET images. Volumetric measurements may be compared to reference percentile data.
Neurophet AQUA AD Plus is a software device intended for the automatic labeling of brain structures, visualization, and volumetric quantification of segmented brain regions and lesions, as well as standardized uptake value ratio (SUVR) quantification using MR and PET images. The volumetric outcomes are compared to normative reference data to support the evaluation of neurodegeneration and cognitive impairment.
The device is designed to assist physicians in clinical evaluation by streamlining the clinical workflow from patient registration through image analysis, analysis result archiving, and report generation using software-based functionalities. The device provides percentile-based results by comparing an individual's imaging-derived quantitative analysis results to reference populations. Percentile-based results are provided for reference only and are not intended to serve as a standalone basis for diagnostic decision-making. Clinical interpretation must be performed by qualified healthcare professionals.
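Since the quantitative outputs above are reported as percentiles against a reference population, a minimal Python sketch of that kind of normative lookup may be helpful. It assumes a Gaussian reference model, and the mean/SD values are made-up placeholders, not the device's actual normative method or data:

```python
# Minimal sketch of a normative percentile lookup, assuming a Gaussian
# reference model. The reference mean/SD below are made-up placeholders,
# not values from the device or the clearance letter.
from scipy.stats import norm

def normative_percentile(volume_ml: float, ref_mean: float, ref_sd: float) -> float:
    """Percentile of a measured volume within a Gaussian reference population."""
    z = (volume_ml - ref_mean) / ref_sd
    return 100.0 * norm.cdf(z)

# Example: a hippocampal volume of 3.1 ml against a hypothetical
# age/sex-matched reference (mean 3.5 ml, SD 0.4 ml) -> ~15.9th percentile.
print(normative_percentile(3.1, ref_mean=3.5, ref_sd=0.4))
```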
Here's a breakdown of the acceptance criteria and study details for the Neurophet AQUA AD Plus, based on the provided FDA 510(k) Clearance Letter:
Acceptance Criteria and Device Performance for Neurophet AQUA AD Plus
The Neurophet AQUA AD Plus employs multiple AI modules for automated segmentation and quantitative analysis of brain structures and lesions using MR and PET images. The device's performance was validated against predefined acceptance criteria for each module.
1. Table of Acceptance Criteria and Reported Device Performance
| AI Module | Performance Metric | Acceptance Criteria | Reported Device Performance |
|---|---|---|---|
| T1-SegEngine (T1-weighted structural MRI segmentation) | Accuracy (Dice Similarity Coefficient, DSC) | 95% CI of DSC within [0.750, 0.850] for major cortical brain structures; within [0.800, 0.900] for major subcortical brain structures | Cortical regions: mean DSC 0.83 ± 0.04 (95% CI: 0.82–0.84); subcortical regions: mean DSC 0.87 ± 0.03 (95% CI: 0.86–0.88) |
| | Reproducibility (Average Volume Difference Percentage, AVDP) | Equivalence range: 1.0–5.0% for both subcortical and cortical regions | Subcortical regions: mean AVDP 2.50 ± 0.93% (95% CI: 2.26–2.74); cortical regions: mean AVDP 1.79 ± 0.74% (95% CI: 1.60–1.98) |
| FLAIR-SegEngine (T2-FLAIR hyperintensity segmentation) | Accuracy (DSC) | Mean DSC ≥ 0.80 | Mean DSC: 0.90 ± 0.04 (95% CI: 0.89–0.91) |
| | Reproducibility (mean AVDP and absolute lesion volume difference) | Absolute difference < 0.25 cc; mean AVDP < 2.5% | Mean AVDP: 0.99 ± 0.66%; mean absolute lesion volume difference: 0.08 ± 0.06 cc |
| PET-Engine (SUVR and Centiloid quantification) | SUVR accuracy (Intraclass Correlation Coefficient, ICC) | ICC ≥ 0.60 across Alzheimer's-relevant regions (compared to FDA-cleared reference product K221405) | ICC ≥ 0.993 across seven Alzheimer's-relevant regions |
| | Centiloid classification (kappa for amyloid positivity) | κ ≥ 0.70 (indicating substantial agreement with consensus expert visual reads) | Kappa values met or exceeded the criterion (specific values not provided) |
| ED-SegEngine (edema-like T2-FLAIR hyperintensity segmentation) | Accuracy (DSC) | DSC ≥ 0.70 | Mean DSC: 0.91 ± 0.09 (95% CI: 0.89–0.93) |
| HEM-SegEngine (GRE/SWI hypointense lesion segmentation) | Accuracy (F1-score / DSC) | F1-score ≥ 0.60 | Median F1-score (DSC): 0.860 (95% CI: 0.824–0.902) |
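For readers unfamiliar with the two headline metrics in the table above: DSC measures voxel-wise overlap between two masks, and AVDP measures test-retest volume agreement. A minimal sketch follows, using one common formulation of AVDP; the letter does not spell out the exact formula used:

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice Similarity Coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def avdp(vol_scan1: float, vol_scan2: float) -> float:
    """Absolute volume difference between paired scans, as a percentage of
    their mean volume (one common formulation of AVDP)."""
    return 100.0 * abs(vol_scan1 - vol_scan2) / ((vol_scan1 + vol_scan2) / 2.0)
```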
2. Sample Sizes and Data Provenance for the Test Set
- T1-SegEngine (Accuracy): 60 independent T1-weighted MRI cases. Data provenance is not explicitly stated, but is implied to draw on the same kinds of sources as the training data (public repositories such as ADNI, AIBL, and PPMI, plus institutional clinical sites) while remaining distinct from the training data.
- T1-SegEngine (Reproducibility): 60 subjects with paired T1-weighted scans (120 scans total). Data provenance not explicitly stated.
- FLAIR-SegEngine (Accuracy): 136 independent T2-FLAIR cases. Data provenance not explicitly stated, but distinct from training data.
- FLAIR-SegEngine (Reproducibility): Paired T2-FLAIR scans (number not specified). Data provenance not explicitly stated.
- PET-Engine (SUVR accuracy): 30 paired MRI–PET datasets. Data provenance not explicitly stated, but implicitly from multi-center studies including varied tracers and sites.
- PET-Engine (Centiloid classification): 176 paired T1-weighted MRI and amyloid PET scans from ADNI and AIBL. These are public repositories, likely involving diverse geographical data (e.g., USA, Australia). Data is retrospective.
- ED-SegEngine (Accuracy): 100 T2-FLAIR scans collected from U.S. and U.K. clinical sites. Data is retrospective.
- HEM-SegEngine (Accuracy): 106 GRE/SWI scans from U.S. clinical sites. Data is retrospective.
For all modules, validation datasets were fully independent from training datasets at the subject level, drawn from distinct sites and/or repositories where applicable.
The validation cohorts covered adult subjects across a broad age range (approximately 40–80+ years), with both females and males represented.
Racial/ethnic composition included White, Asian, Black, and African American subjects, depending on the underlying public and institutional datasets.
Clinical subgroups included clinically normal, mild cognitive impairment, and Alzheimer's disease for structural, FLAIR, and PET modules, and cerebrovascular/amyloid‑related pathologies for ED‑ and HEM‑SegEngines.
3. Number of Experts and Qualifications for Ground Truth
For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines):
- Number of Experts: Not explicitly stated as a specific number, but "subspecialty-trained neuroradiologists" were used.
- Qualifications: "Subspecialty-trained neuroradiologists." Specific years of experience are not mentioned.
For Centiloid classification in the PET-Engine:
- Number of Experts: "Consensus expert visual reads." The exact number isn't specified, but implies multiple experts.
- Qualifications: "Experts" trained in established amyloid PET reading criteria. Specific qualifications beyond "expert" and training in criteria are not detailed.
4. Adjudication Method for the Test Set
For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines):
- "Consensus/adjudication procedures and internal quality control to ensure consistency" were used for establishing reference segmentations. The specific 2+1, 3+1, or other detailed method is not provided.
For Centiloid classification in the PET-Engine:
- "Consensus expert visual interpretation" was used. The specific method details are not provided.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
The provided text does not indicate that an MRMC comparative effectiveness study was done to compare human readers with AI assistance versus without AI assistance. The performance studies primarily focus on the standalone (algorithm-only) performance of the device against expert-derived ground truth or a cleared reference product.
6. Standalone (Algorithm-Only) Performance Study
Yes, a standalone (algorithm only without human-in-the-loop performance) study was done for all AI modules. The text explicitly states: "Standalone performance tests were conducted for each module using validation datasets that were completely independent from those used for model development and training." The results presented in the table above reflect this standalone performance.
7. Type of Ground Truth Used
- Expert Consensus:
- For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines), reference segmentations were generated by "subspecialty-trained neuroradiologists using predefined anatomical and lesion‑labeling criteria, with consensus/adjudication procedures."
- For Centiloid classification in the PET-Engine, reference labels were derived from "consensus expert visual interpretation using established amyloid PET reading criteria."
- Comparison to Cleared Reference Product:
- For SUVR quantification in the PET-Engine, reference values were obtained from an "FDA‑cleared reference product (K221405)" (Neurophet SCALE PET).
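To make the SUVR and Centiloid quantities above concrete: SUVR is the ratio of mean tracer uptake in a target composite region to mean uptake in a reference region, and the Centiloid scale is a linear rescaling of SUVR anchored at 0 (young controls) and 100 (typical AD). The sketch below leaves the anchor values as parameters, since the tracer- and pipeline-specific calibration constants used by this device are not given in the letter:

```python
import numpy as np

def suvr(pet: np.ndarray, target_mask: np.ndarray, ref_mask: np.ndarray) -> float:
    """Standardized uptake value ratio: mean uptake in a target composite
    region divided by mean uptake in a reference region."""
    return pet[target_mask.astype(bool)].mean() / pet[ref_mask.astype(bool)].mean()

def centiloid(suvr_value: float, suvr_young: float, suvr_ad: float) -> float:
    """Linear Centiloid transform: 0 CL at the young-control anchor, 100 CL
    at the typical-AD anchor. The anchor SUVRs are tracer- and
    pipeline-specific calibration constants (hypothetical parameters here)."""
    return 100.0 * (suvr_value - suvr_young) / (suvr_ad - suvr_young)
```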
8. Sample Size for the Training Set
The exact sample size for the training set is not explicitly stated as a single number. However, the document mentions:
- "The AI-based modules (T1‑SegEngine, FLAIR‑SegEngine, PET‑Engine, ED‑SegEngine, HEM‑SegEngine) were trained using multi-center MRI and PET datasets collected from public repositories (e.g., ADNI, AIBL, PPMI) and institutional clinical sites."
- "Training data covered:
- Adult subjects across a broad age range (approximately 20–80+ years), with both sexes represented and including multiple racial/ethnic groups (e.g., White, Asian, Black).
- A spectrum of clinical conditions relevant to the intended use, including clinically normal, mild cognitive impairment, and Alzheimer's disease, as well as patients with cerebrovascular and amyloid‑related pathologies for lesion-segmentation modules.
- MRI acquired on major vendor platforms (GE, Siemens, Philips) at 1.5T and 3T... and amyloid PET acquired on multiple PET systems with commonly used tracers (Amyvid, Neuraceq, Vizamyl)."
This indicates a large and diverse training set, although a precise count of subjects or images isn't provided.
9. How the Ground Truth for the Training Set Was Established
The document implies that the training data included "manual labels" as it states: "No images or manual labels from the training datasets were reused in the validation datasets." However, it does not explicitly detail the process by which these "manual labels" or ground truth for the training set were established (e.g., number of experts, qualifications, adjudication method for training data). It's reasonable to infer that similar expert-driven processes were likely used for training ground truth as for validation, but this is not explicitly confirmed in the provided text.
AV Vascular is indicated to assist users in the visualization, assessment and quantification of vascular anatomy on CTA and/or MRA datasets, in order to assess patients with suspected or diagnosed vascular pathology and to assist with pre-procedural planning of endovascular interventions.
AV Vascular is a post-processing software application intended for visualization, assessment, and quantification of vessels in computed tomography angiography (CTA) and magnetic resonance angiography (MRA) data with a unified workflow for both modalities.
AV Vascular includes the following functions:
- Advanced visualization: the application provides all relevant views and interactions for CTA and MRA image review: 2D slices, MIP, MPR, curved MPR (cMPR), stretched MPR (sMPR), path-aligned views (cross-sectional and longitudinal MPRs), and 3D volume rendering (VR).
- Vessel segmentation: automatic bone removal and vessel segmentation for head/neck and body CTA data; automatic vessel centerline, lumen, and outer wall extraction and labeling for the main branches of the vascular anatomy in head/neck and body CTA data; semi-automatic and manual creation of vessel centerlines and lumens for CTA and MRA data; interactive two-point vessel centerline extraction and single-point centerline extension.
- Vessel inspection: inspection of an entire vessel using the cMPR or sMPR views, as well as local inspection of a vessel using vessel-aligned views (cross-sectional and longitudinal MPRs) by selecting a position along a vessel of interest.
- Measurements: ability to create and save measurements of vessel and lumen inner and outer diameters and area, as well as vessel length and angle measurements.
- Measurements and tools that specifically support pre-procedural planning: manual and automatic ring marker placement for specific anatomical locations, length measurements of the longest and shortest curve along the aortic lumen contour, angle measurements of aortic branches in clock-position style, saving viewing angles in C-arm notation, and configurable templates.
- Saving and export: saving and export of batch series and customizable reports.
This summarization is based on the provided 510(k) clearance letter for Philips Medical Systems' AV Vascular device.
Acceptance Criteria and Device Performance for Aorto-iliac Outer Wall Segmentation
| Metrics | Acceptance Criteria | Reported Device Performance (Mean with 98.75% confidence intervals) |
|---|---|---|
| 3D Dice Similarity Coefficient (DSC) | > 0.9 | 0.96 (0.96, 0.97) |
| 2D Dice Similarity Coefficient (DSC) | > 0.9 | 0.96 (0.95, 0.96) |
| Mean Surface Distance (MSD) | < 1.0 mm | 0.57 mm (0.485, 0.68) |
| Hausdorff Distance (HD) | < 3.0 mm | 1.68 mm (1.23, 2.08) |
| ∆Dmin (difference in minimum diameter) | > 95% of cases with \|∆Dmin\| < 5 mm | 98.8% (98.3–99.2%) |
| ∆Dmax (difference in maximum diameter) | > 95% of cases with \|∆Dmax\| < 5 mm | 98.5% (97.9–98.9%) |
The reported device performance for all primary and secondary metrics meets the predefined acceptance criteria.
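The surface-distance metrics in the table (MSD and Hausdorff distance) can be computed from boundary point sets of the two segmentations. Here is a minimal sketch, assuming surface points have already been extracted in millimeter coordinates; the symmetric-averaging convention shown is one common choice, and the letter does not state the exact formulation used:

```python
import numpy as np
from scipy.spatial.distance import cdist

def surface_distances(pts_a: np.ndarray, pts_b: np.ndarray):
    """Nearest-neighbor distances from each surface point in A to surface B,
    and vice versa. pts_a, pts_b: (N, 3) arrays of coordinates in mm."""
    d = cdist(pts_a, pts_b)               # pairwise point-to-point distances
    return d.min(axis=1), d.min(axis=0)

def msd_and_hd(pts_a: np.ndarray, pts_b: np.ndarray, percentile: float = 100):
    a2b, b2a = surface_distances(pts_a, pts_b)
    msd = (a2b.mean() + b2a.mean()) / 2.0      # symmetric mean surface distance
    hd = max(np.percentile(a2b, percentile),   # percentile=100 -> Hausdorff,
             np.percentile(b2a, percentile))   # percentile=95 -> HD-95
    return msd, hd
```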
Study Details for Aorto-iliac Outer Wall Segmentation Validation
- Sample Size used for the Test Set and Data Provenance:
- Sample Size: 80 patients
- Data Provenance: Retrospectively collected from 7 clinical sites in the US, 3 European hospitals, and one hospital in Asia.
- Independence from Training Data: All performance testing datasets were acquired from clinical sites distinct from those which provided the algorithm training data. The algorithm developers had no access to the testing data, ensuring complete independence.
- Patient Characteristics: At least 80% of patients had thoracic and/or abdominal aortic diseases and/or iliac artery diseases (e.g., thoracic/abdominal aortic aneurysm, ectasia, dissection, and stenosis). At least 20% had been treated with stents.
- Demographics:
- Geographics: North America: 58 (72.5%), Europe: 3 (3.75%), Asia: 19 (23.75%)
- Sex: Male: 59 (73.75%), Female: 21 (26.25%)
- Age (years): 21-50: 2 (2.50%), 51-70: 31 (38.75%), >71: 45 (56.25%), Not available: 2 (2.5%)
- Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications:
- Number of Experts: Three
- Qualifications: US-board certified radiologists.
- Adjudication Method for the Test Set:
- The three US-board certified radiologists independently performed manual contouring of the outer wall along the aorta and iliac arteries on cross-sectional planes for each CT angiographic image.
- After quality control, these three aortic and iliac arterial outer wall contours were averaged to serve as the reference standard contour. This can be considered a form of consensus/averaging after independent readings.
- Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
- The provided document does not indicate that a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done to measure human reader improvement with AI assistance. The study focused on the standalone performance of the AI algorithm compared to an expert-derived ground truth.
- Standalone (Algorithm Only Without Human-in-the-Loop Performance):
- Yes, the performance data provided specifically describes the standalone performance of the AI-based algorithm for aorto-iliac outer wall segmentation. The algorithm's output was compared directly against the reference standard without human intervention in the segmentation process.
- Type of Ground Truth Used:
- Expert Consensus/Averaging: The ground truth was established by averaging the independent manual contouring performed by three US-board certified radiologists.
- Sample Size for the Training Set:
- The document states that the testing data were independent of the training data and that developers had no access to the testing data. However, the exact sample size for the training set is not specified in the provided text.
- How the Ground Truth for the Training Set Was Established:
- The document implies that training data were used, but it does not describe how the ground truth for the training set was established. It only ensures that the testing data did not come from the same clinical sites as the training data and that algorithm developers had no access to the testing data.
AI-Rad Companion Brain MR is a post-processing image analysis software that assists clinicians in viewing, analyzing, and evaluating MR brain images.
AI-Rad Companion Brain MR provides the following functionalities:
• Automated segmentation and quantitative analysis of individual brain structures and white matter hyperintensities
• Quantitative comparison of each brain structure with normative data from a healthy population
• Presentation of results for reporting that includes all numerical values as well as visualization of these results
AI-Rad Companion Brain MR runs two distinct and independent algorithms for Brain Morphometry analysis and White Matter Hyperintensities (WMH) segmentation, respectively. Overall, the device comprises four main algorithmic features:
• Brain Morphometry
• Brain Morphometry follow-up
• White Matter Hyperintensities (WMH)
• White Matter Hyperintensities (WMH) follow-up
The Brain Morphometry feature has been available since the first version of the device (VA2x), segmentation of White Matter Hyperintensities was added in VA4x, and the follow-up analysis for both has been available since VA5x. The brain morphometry and brain morphometry follow-up features have not been modified and remain identical to the previous VA5x mainline version.
AI-Rad Companion Brain MR VA60 is an enhancement to the predicate, AI-Rad Companion Brain MR VA50 (K232305). Just as in the predicate, the brain morphometry feature of AI-Rad Companion Brain MR addresses the automatic quantification and visual assessment of the volumetric properties of various brain structures based on T1 MPRAGE datasets. From a predefined list of brain structures (e.g., Hippocampus, Caudate, Left Frontal Gray Matter), volumetric properties are calculated as absolute volumes and as volumes normalized with respect to the total intracranial volume. The normalized values are compared against age-matched means and standard deviations obtained from a population of healthy reference subjects. The deviation from this reference population can be visualized as a 3D overlay map or an out-of-range flag next to the quantitative values.
Additionally, identical to the predicate, the white matter hyperintensities feature addresses the automatic quantification and visual assessment of white matter hyperintensities on the basis of T1 MPRAGE and T2-weighted FLAIR datasets. The detected WMH can be visualized as a 3D overlay map, and the quantification in count and volume is reported for 4 brain regions.
Here's a structured overview of the acceptance criteria and study details for the AI-Rad Companion Brain MR, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Reported Device Performance
| Acceptance Criteria | Reported Device Performance (AI-Rad Companion Brain MR WMH Feature) | Reported Device Performance (AI-Rad Companion Brain MR WMH Follow-up Feature) |
|---|---|---|
| WMH Segmentation Accuracy | Pearson correlation coefficient between WMH volumes and ground-truth annotation: 0.96; intraclass correlation coefficient between WMH volumes and ground-truth annotation: 0.94; Dice score: 0.60; F1-score: 0.67. Detailed Dice scores: mean 0.60, median 0.62, STD 0.14, 95% CI [0.57, 0.63]. Detailed ASSD scores: mean 0.05, median 0.00, STD 0.15, 95% CI [0.02, 0.08] | |
| New or Enlarged WMH Segmentation Accuracy (Follow-up) | | Pearson correlation coefficient between new or enlarged WMH volumes and ground-truth annotation: 0.76; average Dice score: 0.59; average F1-score: 0.71. Dice by vendor: Siemens mean 0.64 (median 0.67, STD 0.15, 95% CI [0.60, 0.69]); GE mean 0.56 (median 0.60, STD 0.14, 95% CI [0.51, 0.61]); Philips mean 0.55 (median 0.59, STD 0.16, 95% CI [0.50, 0.61]). ASSD by vendor: Siemens mean 0.02 (median 0.00, STD 0.06, 95% CI [0.00, 0.04]); GE mean 0.09 (median 0.01, STD 0.23, 95% CI [0.03, 0.19]); Philips mean 0.04 (median 0.00, STD 0.11, 95% CI [0.00, 0.08]) |
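Lesion-wise F1-scores like the ones above are typically computed by matching connected components between the predicted and reference masks. Below is a sketch using a simple any-overlap matching rule; the actual matching rule used in the submission is not described:

```python
import numpy as np
from scipy import ndimage

def lesionwise_f1(pred: np.ndarray, truth: np.ndarray) -> float:
    """Lesion-wise F1 with an any-overlap matching rule: a predicted lesion
    counts as a true positive if it touches any reference lesion, and vice versa."""
    pred_lbl, n_pred = ndimage.label(pred.astype(bool))
    true_lbl, n_true = ndimage.label(truth.astype(bool))
    overlap = (pred_lbl > 0) & (true_lbl > 0)
    tp_pred = len(np.unique(pred_lbl[overlap]))   # matched predicted lesions
    tp_true = len(np.unique(true_lbl[overlap]))   # matched reference lesions
    precision = tp_pred / n_pred if n_pred else 0.0
    recall = tp_true / n_true if n_true else 0.0
    return (2 * precision * recall / (precision + recall)
            if (precision + recall) else 0.0)
```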
Study Details
- Sample Size Used for the Test Set and Data Provenance:
- White Matter Hyperintensities (WMH) Feature: 100 subjects (Multiple Sclerosis patients (MS), Alzheimer's patients (AD), cognitive impaired (CI), and healthy controls (HC)).
- White Matter Hyperintensities (WMH) Follow-up Feature: 165 subjects (Multiple Sclerosis patients (MS) and Alzheimer's patients (AD)).
- Data Provenance: Data acquired from Siemens, GE, and Philips scanners. Testing data had a balanced distribution with respect to patient gender and age according to the target patient population, and with respect to field strength (1.5T and 3T). This indicates a retrospective, multi-vendor dataset; the countries of origin are not explicitly stated.
- Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications:
- Number of Experts: Three radiologists.
- Qualifications: Not explicitly stated beyond "radiologists." It is not specified if they are board-certified, or their years of experience.
- Adjudication Method for the Test Set:
- For each dataset, three sets of ground truth annotations were created manually.
- Each set was annotated by a disjoint group consisting of an annotator, a reviewer, and a clinical expert.
- The clinical expert was randomly assigned per case to minimize annotation bias.
- The clinical expert reviewed and corrected the initial annotation of the changed WMH areas according to a specified annotation protocol. Significant corrections led to re-communication with the annotator and re-review.
- This suggests a 3+1 Adjudication process, where three initial annotations are reviewed by a clinical expert.
- If a Multi Reader Multi Case (MRMC) Comparative Effectiveness Study Was Done:
- No, an MRMC comparative effectiveness study comparing human readers with and without AI assistance was not done. The study focuses on the standalone performance of the AI algorithm against expert ground truth.
- If a Standalone (i.e., algorithm only without human-in-the-loop performance) Study Was Done:
- Yes, a standalone performance study was done. The "Accuracy was validated by comparing the results of the device to manual annotated ground truth from three radiologists." This evaluates the algorithm's performance directly.
- The Type of Ground Truth Used:
- Expert Consensus / Manual Annotation: The ground truth for both WMH and WMH follow-up features was established through "manual annotated ground truth from three radiologists" and involved a "standard annotation process" with annotators, reviewers, and clinical experts.
- The Sample Size for the Training Set:
- The document states that the "training data used for the fine tuning the hyper parameters of WMH follow-up algorithm is independent of the data used to test the white matter hyperintensity algorithm follow up algorithm." However, the specific sample size for the training set is not provided in the given text.
- How the Ground Truth for the Training Set Was Established:
- The document states that the WMH follow-up algorithm "does not include any machine learning/deep learning component," suggesting a rule-based or conventional image-processing algorithm. Therefore, "training" here likely refers to parameter tuning rather than machine learning model training.
- For the "fine-tuning the hyper parameters of WMH follow-up algorithm," the ground truth establishment method for this training data is not explicitly detailed in the provided text. It only states that this data was "independent of the data used to test" the algorithm.
Imagine® Enterprise Suite (IES) is a medical diagnostic device that receives, stores, and shares the medical images from and to DICOM-compliant entities such as imaging modalities (such as X-ray Angiograms (XA), Echocardiograms (US), MRI, CT, CR, DR, IVUS, OCT, PET and SPECT), external PACS, and other diagnostic workstations. It is used in the display and quantification of medical images, after image acquisition from modalities, for post-procedure clinical decision support. It constitutes a PACS for the communication and storage of medical images and provides a worklist of stored medical images that can be used to open patient studies in one of its image viewers. It is intended to display images and related information that are interpreted by trained professionals to render findings and/or diagnosis, but it does not directly generate any diagnosis or potential findings. Not intended for primary diagnosis of mammographic images. Not intended for intra-procedural or real-time use. Not intended for diagnostic use on mobile devices.
The Imagine® Enterprise Suite (IES) has, as its backbone, the IES PACS – a DICOM stack for the communication and storage of medical images. It is based on its predecessor, the HCP DICOM Net® PACS (K023467). The IES is made up of the following modules:
IES_EntViewer: This viewer module can be launched from the IES PACS Worklist and is intended primarily for the review and manipulation of angiographic X-ray images. It also supports the review of images from other modalities in single or combination views, thereby serving as a general-purpose multi-modality viewer.
IES_EchoViewer: This viewer module can be launched from the IES Worklist and is intended for specialized viewing, manipulation, and measurements of Echocardiography images.
IES_RadViewer: This viewer module can be launched from the IES Worklist and is intended for specialized viewing, manipulation, and measurements of Radiological images. It also supports the fusion of Radiological images (such as MRI and CT) with Nuclear Medicine images (such as PET and SPECT).
IES_ZFPViewer: This viewer is intended for non-diagnostic review of medical images over a web browser. It supports an independent worklist and a viewing component that requires no installation for the end user. It works within an intranet or over the internet via user-provided VPN or static IP.
AngioQuant: This module can be launched from the IES_EntViewer to perform automatic quantification of coronary arteries. It uses, as input, the cardiac angiogram studies stored on the IES PACS. It is intended for display and quantification of X-ray angiographic images after image acquisition in the cathlab, for post-procedure clinical decision support within the cathlab workflow. It is not intended for intra-procedural or real-time use. The Imagine® Enterprise Suite (IES) is integrated with ML only for the segmentation of coronary vessels from X-ray angiographic images and uses deep learning methodology for image analysis.
Here's a breakdown of the acceptance criteria and study details for the Imagine® Enterprise Suite, specifically focusing on the AngioQuant module's machine learning component, as described in the provided 510(k) summary:
1. Table of Acceptance Criteria and Reported Device Performance
The 510(k) summary provides a narrative description of the performance evaluation rather than a direct table of acceptance criteria with corresponding performance metrics for every criterion. However, it explicitly states that the performance of the IES_AngioQuant module's machine learning-based coronary vessel segmentation function was evaluated using several metrics and compared against an FDA-cleared predicate device.
| Acceptance Criterion (Inferred from Study Design) | Reported Device Performance (IES_AngioQuant ML component) |
|---|---|
| Quantitative Performance Metrics for Coronary Vessel Segmentation | Evaluated using: |
| Jaccard Index (Intersection over Union) | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Dice Score | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Precision | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Accuracy | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Recall | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Visual Assessment of Segmentation | Conducted in conjunction with quantitative metrics. |
| Comparative Performance to Predicate Device | Performance was compared against the FDA-cleared predicate device, CAAS Workstation (510(k) No. K232147). |
| Reproducibility/Consistency of Ground Truth (Implicit for verification) | Verification performed by two independent board-certified interventional cardiologists. |
Note: The specific numerical values for Jaccard Index, Dice Score, Precision, Accuracy, and Recall are not provided in the summary. The summary highlights that these metrics were used for evaluation.
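One observation about the metric list above: the Jaccard index and the Dice score are monotonically related, so they rank segmentations identically and either can be derived from the other:

$$J = \frac{|A \cap B|}{|A \cup B|}, \qquad D = \frac{2|A \cap B|}{|A| + |B|}, \qquad J = \frac{D}{2 - D}, \qquad D = \frac{2J}{1 + J}$$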
2. Sample Size and Data Provenance
- Test Set Sample Size: An independent external test set comprising 30 patient studies was used.
- Data Provenance: The dataset consisted of anonymized angiographic studies sourced from multiple U.S. and international clinical sites. It was a retrospective dataset. The dataset included adult patients of mixed gender and represented a range of age, body habitus, and diverse race and ethnicity. Clinically relevant variability, including lesion severity, vessel anatomy, image quality, and imaging equipment vendors, was represented.
3. Number of Experts and Qualifications for Ground Truth
- Number of Experts: Two independent board-certified interventional cardiologists.
- Qualifications of Experts: Each expert had more than 10 years of clinical experience.
4. Adjudication Method for the Test Set
The summary does not explicitly state a formal adjudication method like "2+1" or "3+1" for differences between the experts. However, it states that the ground truth (reference standard) was established using the FDA-cleared Medis QAngio XA (K182611) software, with verification performed by the two independent board-certified interventional cardiologists. This implies that the experts reviewed and confirmed the ground truth generated by the predicate software, rather than independently generating it and then adjudicating differences.
5. MRMC Comparative Effectiveness Study
An MRMC comparative effectiveness study was not explicitly described in the summary. The performance comparison was primarily an algorithm-only comparison against a predicate device (CAAS Workstation) for the ML component. The summary does not mention how much human readers improve with or without AI assistance.
6. Standalone (Algorithm Only) Performance
Yes, a standalone (algorithm only without human-in-the-loop performance) study was done for the IES_AngioQuant module's machine learning-based coronary vessel segmentation function. Its performance was evaluated using quantitative metrics and visual assessment, and then compared against the FDA-cleared predicate device (CAAS Workstation).
7. Type of Ground Truth Used
The ground truth was established using an FDA-cleared software (Medis QAngio XA, K182611), with its output verified by expert consensus of two independent board-certified interventional cardiologists.
8. Sample Size for the Training Set
A total of 762 anonymized angiographic studies were used for training, validation, and internal testing sets combined. The summary does not provide an exact breakdown of how many studies were specifically in the training set versus the validation and internal testing sets.
9. How the Ground Truth for the Training Set Was Established
The summary states that the ground truth ("truthing") for the dataset (which includes the training, validation, and internal testing sets) was established using the FDA-cleared Medis QAngio XA (K182611) software, with verification performed by two independent board-certified interventional cardiologists, each with more than 10 years of clinical experience. Implicitly, this same method was used for establishing ground truth for the training set.
TruSPECT is intended for acceptance, transfer, display, storage, and processing of images for detection of radioisotope tracer uptakes in the patient's body. The device uses various processing modes, supported by various clinical applications and features designed to enhance image quality. The emission computerized tomography data can be coupled with registered and/or fused CT/MR scans and with physiological signals in order to depict, localize, and/or quantify the distribution of radionuclide tracers and anatomical structures in scanned body tissue for clinical diagnostic purposes. The acquired tomographic image may undergo emission-based attenuation correction.
Visualization tools include segmentation, colour coding, and polar maps. Analysis tools include Quantitative Perfusion SPECT (QPS), Quantitative Gated SPECT (QGS) and Quantitative Blood Pool Gated SPECT (QBS) measurements, Multi Gated Acquisition (MUGA) and Heart-to-Mediastinum activity ratio (H/M).
The system also includes reporting tools for formatting findings and user selected areas of interest. It is capable of processing and displaying the acquired information in traditional formats, as well as in three-dimensional renderings, and in various forms of animated sequences, showing kinetic attributes of the imaged organs.
TruSPECT is based on the Windows operating system. Due to special customer requirements and clinical focus, TruSPECT can be configured with different combinations of Windows-based software options and clinical applications intended to assist the physician in diagnosis and/or treatment planning. This includes commercially available post-processing software packages.
TruSPECT is a processing workstation primarily intended for, but not limited to cardiac applications. The workstation can be integrated with the D-SPECT cardiac scanner system or used as a standalone post-processing station.
The TruSPECT Processing Station is a software-only medical device (SaMD) designed to operate on a dedicated, high-performance computer platform. It is distributed as pre-installed medical imaging software intended to support image visualization, quantitation, analysis, and comparison across multiple imaging modalities and acquisition time points. The software supports both functional imaging modalities, such as Single Photon Emission Computed Tomography (SPECT) and Nuclear Medicine (NM), and anatomical imaging modalities, such as Computed Tomography (CT).
The system enables integration, display, and analysis of multimodal image datasets to assist qualified healthcare professionals in image review and interpretation within the clinical workflow. The software is intended for use by trained medical professionals and assists in image assessment for various clinical applications, including but not limited to cardiology, electrophysiology, and organ function evaluation. The software does not perform automated diagnosis and does not replace the clinical judgment of the user.
The TruSPECT software operates on the Microsoft Windows® operating system and can be configured with various software modules and clinical applications according to user requirements and intended use. The configuration may include proprietary Spectrum Dynamics modules and commercially available third-party post-processing software packages operating within the TruSPECT framework.
The modified TruSPECT system integrates the TruClear AI application as part of its software suite. The TruClear AI module is a software-based image processing component designed to assist in the enhancement of SPECT image data acquired on the TruSPECT system. The module operates within the existing reconstruction and review workflow and does not alter the system's intended use, indications for use, or fundamental technology.
Verification and validation activities were performed to confirm that the addition of the TruClear AI module functions as intended and that overall system performance remains consistent with the previously cleared TruSPECT configuration. These activities included performance evaluations using simulated phantom datasets and representative clinical image data, conducted in accordance with FDA guidance. The results demonstrated that the modified TruSPECT system incorporating TruClear AI meets all predefined performance specifications and continues to operate within the parameters of its intended clinical use.
Here's a breakdown of the acceptance criteria and study details for the TruClear AI module of the TruSPECT Processing Station, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Reported Device Performance
| Parameter | Acceptance Criteria | Reported Device Performance (Key Performance Results) |
|---|---|---|
| LVEF | Bland-Altman mean: ±3%; Bland-Altman SD: ≤ 4%; regression r (min): > 0.8; slope (range): 0.9–1.1; intercept (limit): ±10% | Strong correlation (r = 0.94). Bland-Altman analyses showed mean differences within pre-specified acceptance criteria. |
| EDV | Bland-Altman mean: ±5 ml; Bland-Altman SD: ≤ 8 ml; regression r (min): > 0.8; slope (range): 0.9–1.1; intercept (limit): ±10 ml | Strong correlation (r = 0.98). Bland-Altman analyses showed mean differences within pre-specified acceptance criteria. |
| Perfusion Volume | Bland-Altman mean: ±5 ml; Bland-Altman SD: ≤ 8 ml; regression r (min): > 0.8; slope (range): 0.9–1.1; intercept (limit): ±10 ml | Strong correlation. Bland-Altman analyses showed mean differences within pre-specified acceptance criteria. |
| TPD | Bland-Altman mean: ±3%; Bland-Altman SD: ≤ 5%; regression r (min): > 0.8; slope (range): 0.9–1.1; intercept (limit): ±10% | Strong correlation (r = 0.98). Bland-Altman analyses showed mean differences within pre-specified acceptance criteria. |
| Visual Similarity (Denoised vs. Reference) | Not explicitly quantified as a numeric acceptance criterion | Visual similarity ratings indicated denoised images were "similar" to the reference, consistent with high inter-reader agreement. |
| Inter-observer Agreement (Visual Comparison) | Not explicitly quantified as an acceptance criterion | 97–100% agreement after dichotomization (scores ≥3 vs. <3) across key metrics. |

For sub-criteria without explicitly reported values (SD, slope, and intercept), the letter indicates they were implicitly met, with Bland-Altman mean differences falling within the pre-specified acceptance criteria.
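The acceptance criteria above combine Bland-Altman agreement statistics with linear-regression parameters. As a generic illustration (not the submitter's analysis code), these statistics can be computed as follows:

```python
import numpy as np
from scipy import stats

def agreement_stats(reference: np.ndarray, test: np.ndarray) -> dict:
    """Bland-Altman bias and SD of differences, 95% limits of agreement,
    and ordinary least-squares regression of test on reference."""
    diff = test - reference
    bias, sd = diff.mean(), diff.std(ddof=1)
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)   # 95% limits of agreement
    slope, intercept, r, _, _ = stats.linregress(reference, test)
    return {"bias": bias, "sd": sd, "loa": loa,
            "slope": slope, "intercept": intercept, "r": r}
```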
Study Details
- Sample size used for the test set and the data provenance:
- Test Set Sample Size: 24 patients (8 female, 16 male), which yielded 74 images.
- Data Provenance: Multi-center, retrospective dataset from three hospitals in the UK and Germany.
- Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Number of Experts: Two (2)
- Qualifications of Experts: Independent, board-certified nuclear medicine physicians.
- Adjudication method for the test set:
- The document states "two independent, board-certified nuclear medicine physicians visually compared denoised low-count images to the high-count reference using a 5-point Likert scale; inter-observer percent agreement after dichotomization (scores ≥3 vs <3) was 97–100% across key metrics." This suggests a consensus-based approach for establishing some aspect of the ground truth, particularly for the visual similarity assessment, though not explicitly a formal 2+1 or 3+1 adjudication for defining disease status. The reference standard itself was the high-count image, and the experts were comparing the derived AI-processed images to this reference.
- If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs. without AI assistance:
- An MRMC comparative effectiveness study was not explicitly described in terms of human readers improving with AI vs. without AI assistance. The study focused on validating the AI algorithm's output against a reference standard (high-count image) using visual and quantitative assessment. The two nuclear medicine physicians visually compared the denoised images to the reference, not their own diagnostic performance with and without AI.
- If a standalone (i.e., algorithm only without human-in-the-loop performance) study was done:
- Yes, a standalone performance assessment of the algorithm was conducted. The quantitative evaluation using the FDA-cleared Cedars-Sinai QPS/QGS to derive perfusion and functional parameters (TPD, volume, EDV, LVEF) directly compared the algorithm's output on low-count images (after denoising) to the high-count reference images. The Bland-Altman and correlation analyses are indicators of standalone performance.
- The type of ground truth used:
- The primary reference standard (ground truth) for the study was the clinical routine high-count SPECT image (~1.0 MCounts) acquired under standard D-SPECT protocols.
- For quantitative parameters, FDA-cleared Cedars-Sinai QPS/QGS was used on the high-count reference images to derive the ground truth values for perfusion and functional parameters (TPD, volume, EDV, LVEF).
- For visual assessment, the "high-count reference" images served as the ground truth for comparison.
- The sample size for the training set:
- The total dataset was 352 patients. The training/tuning set consisted of a portion of these patients; specifically, the "held-out test set" was 24 patients, meaning the remaining 328 patients (352 - 24) were used for training and tuning the algorithm.
- How the ground truth for the training set was established:
- The document implies the same ground truth methodology was used for the training set as for the test set. The algorithm was trained to transform low-count images to effectively match the characteristics of the clinical routine high-count SPECT image as the "gold standard." The Cedars-Sinai QPS/QGS would also have been used on these high-count images to generate the quantitative targets for training, allowing the AI to learn to derive similar quantitative parameters from denoised low-count images.
PeekMed web is a system designed to help healthcare professionals carry out pre-operative planning for several surgical procedures, based on their imported patients' imaging studies. Experience in usage and a clinical assessment are necessary for the proper use of the system in the revision and approval of the output of the planning. The multi-platform system works with a database of digital representations related to surgical materials supplied by their manufacturers.
This medical device is a decision support tool for qualified healthcare professionals to quickly and efficiently perform pre-operative planning for several surgical procedures using medical imaging, with the additional capability of planning in a 2D or 3D environment. The system is designed for the medical specialties within surgery, and no specific use environment is mandatory; the typical use environment is a room with a computer. The patient target group is adult patients with a previously diagnosed injury or disability. There are no other considerations for the intended patient population.
PeekMed web is a system designed to help healthcare professionals carry out pre-operative planning for several surgical procedures, based on their imported patients' imaging studies. Experience in usage and a clinical assessment are necessary for the proper use of the system in the revision and approval of the output of the planning.
The multi-platform system works with a database of digital representations related to surgical materials supplied by their manufacturers.
As the PeekMed web is capable of representing medical images in a 2D or 3D environment, performing relevant measurements on those images, and also capable of adding templates, it can then provide a total overview of the surgery. Being software, it does not interact with any part of the body of the user and/or patient.
The acceptance criteria and study proving device performance are described below, based on the provided FDA 510(k) clearance letter for PeekMed web (K252856).
1. Table of Acceptance Criteria and Reported Device Performance
The provided document lists the acceptance criteria but does not explicitly state the reported device performance for each metric from the validation studies. It only states that the efficacy results "met the acceptance criteria for ML model performance." Therefore, the "Reported Device Performance" column reflects this general statement.
| ML model | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Segmentation | DICE no less than 90%; HD-95 no more than 8; STD of DICE within ±10%; precision more than 85%; recall more than 90% | Met the acceptance criteria for ML model performance |
| Landmarking | MRE no more than 7 mm; STD of MRE within ±5 mm | Met the acceptance criteria for ML model performance |
| Classification | Accuracy no less than 90%; precision no less than 85%; recall no less than 90%; F1 score no less than 90% | Met the acceptance criteria for ML model performance |
| Detection | mAP no less than 90%; precision no less than 85%; recall no less than 90% | Met the acceptance criteria for ML model performance |
| Reconstruction | DICE no less than 90%; HD-95 no more than 8; STD of DICE within ±10%; precision more than 85%; recall more than 90% | Met the acceptance criteria for ML model performance |
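As a hedged illustration of how predefined thresholds like these translate into a pass/fail gate, the sketch below mirrors the segmentation row; the function and threshold names are hypothetical, not PeekMed's actual test harness:

```python
# Hypothetical acceptance gate mirroring the segmentation criteria above;
# illustrative only, not the manufacturer's test code.
SEGMENTATION_CRITERIA = {
    "dice_min": 0.90,       # DICE no less than 90%
    "hd95_max": 8.0,        # HD-95 no more than 8
    "dice_std_max": 0.10,   # STD of DICE within +/- 10%
    "precision_min": 0.85,  # precision more than 85%
    "recall_min": 0.90,     # recall more than 90%
}

def segmentation_passes(dice_mean, dice_std, hd95, precision, recall,
                        c=SEGMENTATION_CRITERIA) -> bool:
    return (dice_mean >= c["dice_min"]
            and hd95 <= c["hd95_max"]
            and abs(dice_std) <= c["dice_std_max"]
            and precision > c["precision_min"]
            and recall > c["recall_min"])
```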
2. Sample Size Used for the Test Set and Data Provenance
The document distinguishes between a "testing" dataset (used for internal evaluation during development) and an "external validation" dataset. The external validation dataset serves as the independent test set for assessing final model performance.
- Test Set (External Validation):
- Segmentation ML model: 672 unique datasets
- Landmarking ML model: 561 unique datasets
- Classification ML model: 367 unique datasets
- Detection ML model: 198 unique datasets
- Reconstruction ML model: 87 unique datasets
- Data Provenance: The document states that ML models were developed with datasets "from multiple sites." It does not specify the country of origin of the data nor explicitly state whether the data was retrospective or prospective, though "external validation datasets were collected independently of the development data" and "labeled by a separate team," suggesting a retrospective approach to data collection for the validation.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
The document mentions that the "External validation...was employed to provide an accurate assessment of the model's performance." and that the dataset was "labeled by a separate team". It does not specify the number of experts used or their specific qualifications (e.g., "radiologist with 10 years of experience").
4. Adjudication Method for the Test Set
The document states that the ground truth for the external validation dataset was "labeled by a separate team." It does not specify an adjudication method such as 2+1, 3+1, or if multiple experts were involved and how discrepancies were resolved.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
No, the document does not indicate that a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done to evaluate how much human readers improve with AI vs. without AI assistance. The testing focused on the standalone performance of the ML models against a predefined ground truth.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
Yes, a standalone performance evaluation (algorithm only without human-in-the-loop) was done. The performance data section describes the "efficacy results of the [specific] ML model using the testing and external validation datasets against the predefined ground truth," indicating an assessment of the algorithm's performance independent of human interaction during the measurement. The device is described as a "decision support tool" requiring "clinical assessment... for the proper use of the system in the revision and approval of the output," implying the algorithm provides output that a human reviews, but the performance testing described here is on the raw algorithm output.
7. The Type of Ground Truth Used
The ground truth used for both the training and test sets is referred to as "predefined ground truth" and established by "labeling" or a "separate team" for the external validation sets. This implies a human-generated expert consensus or annotation-based ground truth, although the specific expertise and method of consensus are not detailed. It is not explicitly stated as pathology or outcomes data.
8. The Sample Size for the Training Set
The ML models were trained with datasets from multiple sites totaling:
- 2852 X-ray datasets
- 2073 CT scans
- 209 MRIs
These total datasets were split as follows:
- Training Set: 80% of the total dataset for each modality.
- X-ray: 0.80 * 2852 = 2281.6 (approx. 2282)
- CT scans: 0.80 * 2073 = 1658.4 (approx. 1658)
- MRIs: 0.80 * 209 = 167.2 (approx. 167)
9. How the Ground Truth for the Training Set Was Established
The document states, "ML models were developed with datasets...We trained the ML models with 80% of the dataset..." and refers to "predefined ground truth." While it doesn't explicitly detail the process for training data, it is implied that the training data also had human-generated ground truth (annotations/labels), similar to the validation data, as ML models rely on labeled data for supervised learning. It mentions that "leakage between development and validation data sets did not occur," and the external validation set was "labeled by a separate team," suggesting the training data was also labeled by experts, possibly the "internal procedures" mentioned for ML model development.
Alzevita is intended for use by neurologists and radiologists experienced in the interpretation and analysis of brain MRI scans. It enables automated labelling, visualization, and volumetric measurement of the hippocampus from high-resolution T1-weighted MRI images. The software facilitates comparison of hippocampal volume against a normative dataset derived from MRI scans of healthy control subjects aged 55 to 90 years, acquired using standardized imaging protocols on 1.5T/3T MRI scanners.
Alzevita is a cloud-based, AI-powered medical image processing software as a medical device intended to assist neurologists and radiologists with expertise in the analysis of 3D brain MRI scans. The software performs fully automated segmentation and volumetric quantification of the hippocampus, a brain structure involved in memory and commonly affected by neurodegenerative conditions.
Alzevita is designed to replace manual hippocampal segmentation workflows with a fast, reproducible, and standardized process. It provides quantitative measurements of hippocampal volume, enabling consistent outputs that can assist healthcare professionals in evaluating structural brain changes.
The software operates through a secure web interface and is compatible with commonly used operating systems and browsers. It accepts 3D MRI scans in DICOM or NIfTI format and displays the MRI image in the MRI viewer allowing trained healthcare professionals to view, zoom, and analyze the MRI scan alongside providing a visual and tabular volumetric analysis report.
The underlying algorithm used in Alzevita is locked, meaning it does not modify its behavior at runtime or adapt to new inputs. This ensures consistent performance and reproducibility of results across users and imaging conditions. Any future modifications to the algorithm including performance updates or model re-training will be submitted to the FDA for review and clearance prior to deployment, in compliance with FDA regulatory requirements and applicable guidance for AI/ML-based SaMD.
Here's a detailed description of the acceptance criteria and the study proving the Alzevita device meets those criteria, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Device Performance
1. Table of Acceptance Criteria and Reported Device Performance
| Metric | Acceptance Criteria | Reported Device Performance (Alzevita 95% Confidence Intervals) | Pass/Fail |
|---|---|---|---|
| Overall Dice Score | ≥ 75% | (0.85, 0.86) | Pass |
| Overall Hausdorff Distance | ≤ 6.1 mm | (1.43, 1.59) | Pass |
| Overall Correlation Coefficient | ≥ 0.82 | Not explicitly given as a CI, but stated as met | Pass |
| Overall Relative Volume Difference | ≤ 24.6% | Not explicitly given as a CI, but stated as met | Pass |
| Overall Bland-Altman Mean Difference (Total Hippocampus Volume) | ≤ 1010 mm³ | Not explicitly given as a CI, but stated as met | Pass |
| Subgroup Dice Score (Clinical Subgroups) | ≥ 83% (implied from results) | Control: (0.87, 0.88); MCI: (0.84, 0.85); AD: (0.82, 0.84) | Pass |
| Subgroup Hausdorff Distance (Clinical Subgroups) | ≤ 3 mm (implied from results) | Control: (1.32, 1.41); MCI: (1.44, 1.62); AD: (1.48, 2.10) | Pass |
| Subgroup Dice Score (Gender) | ≥ 83% (implied) | Female: (0.85, 0.87); Male: (0.84, 0.86) | Pass |
| Subgroup Hausdorff Distance (Gender) | ≤ 3 mm (implied) | Female: (1.40, 1.57); Male: (1.41, 1.66) | Pass |
| Subgroup Dice Score (Magnetic Field Strength) | ≥ 83% (implied) | 3T: (0.86, 0.87); 1.5T: (0.83, 0.85) | Pass |
| Subgroup Hausdorff Distance (Magnetic Field Strength) | ≤ 3 mm (implied) | 3T: (1.38, 1.47); 1.5T: (1.45, 1.79) | Pass |
| Subgroup Dice Score (Slice Thickness) | ≥ 83% (implied) | 1 mm: (0.87, 0.88); 1.2 mm: (0.84, 0.85) | Pass |
| Subgroup Hausdorff Distance (Slice Thickness) | ≤ 3 mm (implied) | 1 mm: (1.35, 1.43); 1.2 mm: (1.47, 1.72) | Pass |
| Subgroup Dice Score (US Geographical Region) | ≥ 83% (implied) | East US: (0.84, 0.86); West US: (0.85, 0.87); Central US: (0.85, 0.87); Canada: (0.82, 0.88) | Pass |
| Subgroup Hausdorff Distance (US Geographical Region) | ≤ 3 mm (implied) | East US: (1.44, 1.71); West US: (1.35, 1.55); Central US: (1.35, 1.47); Canada: (1.07, 2.34) | Pass |
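Per-case confidence intervals like those in the table are often obtained with a percentile bootstrap over the individual case scores; the letter does not state which interval method Alzevita used, so the sketch below is generic:

```python
import numpy as np

def bootstrap_ci(scores: np.ndarray, n_boot: int = 10_000,
                 alpha: float = 0.05, seed: int = 0) -> np.ndarray:
    """Percentile-bootstrap confidence interval for the mean of per-case
    scores (e.g., per-subject Dice)."""
    rng = np.random.default_rng(seed)
    means = np.array([rng.choice(scores, size=scores.size, replace=True).mean()
                      for _ in range(n_boot)])
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])
```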
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size for Test Set: 298 subjects.
- Data Provenance: The test set data was collected from the publicly available ADNI (Alzheimer's Disease Neuroimaging Initiative) dataset. It is retrospective and sampled using stratified random sampling, with subjects recruited from ADNI 1 & ADNI 3 datasets.
- Geographical Distribution: Approximately equal geographical distribution within the USA (East coast, Central US regions, West coast) and Canada.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
- Number of Experts: Three certified radiologists.
- Qualifications of Experts: They are described as "certified radiologists in India, adhering to widely recognized and standardized segmentation protocols." Specific experience level (e.g., years of experience) is not provided.
4. Adjudication Method for the Test Set
- Adjudication Method: A consensus ground truth was established by integrating individual delineations from the three certified radiologists into a single consensus mask for each case. This integration was performed using the STAPLE (Simultaneous Truth and Performance Level Estimation) algorithm.
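For context, STAPLE jointly estimates a consensus segmentation and each rater's sensitivity/specificity via expectation-maximization. Below is a minimal NumPy sketch of the binary case (after Warfield et al., 2004); this is an illustration of the algorithm, not Alzevita's implementation:

```python
import numpy as np

def staple_binary(votes: np.ndarray, n_iter: int = 50):
    """Binary STAPLE via EM. votes: (R, N) array of R raters' {0,1} decisions
    over N voxels. Returns per-voxel consensus probability W and per-rater
    sensitivity p and specificity q."""
    R, N = votes.shape
    p = np.full(R, 0.99)            # initial sensitivities
    q = np.full(R, 0.99)            # initial specificities
    f = votes.mean()                # prior foreground probability
    for _ in range(n_iter):
        # E-step: posterior probability each voxel is truly foreground
        a = f * np.prod(np.where(votes == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - f) * np.prod(np.where(votes == 1, 1 - q[:, None], q[:, None]), axis=0)
        W = a / np.maximum(a + b, 1e-12)
        # M-step: re-estimate each rater's sensitivity and specificity
        p = (votes * W).sum(axis=1) / np.maximum(W.sum(), 1e-12)
        q = ((1 - votes) * (1 - W)).sum(axis=1) / np.maximum((1 - W).sum(), 1e-12)
    return W, p, q

# Thresholding W at 0.5 yields the consensus mask: consensus = W >= 0.5
```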
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was an MRMC study done? No, the document describes a standalone performance evaluation of the Alzevita algorithm against a consensus ground truth. There is no mention of a human-in-the-loop study comparing human readers with and without AI assistance.
- Effect size of human readers improvement: Not applicable, as no MRMC study was conducted.
6. Standalone Performance Study
- Was a standalone performance study done? Yes. The entire validation study described evaluates the Alzevita algorithm's performance in segmenting the hippocampus and calculating its volume against a ground truth, without human intervention in the segmentation process.
7. Type of Ground Truth Used
- Type of Ground Truth: Expert consensus. Specifically, it was established through manual segmentation by three certified radiologists, with their individual segmentations integrated via the STAPLE algorithm. This STAPLE-derived consensus mask served as the ground truth.
8. Sample Size for the Training Set
- Sample Size for Training Set: 200 cases.
9. How the Ground Truth for the Training Set Was Established
- Training Set Ground Truth Establishment: "Expert radiologists manually segmented the hippocampus to create the ground truth, which is then used as input for training the Alzevita segmentation model." The number and specific qualifications of the expert radiologists for the training set's ground truth are not detailed beyond "expert radiologists." There is no mention of an adjudication method like STAPLE for the training set ground truth, suggesting individual expert segmentation or an unspecified consensus process.
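The submission does not disclose Alzevita's training objective. Purely as a hedged illustration, a soft Dice loss such as the PyTorch sketch below is a common way to train a segmentation network against expert-drawn masks like those described above.

```python
import torch

def soft_dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss for binary segmentation.
    logits, target: (batch, 1, D, H, W); target is a binary mask."""
    probs = torch.sigmoid(logits)
    dims = (1, 2, 3, 4)
    intersection = (probs * target).sum(dim=dims)
    denom = probs.sum(dim=dims) + target.sum(dim=dims)
    return 1.0 - ((2.0 * intersection + eps) / (denom + eps)).mean()
```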
Ask a specific question about this device
(172 days)
AI-CVD® is an opportunistic AI-powered quantitative imaging tool that provides automated CT-derived anatomical and density-based measurements for clinician review. The device does not provide diagnostic interpretation or risk prediction. It is solely intended to aid physicians and other healthcare providers in determining whether additional diagnostic tests are appropriate for implementing preventive healthcare plans. AI-CVD® has a modular structure where each module is intended to report quantitative imaging measurements for each specific component of the CT scan. AI-CVD® quantitative imaging measurement modules include coronary artery calcium (CAC) score, aortic wall calcium score, aortic valve calcium score, mitral valve calcium score, cardiac chambers volumetry, epicardial fat volumetry, aorta and pulmonary artery sizing, lung density, liver density, bone mineral density, and muscle & fat composition.
Using AI-CVD® quantitative imaging measurements and their clinical evaluation, healthcare providers can investigate patients who are unaware of their risk of coronary heart disease, heart failure, atrial fibrillation, stroke, osteoporosis, liver steatosis, diabetes, and other adverse health conditions that may warrant additional risk assessment, monitoring or follow-up. AI-CVD® quantitative imaging measurements are to be reviewed by radiologists or other medical professionals and should only be used by healthcare providers in conjunction with clinical evaluation.
AI-CVD® is not intended to rule out the risk of cardiovascular diseases. AI-CVD® opportunistic screening software can be applied to non-contrast thoracic CT scans such as those obtained for CAC scans, lung cancer screening scans, and other chest diagnostic CT scans. Similarly, AI-CVD® opportunistic screening software can be applied to contrast-enhanced CT scans such as coronary CT angiography (CCTA) and CT pulmonary angiography (CTPA) scans. AI-CVD® opportunistic bone density module and liver density module can be applied to CT scans of the abdomen and pelvis. All volumetric quantitative imaging measurements from the AI-CVD® opportunistic screening software are adjusted by body surface area (BSA) and reported both in cubic centimeter volume (cc) and percentiles by gender reference data from people who participated in the Multi-Ethnic Study of Atherosclerosis (MESA) and Framingham Heart Study (FHS). Except for coronary artery calcium scoring, other AI-CVD® modules should not be ordered as a standalone CT scan but instead should be used as an opportunistic add-on to existing and new CT scans.
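As a hedged illustration of the BSA adjustment and percentile reporting just described, the sketch below uses the Du Bois BSA formula and a percentile lookup against a reference distribution. The submission does not state which BSA formula or interpolation AI-CVD® actually uses, and the reference array here is a placeholder for the MESA/FHS gender-specific data.

```python
from scipy.stats import percentileofscore

def bsa_dubois(weight_kg, height_cm):
    """Du Bois body surface area in m^2 (one common formula; assumed here)."""
    return 0.007184 * weight_kg**0.425 * height_cm**0.725

def indexed_percentile(volume_cc, weight_kg, height_cm, reference_cc_per_m2):
    """BSA-index a volume and locate it in a gender-matched reference cohort."""
    indexed = volume_cc / bsa_dubois(weight_kg, height_cm)
    return percentileofscore(reference_cc_per_m2, indexed)
```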
AI-CVD® is an opportunistic AI-powered modular tool that provides automated quantitative imaging reports on CT scans and outputs the following measurements:
- Coronary Artery Calcium Score
- Aortic Wall and Valves Calcium Scores
- Mitral Valve Calcium Score
- Cardiac Chambers Volume
- Epicardial Fat Volume
- Aorta and Main Pulmonary Artery Volume and Diameters
- Liver Attenuation Index
- Lung Attenuation Index
- Muscle and Visceral Fat
- Bone Mineral Density
The above quantitative imaging measurements enable care providers to take necessary actions to prevent adverse health outcomes.
AI-CVD® modules are installed by trained personnel only. AI-CVD® is executed via parent software which provides the necessary inputs and receives the outputs. The software itself does not offer user controls or access.
AI-CVD® reads a CT scan (in DICOM format) and extracts scan specific information like acquisition time, pixel size, scanner type, etc. AI-CVD® uses trained AI models that automatically segment and report quantitative imaging measurements specific to each AI-CVD® module. The output of each AI-CVD® module is inputted into the parent software which exports the results for review and confirmation by a human expert.
AI-CVD® is a post-processing tool that works on existing and new CT scans.
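The header-extraction step described above might look like the following pydicom sketch. The attribute names are standard DICOM tags, but the exact fields AI-CVD® reads and its error handling are not disclosed, so treat this as illustrative.

```python
import pydicom

ds = pydicom.dcmread("ct_slice.dcm")  # hypothetical input file
scan_info = {
    "acquisition_time": getattr(ds, "AcquisitionTime", None),
    "pixel_spacing_mm": getattr(ds, "PixelSpacing", None),
    "slice_thickness_mm": getattr(ds, "SliceThickness", None),
    "scanner": (getattr(ds, "Manufacturer", ""),
                getattr(ds, "ManufacturerModelName", "")),
}
```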
AI-CVD® passes if the human expert confirms that the segmentation highlighted by the AI-CVD® module is correctly placed on the target anatomical region. For example, the software passes if the human expert sees that the AI-CVD® cardiac chamber volumetry module has highlighted the heart anatomy.
AI-CVD® fails if the human expert sees that the segmentation highlighted by the AI-CVD® module is not correctly placed on the target anatomical region. For example, the software fails if the human expert sees that the AI-CVD® cardiac chamber volumetry module has highlighted the lung anatomy, a portion of the sternum, or any adjacent organs. Furthermore, the software fails if the human expert sees that the quality of the CT scan is compromised by image artifacts, severe motion, or excessive noise.
The user cannot change or edit the segmentation or results of the device. The user must accept or reject the segmentation where the AI-CVD® quantitative imaging measurements are performed.
AI-CVD® is an AI-powered post-processing tool that works on non-contrast and contrast-enhanced CT scans of chest and abdomen.
AI-CVD® is a multi-module deep learning-based software platform developed to automatically segment and quantify a broad range of cardiovascular, pulmonary, musculoskeletal, and metabolic biomarkers from standard chest or whole-body CT scans. The AI-CVD® system builds upon the open-source TotalSegmentator as its foundational segmentation framework, incorporating additional supervised learning and model-training layers specific to each module's clinical task.
The provided FDA 510(k) Clearance Letter for AI-CVD® outlines several modules, each with its own evaluation. However, the document does not provide a single, comprehensive table of acceptance criteria with reported device performance for all modules. Instead, it describes clinical validation studies and agreement analyses, generally stating "acceptable bias and reproducibility" or "acceptable agreement and reproducibility" without specific numerical thresholds or metrics. Similarly, detailed information on sample sizes, ground truth establishment methods (beyond general "manual reference standards" or "human expert knowledge"), and expert qualifications is quite limited for most modules.
Here's an attempt to extract and synthesize the information based on the provided text, recognizing the gaps:
Acceptance Criteria and Study Details for AI-CVD®
1. Table of Acceptance Criteria and Reported Device Performance
The document does not explicitly state numerical acceptance criteria for each module. Instead, it describes performance in terms of agreement with manual measurements or gold standard references, generally stating "acceptable bias and reproducibility" or "comparable performance." The table below summarizes what is reported.
| AI-CVD® Module | Acceptance Criteria (Implicit/General) | Reported Device Performance |
|---|---|---|
| Coronary Artery Calcium Score | Comparative safety and effectiveness with expert manual measurements. | Demonstrated comparative safety and effectiveness between expert manual measurements and both automated Agatston CAC scores and AI-derived relative density-based calcium scores. |
| Aortic Wall & Aortic Valve Calcium Scores | Acceptable bias and reproducibility compared to manual reference standards. | Bland-Altman agreement analyses demonstrated acceptable bias and reproducibility across imaging protocols. |
| Mitral Valve Calcium Score | Reproducible quantification compared to manual measurements. | Agreement analyses demonstrated reproducible mitral valve calcium quantification across imaging protocols. |
| Cardiac Chambers Volume | Based on previously FDA-cleared technology (AutoChamber™ K240786). | (No new performance data presented for this specific module as it leverages a cleared predicate). |
| Epicardial Fat Volume | Acceptable agreement and reproducibility with manual measurements. | Agreement studies comparing AI-derived epicardial fat volumes with manual measurements and across non-contrast and contrast-enhanced CT acquisitions demonstrated acceptable agreement and reproducibility. |
| Aorta & Main Pulmonary Artery Volume & Diameters | Low bias and comparable performance with manual reference measurements. | Agreement studies comparing AI-derived measurements with manual reference measurements demonstrated low bias and comparable performance across gated and non-gated CT acquisitions. Findings support reliability. |
| Liver Attenuation Index | Acceptable reproducibility across imaging protocols. | Agreement analysis comparing AI-derived liver attenuation measurements across imaging protocols demonstrated acceptable reproducibility. |
| Lung Attenuation Index | Reproducible measurements across CT acquisitions. | Agreement studies demonstrated reproducible lung density measurements across gated and non-gated CT acquisitions. |
| Muscle & Visceral Fat | Acceptable reproducibility across imaging protocols. | Agreement analyses between AI-derived fat and muscle measurements demonstrated acceptable reproducibility across imaging protocols. |
| Bone Mineral Density | Based on previously FDA-cleared technology (AutoBMD K213760). | (No new performance data presented for this specific module as it leverages a cleared predicate). |
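Several rows above cite Bland-Altman agreement. For reference, the standard Bland-Altman statistics (bias and 95% limits of agreement) reduce to a few lines of numpy; this is the textbook computation, not code from the submission.

```python
import numpy as np

def bland_altman(ai_measurements, manual_measurements):
    """Return the mean difference (bias) and 95% limits of agreement."""
    diff = np.asarray(ai_measurements) - np.asarray(manual_measurements)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```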
2. Sample Size and Data Provenance for the Test Set
- Coronary Artery Calcium (CAC) Score:
- Sample Size: 913 consecutive coronary calcium screening CT scans.
- Data Provenance: "Real-world" data acquired across three community imaging centers. This suggests a retrospective collection from a U.S. or similar healthcare system, though the specific country of origin is not explicitly stated. The term "consecutive" implies that selection bias was minimized.
- Other Modules (Aortic Wall/Valve, Mitral Valve, Epicardial Fat, Aorta/Pulmonary Artery, Liver, Lung, Muscle/Visceral Fat):
- The document refers to "agreement analyses" and "agreement studies" but does not specify the sample size for the test sets used for these individual modules.
- Data Provenance: The document generally states that "clinical validation studies were performed based upon retrospective analyses of AI-CVD® measurements performed on large population cohorts such as the Multi-Ethnic Study of Atherosclerosis (MESA) and Framingham Heart Study (FHS)." It is unclear if these cohorts were solely used for retrospective analysis, or if the "real-world" data mentioned for CAC was also included for other modules. MESA and FHS are prospective, longitudinal studies conducted primarily in the U.S.
3. Number of Experts and Qualifications for Ground Truth
- Coronary Artery Calcium (CAC) Score:
- Number of Experts: Unspecified, referred to as "expert manual measurements."
- Qualifications: Unspecified, but implied to be human experts capable of performing manual Agatston scoring.
- Other Modules:
- Number of Experts: Unspecified, generally referred to as "manual reference standards" or "manual measurements."
- Qualifications: Unspecified.
4. Adjudication Method for the Test Set
The document does not describe a specific adjudication method (e.g., 2+1, 3+1) for establishing ground truth on the test set. It mentions "expert manual measurements" or "manual reference standards," suggesting that the ground truth was established by human experts, but the process of resolving discrepancies among multiple experts (if any were used) is not detailed.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was an MRMC study done? No, the document does not describe an MRMC comparative effectiveness study evaluating human readers' performance with and without AI assistance. The performance data presented focus on standalone AI performance compared to human expert measurements.
- Effect Size of Human Reader Improvement: Not applicable, as an MRMC study was not described.
6. Standalone (Algorithm Only) Performance Study
- Was a standalone study done? Yes, the described performance evaluations for all modules (where new performance data was presented) are standalone performance studies. The studies compare the AI-CVD® algorithm's output directly against manual measurements or established reference standards.
7. Type of Ground Truth Used
- Coronary Artery Calcium Score: Expert manual measurements (Agatston scores; the Agatston method is sketched after this list).
- Aortic Wall and Aortic Valve Calcium Scores: Manual reference standards.
- Mitral Valve Calcium Score: Manual measurements.
- Epicardial Fat Volume: Manual measurements.
- Aorta and Main Pulmonary Artery Volume and Diameters: Manual reference measurements.
- Liver Attenuation Index: (Implicitly) Manual reference measurements or established methods for hepatic attenuation.
- Lung Attenuation Index: (Implicitly) Manual reference measurements or established methods for lung density.
- Muscle and Visceral Fat: (Implicitly) Manual reference measurements.
- Cardiac Chambers Volume & Bone Mineral Density: Leveraged previously cleared predicate devices, suggesting the ground truth for their original clearance would apply.
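As promised in the CAC bullet above, here is a hedged sketch of classical per-slice Agatston scoring (130 HU threshold, lesion area of at least 1 mm², density weights 1 to 4). AI-CVD®'s actual implementation is not disclosed, and the coronary mask is assumed to come from the upstream segmentation.

```python
import numpy as np
from scipy import ndimage

def agatston_slice(hu, coronary_mask, pixel_area_mm2):
    """Agatston score contribution of one axial slice.
    hu: 2-D array of Hounsfield units; coronary_mask: 2-D boolean array."""
    candidates = (hu >= 130) & coronary_mask
    labels, n = ndimage.label(candidates)
    score = 0.0
    for i in range(1, n + 1):
        lesion = labels == i
        area = lesion.sum() * pixel_area_mm2
        if area < 1.0:          # ignore sub-millimeter specks
            continue
        peak = hu[lesion].max()
        # Density weight per the published Agatston scheme.
        weight = 1 if peak < 200 else 2 if peak < 300 else 3 if peak < 400 else 4
        score += area * weight
    return score  # the total score is the sum over all slices
```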
8. Sample Size for the Training Set
The document provides information on the foundational segmentation framework (TotalSegmentator) and hints at customization for AI-CVD® modules:
- TotalSegmentator (Foundational Framework):
- General anatomical segmentation: 1,139 total body CT cases.
- High-resolution cardiac structure segmentation: 447 coronary CT angiography (CCTA) scans.
- AI-CVD® Custom Datasets: The document states that "Custom datasets were constructed for coronary artery calcium scoring, aortic and valvular calcifications, cardiac chamber volumetry, epicardial and visceral fat quantification, bone mineral density assessment, liver fat estimation, muscle mass and quality, and lung attenuation analysis." However, it does not provide the specific sample sizes for these custom training datasets for each AI-CVD® module.
9. How Ground Truth for the Training Set Was Established
- TotalSegmentator (Foundational Framework): The architecture utilizes nnU-Net, which was trained on the described CT cases. Implicitly, these cases would have had expert-derived ground truth segmentations for training the neural network.
- AI-CVD® Custom Datasets: "For each module, iterative model enhancement was applied: human reviewers evaluated model-generated segmentations and corrected any inaccuracies, and these corrections were looped back into the training process to improve performance and generalizability." This indicates that human experts established and refined the ground truth by reviewing and correcting model-generated segmentations, which were then used for retraining. The qualifications of these "human reviewers" are not specified.
Ask a specific question about this device
(154 days)
The PVAD IQ software is intended for non-invasive analysis of ultrasound images to detect and measure structures from cardiac ultrasound of patients 18 years old and above, with a Percutaneous Ventricular Assist Device (PVAD). Such use is typically utilized for clinical decision support by a qualified physician.
PVAD IQ is a Software as a Medical Device (SaMD) solution designed to support clinicians in the positioning of Percutaneous Ventricular Assist devices (PVADs) through ultrasound image-based assessment. Percutaneous Ventricular Assist device is a temporary device used to provide hemodynamic support for patients experiencing cardiogenic shock or undergoing high-risk percutaneous coronary interventions (PCI).
The PVAD IQ software is a machine learning model (MLM)-based software that operates on ultrasound clips (the system input) and provides two outputs for PVAD patients:
- Landmark identification and measurement: detects the positions of the two landmarks (the aortic annulus and the PVAD inlet) and computes the mean distance between them.
- Acceptability classification: a binary classification of ultrasound clips as "acceptable" or "non-acceptable" in terms of the visibility of the two landmarks. A clip is defined as acceptable when both landmarks are simultaneously visible in a manner suitable for quantitative imaging.
The User Interface (UI) enables the user to review or hide the mean distance measurement, annotate desired images, and add manual measurement, while keeping the raw data for further review as needed.
The software output is shown on the screen either as the mean distance measurement, or as a notification related to non-acceptable clips.
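The clip-level mean distance described above is, in essence, a per-frame Euclidean distance averaged over frames. The sketch below illustrates that computation; the array shapes, units, and names are assumptions rather than details from the submission.

```python
import numpy as np

def mean_landmark_distance(annulus_xy, inlet_xy):
    """annulus_xy, inlet_xy: (n_frames, 2) landmark coordinates in cm.
    Returns the mean per-frame distance between the two landmarks."""
    per_frame = np.linalg.norm(np.asarray(annulus_xy) - np.asarray(inlet_xy), axis=1)
    return float(per_frame.mean())
```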
The PVAD IQ software, based on a machine learning model (MLM), provides two primary outputs for patients with Percutaneous Ventricular Assist Devices (PVADs): landmark identification and measurement (specifically, the distance between the aortic annulus and the PVAD inlet) and acceptability classification of ultrasound clips.
1. Acceptance Criteria and Reported Device Performance
The study established pre-specified acceptance criteria for the PVAD IQ software's performance, which it met.
| Acceptance Criteria | Threshold | Reported Device Performance |
|---|---|---|
| Distance Measurement (MAE) | Below 0.5 cm | 0.42 cm (95% CI: 0.38–0.47 cm) |
| Acceptability Classification (Cohen's Kappa) | Above 0.6 | 0.71 (95% CI: 0.66–0.75) |
| Landmark Detection (AUC) - PVAD Inlet | Above 0.8 | 0.92 (95% CI: 0.90–0.94) |
| Landmark Detection (AUC) - Aortic Annulus | Above 0.8 | 0.98 (95% CI: 0.95–1.00) |
| Landmark Position (MAE) - PVAD Inlet | Below 0.5 cm | 0.44 cm (95% CI: 0.41–0.48 cm) |
| Landmark Position (MAE) - Aortic Annulus | Below 0.5 cm | 0.31 cm (95% CI: 0.30–0.33 cm) |
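The three endpoint types in the table (MAE, Cohen's Kappa, AUC) map directly onto standard scikit-learn metrics. The sketch below shows the conventional computation; the argument names are illustrative, and the submission does not state which toolkit was used.

```python
from sklearn.metrics import cohen_kappa_score, mean_absolute_error, roc_auc_score

def evaluate(gt_dist_cm, pred_dist_cm, gt_accept, pred_accept, gt_vis, vis_score):
    """Return (MAE in cm, Cohen's kappa, AUC) for the three endpoints above."""
    return (mean_absolute_error(gt_dist_cm, pred_dist_cm),  # criterion: < 0.5 cm
            cohen_kappa_score(gt_accept, pred_accept),      # criterion: > 0.6
            roc_auc_score(gt_vis, vis_score))               # criterion: > 0.8
```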
2. Sample Size and Data Provenance for Test Set
- Sample Size: 963 clips
- Number of Patients: 186 patients
- Data Provenance: Geographically distinct test datasets. While specific countries are not mentioned, the ground truth annotations were provided by US (United States) board certified cardiac sonographers. The timing (retrospective or prospective) is not specified, but the data was used for evaluating a previously trained model, which often implies a retrospective application to a held-out test set.
3. Number and Qualifications of Experts for Ground Truth (Test Set)
- Number of Experts: Not explicitly stated as a specific number of individual experts. The document refers to "US (United States) board certified cardiac sonographers."
- Qualifications of Experts: "US (United States) board certified cardiac sonographers experienced in PVAD/Impella® echocardiographic imaging."
4. Adjudication Method for Test Set
The adjudication method is not explicitly stated in the provided document. It only mentions that ground truth annotations were "provided by US (United States) board certified cardiac sonographers." It does not specify if multiple sonographers reviewed each case, how disagreements were resolved, or if a consensus mechanism (like 2+1 or 3+1) was used.
5. MRMC Comparative Effectiveness Study
An MRMC (Multi-Reader Multi-Case) comparative effectiveness study comparing AI assistance with unassisted human readers was not mentioned in the provided document. The study focused on the standalone performance of the PVAD IQ software.
6. Standalone Performance Study
Yes, a standalone (algorithm only without human-in-the-loop performance) study was conducted. The reported performance metrics (MAE, Cohen's Kappa, AUC) directly assess the algorithm's performance against the established ground truth.
7. Type of Ground Truth Used
The ground truth used was expert consensus/annotations. Specifically, "Ground truth annotations for the distance between the aortic annulus and the PVAD inlet were provided by US (United States) board certified cardiac sonographers experienced in PVAD/Impella® echocardiographic imaging." This implies human experts manually defining the "correct" measurements and classifications.
8. Sample Size for the Training Set
The sample size for the training set is not provided in this document. The document states that the PVAD IQ software is "trained with clinical data" but does not specify the volume or characteristics of this training data.
9. How Ground Truth for Training Set Was Established
The method for establishing ground truth for the training set is not explicitly detailed in this document. It broadly states that the software uses "non-adaptive machine learning algorithms trained with clinical data" and "refining annotations" is part of model retraining (under PCCP). While it can be inferred that ground truth for training data would also involve expert annotations, similar to the test set, the specific process, number of experts, or their qualifications for the training data are not provided.
Ask a specific question about this device