Ligence Heart is a fully automated software platform that processes, analyses and makes measurements on acquired transthoracic cardiac ultrasound images, automatically producing a full report with measurements of several key cardiac structural and functional parameters. The data produced by this software is intended to be used to support qualified cardiologists or sonographers for clinical decision making. Ligence Heart is indicated for use in adult patients. Ligence Heart has not been validated for the assessment of congenital heart disease, valve disease, pericardial disease, and/or intra-cardiac lesions (e.g., tumors, thrombi).
AI Platform 2.2 is intended for noninvasive processing of ultrasound images to detect, measure, and calculate relevant medical parameters of structures and function of patients with suspected disease. In addition, it can provide Quality Score feedback to assist healthcare professionals, trained and qualified to conduct echocardiography, abdominal, and lung ultrasound scans in the current standard of care while acquiring ultrasound images. The device is intended to be used on images of adult patients.
Exo AI Platform 2.2 (AIP 2.2) is a software as a medical device (SaMD) that helps qualified users with image-based assessment of ultrasound examinations in adult patients. It is designed to simplify workflow by helping trained healthcare providers evaluate, quantify, and generate reports for ultrasound images. AIP 2.2 takes as an input in the Digital Imaging and Communications in Medicine (DICOM) format from ultrasound scanners of a specific range and allows users to detect, measure, and calculate relevant medical parameters of structures and function of patients with suspected disease. In addition, it provides frame and clip quality score in real-time for the Left Ventricle from the four-chamber apical and parasternal long axis views of the heart, Abdominal Upper Quadrant and Pelvic views, and lung scans.
The AI modules are provided as software components to be integrated by a third-party developer into a legally marketed ultrasound imaging device; in effect, the Algorithm and API modules are medical device accessories.
Key features of the software are:
- Lung AI: An AI-assisted tool for suggesting the presence of lung structures and artifacts on ultrasound images, namely A-lines and B-lines.
- Cardiac AI: An AI-assisted tool for the quantification of Left Ventricular Ejection Fraction (LVEF), Myocardium wall thickness (Interventricular Septum (IVSd), Posterior wall (PWd)), and IVC diameter on cardiac ultrasound images.
- Quality AI: An AI tool designed to assess ultrasound per frame and per clip quality across Cardiac (A4C and PLAX), Lung, and Abdominal (Upper Quadrants and Pelvic) views.
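As background for the Cardiac AI bullet above, LVEF and the related stroke-volume and cardiac-output quantities follow from standard physiological definitions; the sketch below shows those definitions only (function names are illustrative, not part of the device's API):

```python
def lvef_pct(edv_ml, esv_ml):
    """Left Ventricular Ejection Fraction (%), from end-diastolic and
    end-systolic volumes in mL (standard clinical definition)."""
    stroke_volume = edv_ml - esv_ml          # mL ejected per beat
    return 100.0 * stroke_volume / edv_ml

def cardiac_output_l_min(stroke_volume_ml, heart_rate_bpm):
    """Cardiac output in L/min = stroke volume (mL) x heart rate (bpm) / 1000."""
    return stroke_volume_ml * heart_rate_bpm / 1000.0
```

For example, an EDV of 120 mL and ESV of 50 mL give an LVEF of about 58%, within the normal adult range.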
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided FDA 510(k) clearance letter excerpt for AI Platform 2.2 (AIP002):
1. Acceptance Criteria and Reported Device Performance
The acceptance criteria for the Quality AI functionality, specifically for abdominal views (upper quadrants and pelvic), focused on the agreement between the AI's quality rating and expert sonographers' ratings, as well as the ability of the AI to identify diagnostic quality scans.
| Acceptance Criteria Category | Specific Metric/Target | Reported Device Performance |
|---|---|---|
| Agreement (Retrospective Data) | Overall agreement (Interclass Correlation Coefficient - ICC) between Quality AI and experienced sonographers for frames. | ICC = 0.94 (95% CI: 0.94 – 0.95) |
| Agreement (Retrospective Data) | Overall agreement (ICC) between Quality AI and experienced sonographers for clips. | ICC = 0.95 (95% CI: 0.94 – 0.96) |
| Diagnostic Quality Identification (Clinical Use Case - AI Feedback to User) | Percentage of clips rated as ACEP quality of 3 or above by expert readers that also received at least "Minimum criteria met for diagnosis" image quality by Clip Quality AI. | 96.6% |
| Diagnostic Quality Identification (Clinical Use Case - AI Feedback to User) | Percentage of scans considered as "Minimum criteria met for diagnosis" or "good" by Quality AI that were also deemed diagnostic by experts (ACEP score of 3 or higher). | 96.1% |
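The two diagnostic-quality percentages in the table are simple conditional proportions. A minimal sketch of how they could be computed is below; the record layout (`acep`, `ai_quality`) and label strings are assumptions for illustration, not taken from the 510(k) summary:

```python
def agreement_rates(clips):
    """clips: list of dicts with 'acep' (expert ACEP score, 1-5) and
    'ai_quality' (AI label: 'insufficient', 'minimum', or 'good').
    Returns the two conditional agreement percentages."""
    ai_ok = {"minimum", "good"}                     # AI labels counted as diagnostic
    expert_dx = [c for c in clips if c["acep"] >= 3]            # expert-diagnostic clips
    ai_dx = [c for c in clips if c["ai_quality"] in ai_ok]      # AI-diagnostic clips

    # % of expert-diagnostic clips the AI also rated diagnostic (cf. 96.6%)
    pct_expert_confirmed_by_ai = 100.0 * sum(
        c["ai_quality"] in ai_ok for c in expert_dx) / len(expert_dx)
    # % of AI-diagnostic clips experts also deemed diagnostic (cf. 96.1%)
    pct_ai_confirmed_by_expert = 100.0 * sum(
        c["acep"] >= 3 for c in ai_dx) / len(ai_dx)
    return pct_expert_confirmed_by_ai, pct_ai_confirmed_by_expert
```

These are, respectively, a sensitivity-like and a predictive-value-like quantity with the expert ACEP rating treated as truth.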
2. Sample Size Used for the Test Set and Data Provenance
The provided text describes two distinct validation activities:
- Retrospective Data Analysis:
- Sample Size: 200 clips (comprising 29,371 frames) from 184 patients.
- Data Provenance: The data was "previously acquired from various ultrasound devices and various abdominal pathologies." It encompassed "diverse demographic variables, including gender, age, and ethnicity from multiple clinical sites in metropolitan cities with diverse racial patient populations." The specific countries of origin are not explicitly stated, but "metropolitan cities with diverse racial patient populations" suggests a varied, likely multi-national, origin or at least from diverse populations within a country (e.g., USA). The data was retrospective as it was "previously acquired."
- Prospective Clinical Use Case (Real-time Scanning):
- Sample Size: 186 abdomen scans.
- Data Provenance: Data was "acquired using the image and clip Quality AI in real time while scanning the pelvic and upper quadrant views of the abdomen." The details suggest this was prospective data collection specifically for validating the real-time feedback. Similar to the retrospective data, it encompassed diverse demographics and was acquired from different sites.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
- Retrospective Data Analysis: "experienced sonographers" were used for quality rating on each frame and the entire clip. The exact number of sonographers is not specified. Their qualifications are described as "experienced sonographers."
- Prospective Clinical Use Case: "expert readers" were used to rate the ACEP quality of the clips and deem scans as diagnostic. The exact number of expert readers is not specified, but they rated clips based on ACEP quality (American College of Emergency Physicians), suggesting expertise in emergency ultrasound or similar fields.
4. Adjudication Method for the Test Set
The document does not explicitly state the adjudication method (e.g., 2+1, 3+1) used for establishing ground truth. It mentions "quality rating by experienced sonographers" and "experts" deeming scans diagnostic, implying expert consensus or individual expert judgments, but the process for resolving disagreements (if any) is not detailed.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study and Effect Size of Human-Reader Improvement with AI Assistance
The provided text does not mention an MRMC comparative effectiveness study comparing human readers with and without AI assistance. The studies described focus on the standalone performance of the AI in assessing image quality and its agreement with human experts, and on how the AI's real-time feedback correlates with expert-rated diagnostic quality, not on improvement in human reader performance.
6. Standalone (Algorithm-Only, Without Human-in-the-Loop) Performance Assessment
Yes, a standalone performance assessment was conducted.
- The retrospective data analysis directly evaluates the "overall agreement between the Quality AI and quality rated by the experienced sonographers" (ICC of 0.94 for frames and 0.95 for clips). This is a measurement of the AI algorithm's performance in isolation against established ground truth.
- The AI Platform 2.2 provides "Quality Score feedback to assist healthcare professionals... while acquiring ultrasound images." The validation results showing the correlation between the Quality AI's assessment and expert-rated diagnostic quality (96.6% and 96.1%) reflect the standalone algorithm's ability to assess quality, which then serves as feedback.
7. The Type of Ground Truth Used
The ground truth for the validation studies was primarily established through expert consensus/judgment.
- For the retrospective analysis, it was "quality rating by experienced sonographers."
- For the prospective clinical use case, it involved "expert readers" who rated the ACEP quality and deemed scans diagnostic.
8. The Sample Size for the Training Set
The sample size for the training set is not provided in the given FDA 510(k) summary. The document explicitly states that the "test data was entirely separated from the training/tuning datasets and was not used for any part of the training/tuning," but it does not disclose the size of the training or tuning datasets themselves.
9. How the Ground Truth for the Training Set Was Established
The document states that the AI algorithms are "trained with clinical data." It is implied that this clinical data, used for training, would have had its ground truth established similarly to the test sets, likely through expert annotation or review by qualified medical professionals. However, the exact methodology for establishing ground truth for the training set is not detailed in this summary.
AI4CMR software is designed to report cardiac function measurements (ventricle volumes, ejection fraction, indices, etc.) from 1.5T and 3T magnetic resonance (MR) scanners. AI4CMR uses artificial intelligence to automatically segment and quantify the different cardiac measurements. Its results are not intended to be used on a stand-alone basis for clinical decision-making. The user incorporating AI4CMR into their DICOM application of choice is responsible for implementing a user interface.
AI4CMR also supports clinical diagnostics by calculating flow measurements of vascular structures in 2D phase-encoded cardiac MR images.
AI4CMR v2.0 is a cloud-based solution designed to integrate with any third-party DICOM viewer application, where the DICOM viewer serves as both the user interface and the interface to a PACS or scanner for AI4CMR. The user implements AI4CMR as a plug-in to the DICOM viewer; it then automatically processes and analyses cardiac MR images received by the viewer, quantifies relevant cardiac function metrics and analytical flow measurements, and makes the results available at the user's discretion.
The following are the cardiac function metrics quantified and reported by the software:
Quantitative Analysis
The subject device performs the following:
Cardiac function measurements in cine sequences
- Anatomy and tissue segmentation
- LV/RV stroke volume
- LV/RV cardiac output
- LV/RV ejection fraction
- LV/RV end-diastolic volume
- LV/RV end-systolic volume
Analytical flow quantification in 2D Phase-Contrast Sequences (2DFlow)
- Total Forward/Backward Net Volumes
- Regurgitation Fraction
- Effective Volume per Minute
- Maximum Velocity
- Pressure Gradient
- Flow values
Reporting
The subject device enables the following metrics to be reported as desired by the user:
Cardiac function measurements
- LV/RV stroke volume
- LV/RV cardiac output
- LV/RV ejection fraction
- LV/RV end-diastolic volume
- LV/RV end-systolic volume
- LV myocardial mass
- LV/RV end-systolic frame
- LV/RV end-diastolic frame
- LV/RV end-systolic volume index
- LV/RV end-diastolic volume index
- LV/RV stroke volume index
- Myocardium mass index
- Cardiac index
Analytical flow measurements (2DFlow)
- Total Forward Volume
- Total Backward Volume
- Total Volume
- Total Net Positive Volume
- Total Net Negative Volume
- Regurgitation Fraction
- Volume per Minute
- Effective Volume per Minute
- Maximum Pressure Gradient
- Maximum Velocity
- Minimum Velocity
- Maximum Mean Velocity
- Maximum Flow
- Minimum Flow
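Several of the 2DFlow quantities listed above follow directly from a sampled flow curve over one cardiac cycle. The sketch below illustrates those standard definitions under stated assumptions (the function name and units are illustrative, and the peak pressure gradient uses the conventional simplified Bernoulli relation ΔP = 4·v², which the 510(k) summary does not itself specify):

```python
def flow_metrics(flow_ml_s, dt_s, vmax_m_s):
    """flow_ml_s: flow samples in mL/s over one cardiac cycle, positive =
    forward flow; dt_s: sampling interval in s; vmax_m_s: peak velocity in m/s."""
    forward = sum(max(q, 0.0) for q in flow_ml_s) * dt_s    # total forward volume, mL
    backward = -sum(min(q, 0.0) for q in flow_ml_s) * dt_s  # total backward volume, mL
    net = forward - backward                                 # total net volume, mL
    # Regurgitation fraction: backward as a percentage of forward volume
    regurg_pct = 100.0 * backward / forward if forward else 0.0
    # Simplified Bernoulli: peak pressure gradient (mmHg) from peak velocity (m/s)
    max_gradient_mmhg = 4.0 * vmax_m_s ** 2
    return {"forward": forward, "backward": backward, "net": net,
            "regurgitation_pct": regurg_pct, "max_gradient_mmHg": max_gradient_mmhg}
```

Effective volume per minute would then follow by scaling the net volume per cycle by the heart rate.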
Here's an analysis of the acceptance criteria and study detailed in the provided FDA 510(k) clearance letter for AI4CMR v2.0, structured according to your requirements:
Acceptance Criteria and Study Details for AI4CMR v2.0
1. Table of Acceptance Criteria and Reported Device Performance
The primary acceptance criterion mentioned explicitly for the AI-based 2D Flow Segmentation model is based on the Dice Similarity Coefficient (DSC) for vessel segmentation.
| Metric (Acceptance Criteria) | Acceptance Threshold | Reported Device Performance (Independent Test Set) |
|---|---|---|
| Segmentation (DSC), per vessel | > 0.70 | Ascending aorta: 0.952; descending aorta: 0.957; pulmonary artery: 0.952 |
| Segmentation (mean DSC) | Not explicitly stated, but implied by the per-vessel thresholds | 0.857 (robustness against manual reference annotations) |
| Total Forward Volume (TFV) | No specific threshold provided for agreement with the reference device | ICC: 0.95 (agreement with predicate/reference device) |
| Total Backward Volume (TBV) | No specific threshold provided for agreement with the reference device | ICC: 0.82 (agreement with predicate/reference device) |
| Maximum Velocity (Vmax) | No specific threshold provided for agreement with the reference device | ICC: 0.95 (agreement with predicate/reference device) |
Note on ICC values: While high ICC values (e.g., 0.95) typically indicate excellent agreement, and 0.82 indicates substantial agreement, the document does not explicitly state acceptance thresholds for these metrics with the reference device. The statement "demonstrated consistent agreement" and supporting the claim of substantial equivalence implies these values met an internal, unstated acceptance level.
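The summary does not state which ICC form was used. One common choice for absolute agreement between a device and a reference, ICC(2,1) (two-way random effects, single measurement), can be sketched in plain Python as follows; this is background on the statistic, not the vendor's implementation:

```python
def icc_2_1(scores):
    """scores: list of n rows, one per subject, each with k ratings
    (e.g., [device_value, reference_value]). Returns ICC(2,1)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    msr = ss_rows / (n - 1)                                   # between-subject mean square
    msc = ss_cols / (k - 1)                                   # between-rater mean square
    mse = (ss_tot - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

With identical ratings in every column the statistic is exactly 1; a constant offset between raters pulls it below 1 because ICC(2,1) penalizes absolute disagreement, not just poor correlation.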
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size for AI-based 2D Flow Segmentation (Independent Test Set): Approximately 15% of the subjects from the development dataset. The total development dataset comprised 167 cardiac MR cases from 61 adult subjects. While not directly stated, applying 15% to 61 subjects would yield roughly 9-10 subjects in the independent test set. The document also states that 296 vessel samples (ascending aorta, descending aorta, and pulmonary artery) were included in total, and these were stratified by vessel type and split at the subject level.
- Data Provenance: Retrospective clinical data from a tertiary Western European hospital.
- Data Acquisition: Images were acquired using standard cardiac MR 2D phase-contrast flow imaging protocols on 1.5T and 3T scanners. The dataset included both normal and pathological cases.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
- Number of Experts: One expert reader.
- Qualifications: EACVI level III. (The European Association of Cardiovascular Imaging - EACVI - provides certification levels for expertise in cardiovascular imaging, with Level III indicating high-level expertise).
4. Adjudication Method for the Test Set
The document states, "An expert reader (EACVI level III) independently annotated all cases." This implies a single reader ground truth with no explicit adjudication method (e.g., 2+1 or 3+1). The text mentions "intra-reader reliability maintained by following established standards," suggesting the expert's consistency was ensured, but not through external adjudication.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
No MRMC comparative effectiveness study was mentioned. The study focused on the standalone performance of the AI model and its agreement with a legally marketed reference device, not on how human readers improve with AI assistance.
6. Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study
Yes, a standalone performance study was done for the AI-based 2D Flow Segmentation model. The reported Dice Similarity Coefficient (DSC), Precision, and Recall values are all measures of the algorithm's standalone performance compared to the expert-derived ground truth.
Separately, the agreement of the device's flow measurements (TFV, TBV, Vmax) with a legally marketed reference device (cvi42) also reflects standalone algorithm performance against another established algorithm.
7. Type of Ground Truth Used
The ground truth used for the AI-based 2D Flow Segmentation was expert manual annotation: one EACVI level III expert independently annotated all cases using standard segmentation guidelines (a single-reader ground truth, not a multi-expert consensus).
8. Sample Size for the Training Set
The development dataset comprised 167 cardiac MR cases from 61 adult subjects. This dataset was split into training, validation, and independent test sets at a 70%/15%/15% ratio (of subjects). Therefore, the training set would include approximately 70% of 61 subjects, which is about 42-43 subjects. All vessels and repeated acquisitions from these subjects were assigned to the training set.
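The subject-level 70%/15%/15% split described above can be sketched as follows. The function name, seeding, and rounding are assumptions for illustration; the essential property, stated in the summary, is that all acquisitions from one subject land in a single partition so there is no subject-level leakage between training and test data:

```python
import random

def split_subjects(subject_ids, seed=0):
    """Partition unique subject IDs into train/val/test sets at roughly
    70/15/15; every acquisition from a subject follows its subject."""
    ids = sorted(set(subject_ids))       # deduplicate and fix order for reproducibility
    random.Random(seed).shuffle(ids)     # seeded shuffle before partitioning
    n = len(ids)
    n_train, n_val = round(0.70 * n), round(0.15 * n)
    return (set(ids[:n_train]),
            set(ids[n_train:n_train + n_val]),
            set(ids[n_train + n_val:]))
```

Applied to 61 subjects this yields 43/9/9, consistent with the "about 42-43 subjects" training estimate above.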
9. How the Ground Truth for the Training Set Was Established
The ground truth for the training set was established in the same manner as the ground truth for the test set: through the independent annotations of a single EACVI level III expert. The document states, "An expert reader (EACVI level III) independently annotated all cases using standard segmentation guidelines to ensure consistency and algorithm generalizability, with intra-reader reliability maintained by following established standards." This process applied to the entire dataset before it was split into training, validation, and test sets.
MI View&GO is a medical diagnostic application for viewing, manipulation, quantification, analysis and comparison of medical images with one or more time-points. MI View&GO supports functional data, such as positron emission tomography (PET) or nuclear medicine (NM), as well as anatomical datasets, such as computed tomography (CT) or magnetic resonance (MR).
MI View&GO is intended to be utilized by appropriately trained health care professionals to aid in the management of diseases associated with oncology, cardiology, neurology, and organ function. The images and results produced by MI View&GO can also be used by the physician to aid in radiotherapy treatment planning.
MI View&GO is a software-only medical device which will be delivered in conjunction with Siemens SPECT/CT and PET/CT scanners. MI View&GO software provides additional specific capabilities for handling of PET and SPECT as well as CT and MR data directly at the acquisition console.
The MI View&GO software integrates molecular imaging more efficiently in the clinical environment by providing an interface for its users to review, post-process and read medical images immediately after acquisition. The purpose of the MI View&GO is to allow the technologist and reading physician to:
- Review acquired and reconstructed images at the scanner console
- Determine that the acquired data is of sufficient quality for reading, so the patient can be released.
- Prepare images for reading
- Perform a basic read
Here's an analysis of the acceptance criteria and study detailed in the provided FDA 510(k) clearance letter for MI View&GO, structured according to your requested points:
Acceptance Criteria and Device Performance Study for MI View&GO (K254016)
1. Acceptance Criteria and Reported Device Performance
| Acceptance Criteria Category | Specific Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Improved Lung Segmentation (Auto Lung 3D) | For new organs: N/A (lung lobes are existing organs with improved models) | Not applicable, as lung lobes are "improved organs," not "new organs." |
| Improved Lung Segmentation (Auto Lung 3D) | For unchanged organs (other than lungs and lung lobes): Dice score remains unchanged | Dice scores on other (not retrained) organs remained unchanged, verified by recalculating the Dice score with the new algorithm. |
| Improved Lung Segmentation (Auto Lung 3D) | For improved organs (lung lobes): average Dice coefficient per organ ≥ the average Dice coefficient per organ of the predicate algorithm | The average Dice coefficient across all 20 subjects was higher for each lobe in the subject device than in the predicate device, though not by more than +0.03 for any lobe. |
| Improved PERCIST Liver Algorithm (binary liver mask input) | Average Dice coefficient > 0.8 | The liver met this criterion. |
| Improved PERCIST Liver Algorithm (binary liver mask input) | Average Symmetric Surface Distance (ASSD) < 10 mm | The liver met this criterion. |
| Improved PERCIST Liver Algorithm (reference region placement) | N/A (comparative analysis, not a single-metric criterion) | Demonstrated better agreement with semi-automatic evaluation by expert readers than the predicate method. |
| Improved PERCIST Liver Algorithm (intersection with suspicious uptake masks) | N/A (comparative analysis; fewer intersections is better) | The subject device had fewer intersections (4 cases) than the predicate device (13 cases) out of 129 subjects. |
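The Dice coefficient and ASSD criteria in the table are standard segmentation metrics. A minimal plain-Python sketch of both follows; representing masks as sets of voxel indices and surfaces as 2D point lists is an illustrative simplification of how imaging toolkits actually store them:

```python
def dice(mask_a, mask_b):
    """Dice Similarity Coefficient between two masks given as sets of
    voxel indices: 2|A ∩ B| / (|A| + |B|)."""
    return 2.0 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

def assd(points_a, points_b):
    """Average Symmetric Surface Distance between two surfaces given as
    lists of 2D points: mean of nearest-neighbour distances in both directions."""
    def dist(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    a_to_b = [min(dist(p, q) for q in points_b) for p in points_a]
    b_to_a = [min(dist(q, p) for p in points_a) for q in points_b]
    return (sum(a_to_b) + sum(b_to_a)) / (len(a_to_b) + len(b_to_a))
```

Dice measures volumetric overlap (1.0 = perfect), while ASSD measures boundary distance in physical units (0 = perfect), which is why the table uses both: a mask can overlap well overall yet still have locally distant boundaries.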
2. Sample Size Used for the Test Set and Data Provenance
- Improved Lung Segmentation:
- Sample Size: 20 patients.
- Data Provenance:
- Retrospective.
- Half of the patients were new, and the other 50% were randomly selected from the predicate testing cohort.
- 50% of patients were from the US.
- All patients from Siemens Scanner.
- Improved PERCIST Liver Algorithm (binary liver mask input):
- Sample Size: 20 patients.
- Data Provenance:
- Patients obtained from clinical partners in Europe and USA.
- Randomly selected with stratification.
- All subjects from Siemens Scanner.
- Improved PERCIST Liver Algorithm (Reference Region Placement & Intersection with Suspicious Uptake Masks):
- Sample Size: 129 subjects for the "intersection with suspicious uptake masks" analysis.
- Data Provenance: Not explicitly stated for the "reference region placement" analysis, but implied to be from the same or similar source as the 129 subjects.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- Improved Lung Segmentation: Not explicitly stated. The ground truth for segmentation metrics (Dice, ASSD) is typically established by manual segmentation performed by experts, but the number of experts and their qualifications are not detailed in this document.
- Improved PERCIST Liver Algorithm (Reference Region Placement):
- Number of Experts: Two expert readers.
- Qualifications: Described only as "expert readers"; specific qualifications (e.g., a radiologist with a stated number of years of experience) are not provided.
- Improved PERCIST Liver Algorithm (Intersection with Suspicious Uptake Masks):
- Number of Experts: One expert reader.
- Qualifications: "Expert reader" is mentioned; specific qualifications are not provided.
4. Adjudication Method for the Test Set
- Improved Lung Segmentation: Not explicitly mentioned. For segmentation ground truth derived from multiple experts, methods like consensus or averaging are common, but not specified here.
- Improved PERCIST Liver Algorithm (Reference Region Placement): Semi-automatic evaluation by two expert readers. The document states the subject device algorithm was compared to this "reference standard," implying this semi-automatic output was considered the ground truth. No explicit adjudication method (like 2+1) is described for resolving differences between the two experts, if they occurred.
- Improved PERCIST Liver Algorithm (Intersection with Suspicious Uptake Masks): Identified by "an expert reader." This implies a single expert's identification served as the ground truth. No adjudication mentioned.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done
- No, a formal MRMC comparative effectiveness study involving human readers with and without AI assistance is not described in this document.
- The studies conducted focus on the algorithm's performance against historical data, expert interpretations, or comparing an improved algorithm to a predicate algorithm.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Study was Done
- Yes, standalone performance studies were conducted for specific features:
- Improved Lung Segmentation: The Dice coefficient and ASSD evaluation was a standalone algorithmic performance assessment against presumed expert-derived ground truth.
- Improved PERCIST Liver Algorithm (binary liver mask input): The Dice coefficient and ASSD evaluation for the liver mask was a standalone algorithmic performance assessment.
- Improved PERCIST Liver Algorithm (Reference Region Placement): The comparison of the algorithm's results to the semi-automatic evaluation by two expert readers is a standalone algorithm assessment, where the expert input constitutes the ground truth.
- Improved PERCIST Liver Algorithm (Intersection with Suspicious Uptake Masks): This was a standalone algorithmic evaluation of how often the algorithm's PERCIST VOIs intersected suspicious uptake areas identified by an expert.
7. The Type of Ground Truth Used
- Improved Lung Segmentation: Likely expert consensus/manual segmentation (implied by Dice coefficient and ASSD, which compare algorithm output to a gold standard segmentation).
- Improved PERCIST Liver Algorithm (binary liver mask input): Likely expert consensus/manual segmentation (implied by Dice coefficient and ASSD for the liver mask).
- Improved PERCIST Liver Algorithm (Reference Region Placement): Expert semi-automatic evaluation from two expert readers. These semi-automatic outputs were treated as the reference standard.
- Improved PERCIST Liver Algorithm (Intersection with Suspicious Uptake Masks): Expert identification of suspicious tracer uptake masks by a single expert reader.
8. The Sample Size for the Training Set
- Not explicitly stated in the document. The document mentions that the lung lobe segmentation algorithm was "re-trained with additional data" and that there was "No overlap of patients between training, tuning, and test cohorts," but does not provide details on the training set's size.
9. How the Ground Truth for the Training Set Was Established
- Not explicitly stated in the document. For machine learning models, ground truth for training data is typically established through expert labeling (e.g., manual segmentation, disease annotation), but the specifics are not provided here.
Neurophet AQUA AD Plus is intended for automatic labeling, visualization, and volumetric quantification of segmentable brain structures and lesions, as well as SUVR quantification from a set of MR and PET images. Volumetric measurements may be compared to reference percentile data.
Neurophet AQUA AD Plus is a software device intended for the automatic labeling of brain structures, visualization, and volumetric quantification of segmented brain regions and lesions, as well as standardized uptake value ratio (SUVR) quantification using MR and PET images. The volumetric outcomes are compared to normative reference data to support the evaluation of neurodegeneration and cognitive impairment.
The device is designed to assist physicians in clinical evaluation by streamlining the clinical workflow from patient registration through image analysis, analysis result archiving, and report generation using software-based functionalities. The device provides percentile-based results by comparing an individual's imaging-derived quantitative analysis results to reference populations. Percentile-based results are provided for reference only and are not intended to serve as a standalone basis for diagnostic decision-making. Clinical interpretation must be performed by qualified healthcare professionals.
Here's a breakdown of the acceptance criteria and study details for the Neurophet AQUA AD Plus, based on the provided FDA 510(k) Clearance Letter:
Acceptance Criteria and Device Performance for Neurophet AQUA AD Plus
The Neurophet AQUA AD Plus employs multiple AI modules for automated segmentation and quantitative analysis of brain structures and lesions using MR and PET images. The device's performance was validated against predefined acceptance criteria for each module.
1. Table of Acceptance Criteria and Reported Device Performance
| AI Module | Performance Metric | Acceptance Criteria | Reported Device Performance |
|---|---|---|---|
| T1-SegEngine (T1-weighted structural MRI segmentation) | Accuracy (Dice Similarity Coefficient, DSC) | 95% CI of DSC within [0.750, 0.850] for major cortical brain structures and [0.800, 0.900] for major subcortical brain structures | Cortical: mean DSC 0.83 ± 0.04 (95% CI: 0.82–0.84); subcortical: mean DSC 0.87 ± 0.03 (95% CI: 0.86–0.88) |
| T1-SegEngine | Reproducibility (Average Volume Difference Percentage, AVDP) | Equivalence range 1.0–5.0% for both subcortical and cortical regions | Subcortical: mean AVDP 2.50 ± 0.93% (95% CI: 2.26–2.74); cortical: mean AVDP 1.79 ± 0.74% (95% CI: 1.60–1.98) |
| FLAIR-SegEngine (T2-FLAIR hyperintensity segmentation) | Accuracy (DSC) | Mean DSC ≥ 0.80 | Mean DSC 0.90 ± 0.04 (95% CI: 0.89–0.91) |
| FLAIR-SegEngine | Reproducibility (mean AVDP and absolute lesion volume difference) | Absolute difference < 0.25 cc; mean AVDP < 2.5% | Mean AVDP 0.99 ± 0.66%; mean absolute lesion volume difference 0.08 ± 0.06 cc |
| PET-Engine (SUVR and Centiloid quantification) | SUVR accuracy (Intraclass Correlation Coefficient, ICC) | ICC ≥ 0.60 across Alzheimer's-relevant regions (compared to FDA-cleared reference product K221405) | ICC ≥ 0.993 across seven Alzheimer's-relevant regions |
| PET-Engine | Centiloid classification (kappa for amyloid positivity) | κ ≥ 0.70 (substantial agreement with consensus expert visual reads) | Kappa values met or exceeded the criterion (specific values not reported) |
| ED-SegEngine (edema-like T2-FLAIR hyperintensity segmentation) | Accuracy (DSC) | DSC ≥ 0.70 | Mean DSC 0.91 ± 0.09 (95% CI: 0.89–0.93) |
| HEM-SegEngine (GRE/SWI hypointense lesion segmentation) | Accuracy (F1-score / DSC) | F1-score ≥ 0.60 | Median F1-score (DSC) 0.860 (95% CI: 0.824–0.902) |
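The κ ≥ 0.70 criterion in the table is a chance-corrected agreement statistic for binary amyloid positivity. Assuming the common two-rater form (Cohen's kappa; the summary does not name the exact variant), a minimal sketch is:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' binary calls (True = amyloid-positive).
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected by chance from each rater's base rate."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n   # observed agreement
    p_a_pos = sum(rater_a) / n                                 # rater A positivity rate
    p_b_pos = sum(rater_b) / n                                 # rater B positivity rate
    p_e = p_a_pos * p_b_pos + (1 - p_a_pos) * (1 - p_b_pos)    # chance agreement
    return (p_o - p_e) / (1 - p_e)
```

Kappa is 1.0 for perfect agreement and 0.0 when agreement is no better than chance; the conventional Landis–Koch reading of κ ≥ 0.70 as "substantial agreement" is what the acceptance criterion invokes.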
2. Sample Sizes and Data Provenance for the Test Set
- T1-SegEngine (Accuracy): 60 independent T1-weighted MRI cases. Data provenance is not explicitly stated, but it is implied to draw on the same public repositories (e.g., ADNI, AIBL, PPMI) and institutional clinical sites cited for the training data, while remaining distinct from the training set.
- T1-SegEngine (Reproducibility): 60 subjects with paired T1-weighted scans (120 scans total). Data provenance not explicitly stated.
- FLAIR-SegEngine (Accuracy): 136 independent T2-FLAIR cases. Data provenance not explicitly stated, but distinct from training data.
- FLAIR-SegEngine (Reproducibility): Paired T2-FLAIR scans (number not specified). Data provenance not explicitly stated.
- PET-Engine (SUVR accuracy): 30 paired MRI–PET datasets. Data provenance not explicitly stated, but implicitly from multi-center studies including varied tracers and sites.
- PET-Engine (Centiloid classification): 176 paired T1-weighted MRI and amyloid PET scans from ADNI and AIBL. These are public repositories, likely involving diverse geographical data (e.g., USA, Australia). Data is retrospective.
- ED-SegEngine (Accuracy): 100 T2-FLAIR scans collected from U.S. and U.K. clinical sites. Data is retrospective.
- HEM-SegEngine (Accuracy): 106 GRE/SWI scans from U.S. clinical sites. Data is retrospective.
For all modules, validation datasets were fully independent from training datasets at the subject level, drawn from distinct sites and/or repositories where applicable.
The validation cohorts covered adult subjects across a broad age range (approximately 40–80+ years), with both females and males represented.
Racial/ethnic composition included White, Asian, and Black or African American subjects, depending on the underlying public and institutional datasets.
Clinical subgroups included clinically normal, mild cognitive impairment, and Alzheimer's disease for structural, FLAIR, and PET modules, and cerebrovascular/amyloid‑related pathologies for ED‑ and HEM‑SegEngines.
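The Centiloid classification validated above rests on a published linear transform (Klunk et al., 2015) that maps tracer- and pipeline-specific SUVR values onto a common 0–100 scale anchored at a young-control mean (0 CL) and a typical Alzheimer's disease mean (100 CL). A minimal sketch, where the anchor SUVRs and the positivity threshold are illustrative placeholders; real calibrations are specific to each tracer and processing pipeline:

```python
def suvr_to_centiloid(suvr: float,
                      suvr_young_control: float = 1.0,
                      suvr_ad: float = 2.0) -> float:
    """Linear Centiloid transform: 0 CL at the young-control anchor,
    100 CL at the typical-AD anchor. Anchor values here are placeholders,
    not a validated calibration for any specific tracer."""
    return 100.0 * (suvr - suvr_young_control) / (suvr_ad - suvr_young_control)

def is_amyloid_positive(centiloid: float, threshold: float = 25.0) -> bool:
    """Binary amyloid-positivity call against an illustrative CL cutoff
    (published cutoffs vary, commonly in the 20-30 CL range)."""
    return centiloid >= threshold
```

The binary positive/negative label produced this way is what would be compared against consensus expert visual reads when computing the agreement kappa.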
3. Number of Experts and Qualifications for Ground Truth
For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines):
- Number of Experts: Not explicitly stated as a specific number, but "subspecialty-trained neuroradiologists" were used.
- Qualifications: "Subspecialty-trained neuroradiologists." Specific years of experience are not mentioned.
For Centiloid classification in the PET-Engine:
- Number of Experts: "Consensus expert visual reads." The exact number isn't specified, but the phrasing implies multiple experts.
- Qualifications: "Experts" trained in established amyloid PET reading criteria. Specific qualifications beyond "expert" and training in criteria are not detailed.
4. Adjudication Method for the Test Set
For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines):
- "Consensus/adjudication procedures and internal quality control to ensure consistency" were used for establishing reference segmentations. The specific 2+1, 3+1, or other detailed method is not provided.
For Centiloid classification in the PET-Engine:
- "Consensus expert visual interpretation" was used. The specific method details are not provided.
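The κ ≥ 0.70 acceptance criterion against these consensus reads is Cohen's kappa, which corrects raw agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch, comparing algorithm labels against consensus labels (the document does not specify which kappa implementation was used):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa between two label sequences of equal length.
    p_o is observed agreement; p_e is chance agreement implied by
    each rater's marginal label frequencies."""
    assert labels_a and len(labels_a) == len(labels_b)
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0
```

Values of κ ≥ 0.61 are conventionally labeled "substantial" agreement (Landis and Koch), which is why the 0.70 criterion is described that way in the table.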
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
The provided text does not indicate that an MRMC comparative effectiveness study was done to compare human readers with AI assistance versus without AI assistance. The performance studies primarily focus on the standalone (algorithm-only) performance of the device against expert-derived ground truth or a cleared reference product.
6. Standalone (Algorithm-Only) Performance Study
Yes, a standalone (algorithm-only, without human-in-the-loop) performance study was done for all AI modules. The text explicitly states: "Standalone performance tests were conducted for each module using validation datasets that were completely independent from those used for model development and training." The results presented in the table above reflect this standalone performance.
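The standalone results are reported as a mean with a 95% confidence interval over per-case scores (e.g., mean DSC 0.91, 95% CI 0.89–0.93). One common way to obtain such an interval is a percentile bootstrap over the per-case scores; this is a sketch of that approach under the assumption that a resampling method was used, since the document does not state how its intervals were computed:

```python
import random

def bootstrap_mean_ci(values, n_resamples=10000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of
    per-case scores: resample with replacement, collect the resample
    means, and take the alpha/2 and 1-alpha/2 percentiles."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(values)
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

Applied to, say, 100 per-case DSC values for the ED-SegEngine, this would yield an interval of the same form as the reported 0.89–0.93.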
7. Type of Ground Truth Used
- Expert Consensus:
- For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines), reference segmentations were generated by "subspecialty-trained neuroradiologists using predefined anatomical and lesion‑labeling criteria, with consensus/adjudication procedures."
- For Centiloid classification in the PET-Engine, reference labels were derived from "consensus expert visual interpretation using established amyloid PET reading criteria."
- Comparison to Cleared Reference Product:
- For SUVR quantification in the PET-Engine, reference values were obtained from an "FDA‑cleared reference product (K221405)" (Neurophet SCALE PET).
8. Sample Size for the Training Set
The exact sample size for the training set is not explicitly stated as a single number. However, the document mentions:
- "The AI-based modules (T1‑SegEngine, FLAIR‑SegEngine, PET‑Engine, ED‑SegEngine, HEM‑SegEngine) were trained using multi-center MRI and PET datasets collected from public repositories (e.g., ADNI, AIBL, PPMI) and institutional clinical sites."
- "Training data covered:
- Adult subjects across a broad age range (approximately 20–80+ years), with both sexes represented and including multiple racial/ethnic groups (e.g., White, Asian, Black).
- A spectrum of clinical conditions relevant to the intended use, including clinically normal, mild cognitive impairment, and Alzheimer's disease, as well as patients with cerebrovascular and amyloid‑related pathologies for lesion-segmentation modules.
- MRI acquired on major vendor platforms (GE, Siemens, Philips) at 1.5T and 3T... and amyloid PET acquired on multiple PET systems with commonly used tracers (Amyvid, Neuraceq, Vizamyl)."
This indicates a large and diverse training set, although a precise count of subjects or images isn't provided.
9. How the Ground Truth for the Training Set Was Established
The document implies that the training data included "manual labels" as it states: "No images or manual labels from the training datasets were reused in the validation datasets." However, it does not explicitly detail the process by which these "manual labels" or ground truth for the training set were established (e.g., number of experts, qualifications, adjudication method for training data). It's reasonable to infer that similar expert-driven processes were likely used for training ground truth as for validation, but this is not explicitly confirmed in the provided text.