Search Results

Found 3 results

510(k) Data Aggregation

    K Number
    K232305
    Date Cleared
    2023-10-23

    (83 days)

    Product Code
    Regulation Number
    892.2050
    Reference & Predicate Devices
    Why did this record match?
    Reference Devices: K192130

    Intended Use

    AI-Rad Companion Brain MR is a post-processing image analysis software that assists clinicians in viewing, analyzing, and evaluating MR brain images.

    Device Description

    AI-Rad Companion Brain MR VA50 is an enhancement to the predicate, AI-Rad Companion Brain MR VA40 (K213706). Just as in the predicate, the brain morphometry feature of AI-Rad Companion Brain MR addresses the automatic quantification and visual assessment of the volumetric properties of various brain structures based on T1 MPRAGE datasets. From a predefined list of brain structures (e.g. Hippocampus, Caudate, Left Frontal Gray Matter, etc.) volumetric properties are calculated as absolute and normalized volumes with respect to the total intracranial volume. The normalized values are compared against age-matched mean and standard deviations obtained from a population of healthy reference subjects. The deviation from this reference population can be visualized as 3D overlay map or out-of-range flag next to the quantitative values.

    Additionally, identical to the predicate, the white matter hyperintensities feature addresses the automatic quantification and visual assessment of white matter hyperintensities on the basis of T1 MPRAGE and T2 weighted FLAIR datasets. The detected WMH can be visualized as a 3D overlay map and the quantification in count and volume as per 4 brain regions in the report.
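The normalization and out-of-range logic described above can be sketched as follows. This is an illustrative toy, not the vendor's implementation; all function names, reference values, and the z-score threshold are assumptions.

```python
def normalized_volume(structure_ml: float, tiv_ml: float) -> float:
    """Structure volume as a percentage of total intracranial volume (TIV)."""
    return 100.0 * structure_ml / tiv_ml

def out_of_range(norm_vol: float, ref_mean: float, ref_std: float,
                 z_threshold: float = 1.96) -> bool:
    """Flag a value whose z-score against the age-matched healthy-reference
    mean and standard deviation exceeds the chosen threshold
    (1.96 approximates a two-sided 95% range)."""
    z = (norm_vol - ref_mean) / ref_std
    return abs(z) > z_threshold

# Example: a 3.2 mL hippocampus in a 1450 mL intracranial volume, compared
# against a hypothetical reference of 0.26% +/- 0.015% for that age group.
nv = normalized_volume(3.2, 1450.0)      # ~0.2207 %
flagged = out_of_range(nv, 0.26, 0.015)  # True: below the reference range
```

A flag of this kind is what drives the "out-of-range flag next to the quantitative values" behavior the description mentions.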

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and study details for the AI-Rad Companion Brain MR device, based on the provided FDA 510(k) summary:

    Acceptance Criteria and Device Performance

    | Metric | Acceptance Criteria | Reported Performance (AVG) | 95% CI | Standard Deviation (STD) |
    |---|---|---|---|---|
    | Volumetric Segmentation Accuracy | PCC >= 0.77 | 0.94 PCC | [0.83, 0.98] | n.a. |
    | Voxel-wise Segmentation Accuracy | Mean Dice score >= 0.47 | 0.50 Dice | [0.42, 0.57] | 0.22 |
    | WMH Change Region-wise Segmentation Accuracy | Median F1-score >= 0.69 | 0.69 F1-score | [0.633, 0.733] | 0.13 |
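For reference, the Dice coefficient and region-wise F1-score used as acceptance metrics here have standard textbook definitions; the sketch below illustrates them on toy data and is not the manufacturer's validation code.

```python
def dice(pred: set, truth: set) -> float:
    """Dice coefficient between predicted and ground-truth voxel sets:
    2|A & B| / (|A| + |B|)."""
    if not pred and not truth:
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))

def f1(tp: int, fp: int, fn: int) -> float:
    """Region-wise F1 from counts of matched (TP), spurious (FP), and
    missed (FN) lesions/regions."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: voxels identified by integer index.
print(dice({1, 2, 3, 4}, {3, 4, 5, 6}))  # 0.5
print(f1(tp=8, fp=2, fn=4))              # ~0.727
```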

    Study Details

    1. Sample Size and Data Provenance:

      • Test Set Sample Size: 75 subjects / 150 studies (2 scans per subject).
      • Data Provenance: The data originate from a mix of retrospective and potentially prospective sources, from both the US and Europe:
        • UPenn (US): 15 subjects
        • ADNI (US): 15 subjects
        • Lausanne (EU): 22 subjects
        • Prague (EU): 23 subjects
        • Medical Indication: 60 Multiple Sclerosis (MS) patients, 15 Alzheimer's (AD) patients.
        • Age Range: 25-88 years.
        • Gender Distribution: 56 females, 19 males.
        • Scanner Info: Siemens 3.0T MR scanners, T1w MPRAGE and T2w FLAIR scan protocols.
    2. Number of Experts and Qualifications for Ground Truth:

      • The document states that for each dataset, three sets of ground truth were manually annotated. Each set was annotated by a "disjoint group of annotator, reviewer, and clinical expert."
      • For the initial annotation and review, "in-house annotators" and "in-house reviewers" were used.
      • For final review and correction, a "clinical expert" was used, randomly assigned per case to minimize bias.
      • Specific qualifications (e.g., years of experience, board certification) for these experts are not explicitly stated in the provided text, beyond being "clinical experts."
    3. Adjudication Method for Test Set:

      • The ground truth process involved a multi-step adjudication. For each test dataset:
        1. Three initial annotations by three different in-house annotators.
        2. Each initial annotation was reviewed by an in-house reviewer.
        3. Each initial annotation (after in-house review) was reviewed by a reference clinical expert.
        4. If corrections by the clinical expert were "significant and time-consuming," they were communicated back to the annotator for correction and then re-reviewed.
      • This resembles a form of iterative consensus building and expert adjudication, where multiple initial annotations are refined through reviewer and expert input, rather than a strict N+1 or N+M voting system for final ground truth, though the final decision appears to rest with the clinical expert.
    4. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:

      • No MRMC study was done. The document explicitly states: "The predicate (K213706) was not validated using clinical tests and therefore no clinical tests were conducted to test the performance and functionality of the modifications introduced within AI-Rad Companion Brain MR."
      • The validation focused on standalone algorithmic performance compared to expert-established ground truth and comparison against a "reference device" (icobrain) using equivalent validation methodology for the WMH follow-up feature.
      • Therefore, there's no reported effect size of human readers improving with AI vs. without AI assistance.
    5. Standalone (Algorithm Only) Performance Study:

      • Yes, a standalone performance study was conducted for the WMH follow-up feature. The acceptance criteria and reported performance metrics (PCC, Dice, F1-score) are for the algorithm's performance against the established ground truth.
    6. Type of Ground Truth Used:

      • The ground truth for the White Matter Hyperintensities (WMH) Follow-Up Feature was established through expert consensus and manual annotation. It involved a "disjoint group of annotator, reviewer, and clinical expert" for each ground truth dataset. The clinical expert performed the final review and correction.
    7. Training Set Sample Size:

      • The document states: "The training data used for the fine tuning the hyper parameters of WMH follow-up algorithm is independent of the data used to test the white matter hyperintensity algorithm follow up algorithm."
      • However, the specific sample size for the training set is not provided in the given text. It mentions independent training data but does not quantify it.
    8. How Ground Truth for Training Set was Established:

      • The document mentions that training data was used for "fine tuning the hyper parameters." While it implies that the training data would also require ground truth, the method for establishing ground truth for the training set is not explicitly described in the provided text. It only states that the training data was "independent" of the test data. Given the "WMH follow-up algorithm does not include any machine learning component," the type of "training" might refer to calibration or rule optimization rather than machine learning model training in the conventional sense, and subsequently, how ground truth for that calibration was established is not detailed.

    K Number
    K223180
    Device Name
    AIRAscore
    Manufacturer
    Date Cleared
    2023-08-25

    (318 days)

    Product Code
    Regulation Number
    892.2050
    Reference & Predicate Devices
    Why did this record match?
    Reference Devices: K192130

    Intended Use

    AIRAscore is intended for automatic labeling, visualization and volumetric quantification of segmentable brain structures from a set of MR images. This software is intended to automate the current manual process of identifying, labeling and quantifying the volume of segmentable brain structures identified on MR images.

    Device Description

    AIRAscore is a software that offers automatic, fast and reliable segmentation of brain volumes into gray matter, white matter, cerebrospinal fluid and, if present, white matter lesions with an additional classification of tissue anatomy. The AIRAscore software comprises two functions, referred to as "AIRAscore structure" and "AIRAscore MS". The report created using the AIRAscore structure function contains the volume evaluation for each seqmented anatomical area with the raw value, the relative value with respect to the total intracranial volume, and the percentile for the patient compared to a reference set. It furthermore provides a quick overview of potential segment size differences based on the reference set comparison. If the AIRAscore MS report is requested, it is provided with additional information about the number and the volume of white matter lesions and their categorization (i.e., juxtacortical, periventricular or infratentorial). For analysis with AIRAscore, incoming MRI data need to comply with the DICOM standard and are checked to fulfill the technical requirements. After successful verification, segmentation is performed using specialized neuronal networks that remain static during the lifetime of a software version. The results are then corrected for head size and compared to an age- and sex adjusted reference collective including a statistical classification. A report is generated and transmitted via a DICOM storage SCU (sender) to a defined DICOM storage SCP (usually the picture archive of the referring physician) using the DICOM format.
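The "percentile for the patient compared to a reference set" idea described above can be sketched with an empirical reference distribution. The reference values and the structure they represent are invented for illustration; the actual AIRAscore reference collective and its statistics are not public.

```python
from bisect import bisect_left

def percentile_rank(value: float, reference: list) -> float:
    """Empirical percentile of `value` within a reference sample:
    the share of reference values strictly below it."""
    ref = sorted(reference)
    return 100.0 * bisect_left(ref, value) / len(ref)

# Hypothetical age/sex-adjusted reference volumes (mL) for one structure:
reference = [2.8, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.8, 4.0]
print(percentile_rank(3.05, reference))  # 20.0 -> patient at the 20th percentile
```

In practice a reference collective would also be stratified by age and sex before the lookup, as the device description indicates.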

    AI/ML Overview

    The provided FDA 510(k) summary for AIRAscore does not contain a detailed description of the acceptance criteria and the study that rigorously proves the device meets those criteria, specifically regarding its clinical performance or accuracy for volumetric quantification. The document focuses on general software verification and validation, comparison to a predicate device, and compliance with standards, but it lacks specific performance testing results (e.g., accuracy, precision, sensitivity, specificity, Dice scores) against a defined ground truth.

    The "Performance Testing" section states: "The validation confirmed that AIRAscore performs well across target patient population and scanner manufacturers." However, it does not provide what performance metrics were used, what the acceptance criteria for "performing well" were, or what the actual results were.

    Therefore, I cannot populate all the requested information. Below is what can be inferred or stated as missing based solely on the provided text.


    Acceptance Criteria and Study to Prove Device Meets Acceptance Criteria

    The provided 510(k) summary for AIRAscore does not explicitly define specific quantitative acceptance criteria for its performance (e.g., accuracy of volumetric quantification) or present a detailed study proving these criteria were met. The document focuses on general software verification and validation, comparison to a predicate device, and compliance with general software/medical device standards.

    The "Performance Testing" section broadly states that "The validation confirmed that AIRAscore performs well across target patient population and scanner manufacturers." However, it does not specify the metrics, thresholds for "performing well," or the results of this validation.

    1. Table of Acceptance Criteria and Reported Device Performance

    Based on the provided document, specific quantitative acceptance criteria and corresponding reported device performance metrics (e.g., accuracy, precision, correlation coefficients, Dice scores for segmentation) are NOT detailed.

    The document states:

    • Performance Measurement Testing (for New Device - AIRAscore):
      • Accuracy: "Brain segmentable structure volumes / volume changes compared to manually labeled ground truth"
      • Reproducibility: "Brain segmentable structure volumes / volume changes compared on test-retest images"

    However, the specific acceptance thresholds for these measurements (e.g., "accuracy > X%", "Dice coefficient > Y") and the actual numerical results that demonstrate the device met these criteria are not included in this summary.

    2. Sample Size and Data Provenance for Test Set

    • Sample Size for Test Set: Not specified in the provided document.
    • Data Provenance: The document states "The validation confirmed that AIRAscore performs well across target patient population and scanner manufacturers." This broadly implies use of diverse data, but specific details on country of origin or whether the data was retrospective or prospective are not provided.

    3. Number of Experts and Qualifications for Ground Truth Establishment

    • Number of Experts: Not specified in the provided document.
    • Qualifications of Experts: The type of study described (comparison to "manually labeled ground truth") implies expert involvement, but their qualifications (e.g., specific medical specialties, years of experience, board certification) are not detailed.

    4. Adjudication Method for the Test Set

    • Adjudication Method: Not specified in the provided document.

    5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

    • MRMC Study: The document describes "performance measurement testing" including "Accuracy" and "Reproducibility" comparing to "manually labeled ground truth." However, it does not mention a multi-reader multi-case (MRMC) comparative effectiveness study evaluating how much human readers improve with AI vs. without AI assistance. The device's intended use is to "automate the current manual process," suggesting a focus on automation rather than AI-assisted human reading improvement.

    6. Standalone (Algorithm Only) Performance

    • Standalone Performance: The description of "Performance Measurement Testing" (Accuracy relative to manually labeled ground truth, Reproducibility) suggests that the device's standalone performance (algorithm only without human-in-the-loop) was assessed. However, the specific metrics and results of this standalone assessment are not provided.

    7. Type of Ground Truth Used

    • Type of Ground Truth: "Manually labeled ground truth" is explicitly mentioned for accuracy measurement. The specific methodology for this manual labeling (e.g., expert consensus, pathology, long-term outcomes data) is not further detailed. Given the context of "segmentable brain structures" and "volumetric quantification," it is highly probable that this refers to expert-driven manual segmentation or volumetric measurements on the MR images.

    8. Sample Size for the Training Set

    • Sample Size for Training Set: Not specified in the provided document. The document mentions "specialized neuronal networks" and "machine learning (supervised voxel classification by a Convolutional Neuronal Network)" for segmentation, which implies a training set was used, but its size is not given.

    9. How Ground Truth for Training Set Was Established

    • Ground Truth for Training Set: The document mentions "supervised voxel classification by a Convolutional Neuronal Network." For supervised learning, the ground truth for the training set would typically be established through expert annotations (e.g., manual segmentation/labeling of brain structures on MR images). However, the specific methodology and expert involvement for establishing the training set ground truth are not described in the provided text.

    K Number
    K213706
    Date Cleared
    2022-04-15

    (142 days)

    Product Code
    Regulation Number
    892.2050
    Reference & Predicate Devices
    Why did this record match?
    Reference Devices: K192130

    Intended Use

    AI-Rad Companion Brain MR is a post-processing image analysis software that assists clinicians in viewing, analyzing, and evaluating MR brain images.

    AI-Rad Companion Brain MR provides the following functionalities:

    • Automated segmentation and quantitative analysis of individual brain structures and white matter hyperintensities
    • Quantitative comparison of brain structure with normative data from a healthy population
    • Presentation of results for reporting that includes all numerical values as well as visualization of these results
    Device Description

    AI-Rad Companion Brain MR VA40 is an enhancement to the predicate, AI-Rad Companion Brain MR VA20 (K193290). Just as in the predicate, AI-Rad Companion Brain MR addresses the automatic quantification and visual assessment of the volumetric properties of various brain structures based on T1 MPRAGE datasets. In AI-Rad Companion Brain MR VA40, the quantification and visual assessment extends to white matter hyperintensities on the basis of T1 MPRAGE and T2 weighted FLAIR datasets. These datasets are acquired as part of a typical head MR acquisition. The results are directly archived in PACS, as this is the standard location for reading by radiologists. From a predefined list of 30 structures (e.g. Hippocampus, Left Frontal Grey Matter, etc.), volumetric properties are calculated as absolute and normalized volumes with respect to the total intracranial volume. The normalized values for a given patient are compared against age-matched mean and standard deviations obtained from a population of healthy reference subjects.

    The white matter hyperintensities can be visualized as a 3D overlay map and the quantification in count and volume as per 4 brain regions in the report.

    As an update to the previously cleared device, the following modifications have been made:

      1. Modified Intended Use Statement
      2. Addition of white matter hyperintensities overlay map, count and volume as per 4 brain regions
      3. Enhanced DICOM Structured Report (DICOM SR)
      4. Updated deployment structure
    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:

    Acceptance Criteria and Device Performance

    1. A table of acceptance criteria and the reported device performance:

    | Validation Type | Acceptance Criteria | Reported Device Performance (AVG) | Reported Device Performance (95% CI) |
    |---|---|---|---|
    | Volumetric Segmentation Accuracy (PCC) | PCC 95% Confidence Interval includes 0.91 | 0.98 | [0.97, 0.99] |
    | Volumetric Segmentation Accuracy (ICC) | ICC 95% Confidence Interval includes 0.95 | 0.97 | [0.96, 0.98] |
    | Voxel-wise Segmentation Accuracy | Mean Dice score >= 0.58 | 0.60 | [0.53, 0.63] |
    | WMH Lesion-wise Segmentation Accuracy | Mean F1-score >= 0.57 | 0.60 | [0.57, 0.64] |
    | Reproducibility | Lower bound of the 95% bootstrap CI of Dice >= 0.63 | 0.79 | [0.77, 0.81] |
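The "lower bound of the 95% bootstrap CI" style criterion in the reproducibility row can be sketched as follows: resample per-case Dice scores with replacement, take the mean of each resample, and compare the 2.5th percentile of those means against the threshold. The per-case scores below are synthetic; only the procedure is illustrated.

```python
import random

def bootstrap_ci_lower(scores, n_boot=10000, alpha=0.05, seed=0):
    """Lower bound of the (1 - alpha) percentile bootstrap CI of the mean."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(scores, k=len(scores))) / len(scores)
        for _ in range(n_boot)
    )
    return means[int(alpha / 2 * n_boot)]  # 2.5th percentile of resampled means

# Synthetic per-case reproducibility Dice scores:
scores = [0.78, 0.81, 0.80, 0.76, 0.79, 0.82, 0.77, 0.80, 0.79, 0.81]
lower = bootstrap_ci_lower(scores)
passes = lower >= 0.63  # acceptance-style check against the stated threshold
```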

    All reported device performance metrics meet or exceed the specified acceptance criteria, as their 95% Confidence Intervals either include the criterion or are entirely above it (for metrics requiring a minimum value).

    Study Details

    2. Sample size used for the test set and the data provenance:

    • Test Set Sample Size: 64 subjects for the main testing cohort, and 25 subjects for the reproducibility cohort.
      • Total Subjects: 89 subjects (64 + 25)
      • Total Studies: 164 studies (64 for testing cohort, and 100 for reproducibility cohort)
    • Data Provenance (Country of Origin): United States (Cleveland, Baltimore, New York, ADNI), Switzerland (Lausanne, CLEMENS), France (Montpellier).
    • Retrospective or Prospective: The text does not explicitly state whether the data was retrospective or prospective, but the description of "test data" and "training data" suggests retrospective data collection from existing datasets.

    3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

    • Number of Experts: Three distinct groups involved in establishing ground truth for each dataset: an annotator, a reviewer, and a clinical expert. The text implies a total of three individuals per case, forming a disjoint group (i.e., three different people for annotation, review, and expert correction).
    • Qualifications of Experts: The text refers to them as "in-house annotators," "in-house reviewer," and "referred clinical expert." Specific qualifications (e.g., years of experience, board certification) are not explicitly detailed beyond the "clinical expert" designation.

    4. Adjudication method for the test set:

    • Adjudication Method: A multi-step process: "For each dataset, three sets of white matter hyperintensity ground truth are annotated manually. Each set is annotated by a disjoint group of annotator, reviewer, and clinical expert with the expert randomly assigned per case to minimize annotation bias. For each test dataset, the three initial annotations are annotated by three different in-house annotators. Then, each initial annotation is reviewed by the in-house reviewer. Afterwards, each initial annotation is reviewed by the referred clinical expert. The clinical expert reviews and corrects the initial annotation of the WMH according to the annotation protocol."
      • This is a form of cascading/sequential review and consensus, rather than a direct voting or "X+Y" adjudication, with the clinical expert performing the final review and correction.

    5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done:

    • No, an MRMC comparative effectiveness study was not done. The study focuses on the standalone performance of the AI algorithm against expert-established ground truth, not on how human readers' performance improves with or without AI assistance.

    6. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:

    • Yes, a standalone performance study was done. The testing validated the "AI-Rad Companion Brain MR WMH" algorithm's performance by comparing its outputs directly to manually annotated ground truth. The results table explicitly presents the "Volumetric Segmentation Accuracy," "Voxel-wise Segmentation Accuracy," "WMH Lesion-wise Segmentation Accuracy," and "Reproducibility" of the device.

    7. The type of ground truth used:

    • Expert Consensus / Expert-Annotated Ground Truth. The ground truth was established through a multi-step manual annotation, review, and correction process by "in-house annotators," "in-house reviewer," and a "clinical expert."

    8. The sample size for the training set:

    • The sample size for the training set is not explicitly stated. The text only mentions: "The training data used for the training of the White matter hyperintensity algorithm is independent of the data used to test the white matter hyperintensity algorithm."

    9. How the ground truth for the training set was established:

    • The text does not provide details on how the ground truth for the training set was established. It only ensures that the training data and testing data are independent.
