RCTUEA is intended to aid in assessing whether a premenopausal woman who presents with an ovarian adnexal mass is at high or low likelihood of finding malignancy on surgery. RCTUEA is indicated for women who meet the following criteria: over age 18; ovarian adnexal mass present for which surgery is planned, and not yet referred to an oncologist. RCTUEA must be interpreted in conjunction with an independent clinical and radiological assessment. The test is not intended as a screening or stand-alone diagnostic assay.

The electrochemiluminescence immunoassay "ECLIA" is intended for use on Elecsys and cobas e immunoassay analyzers.

PRECAUTION: RCTUEA should not be used without an independent clinical/radiological evaluation and is NOT intended to be a screening test or to determine whether a patient should proceed to surgery. Incorrect use of RCTUEA carries the risk of unnecessary testing, surgery and/or delayed diagnosis.

Device Description

ROMA Calculation Tool Using Elecsys Assays (RCTUEA) is a qualitative test for serum and plasma (K2-EDTA, K3-EDTA and Li-Heparin) that combines the results of the Elecsys HE4 assay, Elecsys CA 125 II assay and menopausal status into a numerical score. ROMA was developed using separate logistic regression equations for premenopausal and postmenopausal women:

Premenopausal:

Predictive Index (PI) = -12.0 + 2.38 x LN[HE4] + 0.0626 x LN[CA 125]

Postmenopausal:

Predictive Index (PI) = -8.09 + 1.04 x LN[HE4] + 0.732 x LN[CA 125]

RCTUEA value = exp(PI) / [1 + exp(PI)] x 10

RCTUEA is used to stratify women into likelihood groups for finding cancer on surgery. In order to provide a specificity level of 75 %, a cutoff point of ≥ 1.14 was used for premenopausal women and ≥ 2.99 was used for postmenopausal women who present with an ovarian adnexal mass. Women with RCTUEA results above these cutoff points are at high likelihood of finding malignancy on surgery.
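The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the manufacturer's software: the coefficients and cutoffs are the ones quoted in this description, while the function names, the "pre"/"post" status labels, and the input units (HE4 in pmol/L, CA 125 in U/mL) are assumptions made for the example.

```python
import math

# (intercept, HE4 coefficient, CA 125 coefficient) as quoted in the description.
COEFFS = {
    "pre": (-12.0, 2.38, 0.0626),
    "post": (-8.09, 1.04, 0.732),
}
# Cutoffs quoted for a specificity level of 75 %.
CUTOFFS = {"pre": 1.14, "post": 2.99}


def roma_value(he4, ca125, status):
    """Return the 0-10 score for menopausal status 'pre' or 'post'.

    Assumed units (not stated in the excerpt): HE4 in pmol/L, CA 125 in U/mL.
    """
    intercept, b_he4, b_ca125 = COEFFS[status]
    pi = intercept + b_he4 * math.log(he4) + b_ca125 * math.log(ca125)
    # Logistic transform of the predictive index, scaled to 0-10.
    return 10.0 * math.exp(pi) / (1.0 + math.exp(pi))


def likelihood_group(he4, ca125, status):
    """Classify as 'high' when the score is at or above the cutoff."""
    return "high" if roma_value(he4, ca125, status) >= CUTOFFS[status] else "low"


# Hypothetical inputs, chosen only to exercise both equations.
print(round(roma_value(70, 35, "pre"), 2), likelihood_group(70, 35, "pre"))
print(round(roma_value(70, 35, "post"), 2), likelihood_group(70, 35, "post"))
```

The same pair of assay values can fall on different sides of the cutoff depending on menopausal status, which is why the status is a required input rather than an optional one.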

The immunoassays used in RCTUEA are:

Elecsys HE4: an electrochemiluminescence immunoassay for the quantitative determination of HE4 in human serum and plasma

Elecsys CA 125 II: an electrochemiluminescence immunoassay for the quantitative determination of OC 125 reactive determinants in human serum and plasma.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study information for the ROMA Calculation Tool Using Elecsys Assays (RCTUEA), based on the provided document:

1. Table of Acceptance Criteria and Reported Device Performance

The acceptance criteria are implied by the performance goals for sensitivity, specificity, and predictive values. The study reports these values for different cohorts. The primary focus for the device's utility is its ability to stratify women into high or low likelihood groups for finding malignancy on surgery, especially its adjunctive use with Initial Cancer Risk Assessment (ICRA).

Metric	Premenopausal (EOC Only) (Target/Achieved)	Postmenopausal (EOC Only) (Target/Achieved)	All Cancers & LMP Tumors (Adjunctive Use) (Target/Achieved)	Notes
Sensitivity	100% (9/9) (66.4% - 100% CI)	89.5% (34/38) (75.2% - 97.1% CI)	Increased from 76.9% to 90.8% (81.0% - 96.5% CI)	The document states RCTUEA was used to "aid in assessing whether a premenopausal woman...is at high or low likelihood of finding malignancy... In order to provide a specificity level of 75%". This implicitly sets the specificity as the primary acceptance criterion, with other metrics assessed relative to this. For standalone RCTUEA, the specificity was targeted at 75%. For adjunctive use, the goal appeared to be demonstrating a statistically significant increase in NPV and an increase in sensitivity, even if accompanied by a decrease in specificity and PPV. The significant increase in NPV (P=0.0000) for adjunctive use is a key finding supporting its effectiveness in ruling out cancer.
Specificity	77.6% (177/228) (71.7% - 82.9% CI)	82.5% (118/143) (75.3% - 88.4% CI)	Decreased from 84.4% to 70.4% (65.4% - 75.0% CI)
Positive Predictive Value (PPV)	15.0% (9/60) (7.1% - 26.6% CI)	57.6% (34/59) (44.1% - 70.4% CI)	Decreased from 46.3% to 34.9% (27.8% - 42.6% CI)
Negative Predictive Value (NPV)	100.0% (177/177) (97.9% - 100% CI)	96.7% (118/122) (91.8% - 99.1% CI)	Increased from 95.4% to 97.8% (95.2% - 99.2% CI)
TP-FP	77.6% (72.1% - 83.2% CI)	72.0% (60.2% - 83.8% CI)	61.1% (52.5% - 69.7% CI)
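The EOC-only columns of the table can be reproduced from the 2x2 counts implied by the quoted fractions. The sketch below is illustrative: the false-positive and false-negative counts are derived from the denominators shown in the table (for example, 228 - 177 = 51 false positives for premenopausal women), and reading the "TP-FP" row as true-positive rate minus false-positive rate is an inference that happens to match the reported values, not something the document states.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy metrics, returned as fractions."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Assumed meaning of the "TP-FP" row: TPR minus FPR.
        "tp_minus_fp": sensitivity - (1 - specificity),
    }


# Counts derived from the fractions in the table (EOC only).
pre = diagnostic_metrics(tp=9, fp=51, fn=0, tn=177)
post = diagnostic_metrics(tp=34, fp=25, fn=4, tn=118)

for name, m in (("premenopausal", pre), ("postmenopausal", post)):
    print(name, {k: round(v * 100, 1) for k, v in m.items()})
```

With only 9 EOC cases in the premenopausal group, the 100% sensitivity rests on a small numerator, which is why the table pairs it with a wide confidence interval.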

2. Sample Size and Data Provenance

Test Set Sample Size:
- Total Evaluated: 455 women
- Premenopausal (Primary EOC analysis): 237
- Postmenopausal (Primary EOC analysis): 181
- Adjunctive Use Analysis: 436 (65 Malignant, 371 No Malignancy by Pathology)
Data Provenance: Prospective, multi-center clinical trial. The document does not specify the countries of origin for the data, but it mentions samples were tested at "three US testing sites," which implies at least some data is from the US. The "multi-center" nature suggests other locations as well, but these are not named.

3. Number of Experts and Qualifications for Ground Truth

The document does not specify the number of experts used to establish the ground truth for the test set.
Qualifications of Experts: The ground truth was based on "histopathology reports collected after surgery." This implies pathologists are the experts, but their specific qualifications (e.g., years of experience, board certification) are not provided. An "Initial Cancer Risk Assessment (ICRA)" was completed by a "nurse practitioner, physician assistant or a non-gynecological oncologist," but this was a baseline assessment, not the definitive ground truth.

4. Adjudication Method for the Test Set

The document does not describe a specific adjudication method (e.g., 2+1, 3+1, none) for resolving discrepancies in the ground truth. It simply states that "histopathology reports were collected after surgery," implying the single report served as the definitive ground truth.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

The document describes a study comparing the performance of the RCTUEA alone versus the performance of an Initial Cancer Risk Assessment (ICRA) alone, and then versus the adjunctive use of RCTUEA with ICRA.
Effect Size of Human Improvement with AI vs. without AI assistance:
- This is not a traditional MRMC study where human readers interpret cases with and without AI assistance to measure reader improvement. Instead, it compares a clinical assessment (ICRA, performed by clinicians) to the device's output, and then the combination.
- For "all malignancies including EOC, non-epithelial ovarian cancer, other gynecologic and non-gynecologic cancers," when RCTUEA was used adjunctively with ICRA:
  - Sensitivity increased from 76.9% (ICRA alone) to 90.8% (ICRA + RCTUEA). This represents an increase of 13.9 percentage points.
  - Specificity decreased from 84.4% (ICRA alone) to 70.4% (ICRA + RCTUEA). This represents a decrease of 14 percentage points.
  - NPV increased from 95.4% (ICRA alone) to 97.8% (ICRA + RCTUEA), which was statistically significant (P=0.0000). This indicates an improvement in the ability to rule out cancer.

6. Standalone Performance Study (Algorithm Only)

Yes, a standalone performance study was done. The performance metrics for RCTUEA alone (without ICRA) are reported for both premenopausal and postmenopausal women with Epithelial Ovarian Cancer (EOC) only.
- Premenopausal (EOC): Sensitivity 100%, Specificity 77.6%
- Postmenopausal (EOC): Sensitivity 89.5%, Specificity 82.5%
Performance of RCTUEA for "all malignancies and LMP tumors" is also shown:
- Sensitivity: 86.2%
- Specificity: 79.5%
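The confidence intervals quoted alongside these standalone figures (for example, 66.4% - 100% for a sensitivity of 9/9) look consistent with exact binomial (Clopper-Pearson) intervals. That is an inference from the numbers rather than a method named in the document. A minimal standard-library sketch, using bisection on the binomial tail probabilities instead of a beta quantile function, might look like this:

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target, iterations=100):
    """Bisect on [0, 1] for g(p) == target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: P(X >= k | p) = alpha / 2 (defined as 0 when k == 0).
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2 (defined as 1 when k == n).
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


for k, n in ((9, 9), (34, 38), (177, 177)):
    lo, hi = clopper_pearson(k, n)
    print(f"{k}/{n}: {lo * 100:.1f}% - {hi * 100:.1f}%")
```

For the all-successes cases the lower bound has the closed form (alpha/2)^(1/n), which gives about 66.4% for 9/9 and about 97.9% for 177/177.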

7. Type of Ground Truth Used

Pathology: The primary and definitive ground truth used was "histopathology reports collected after surgery."

8. Sample Size for the Training Set

The document does not explicitly state the sample size used for the training set. It describes the RCTUEA calculation (logistic regression equations) as "developed using separate logistic regression equations," but does not provide details about the data used for this development phase. The reported clinical trial is for effectiveness determination (a test set).

9. How the Ground Truth for the Training Set Was Established

The document states that the ROMA algorithm was "developed using separate logistic regression equations." However, it does not provide information on how the ground truth for the training data used to develop these equations was established. It is reasonable to assume it would also have been based on histopathology, but this is not explicitly stated for the training phase.

K Number

K160090

Device Name

Lumipulse G ROMA

Manufacturer

Fujirebio Diagnostics, Inc.

Date Cleared

2016-05-16

(122 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K151378, K142895, K103358

Intended Use

Lumipulse® G Risk of Ovarian Malignancy Algorithm (ROMA®) is a qualitative serum and plasma (lithium heparin or dipotassium EDTA) test that combines the results of Lumipulse G HE4, Lumipulse G CA 125 II and menopausal status into a numerical score.

Lumipulse G ROMA is intended to aid in assessing whether a premenopausal woman who presents with an ovarian adnexal mass is at high or low likelihood of finding malignancy on surgery. Lumipulse G ROMA is indicated for women who meet the following criteria: over age 18; ovarian adnexal mass present for which surgery is planned, and not yet referred to an oncologist. Lumipulse G ROMA must be interpreted in conjunction with an independent clinical and radiological assessment. The test is not intended as a screening or stand-alone diagnostic assay.

PRECAUTION: Lumipulse G ROMA should not be used without an independent clinical/radiological evaluation and is not intended to be a screening test or to determine whether a patient should proceed to surgery. Incorrect use of Lumipulse G ROMA carries the risk of unnecessary testing, surgery, and/or delayed diagnosis.

Device Description

Lumipulse G ROMA is a qualitative serum and plasma test that combines the results of 2 analytes, HE4 (Lumipulse G HE4) and CA125 (Lumipulse G CA 125 II) and menopausal status into a numerical score between 0.00 and 10.00. The premenopausal status must be based on ovarian function determined with information available from clinical evaluation and medical history.

The test system consists of Lumipulse G HE4, Lumipulse G CA 125 II, the Lumipulse G ROMA Calculator Tool and the LUMIPULSE G1200 System. The LUMIPULSE G1200 System is not capable of calculating the ROMA score. The immunoassays are performed according to the directions detailed in each product insert.

Both Lumipulse G HE4 and Lumipulse G CA 125 II are previously 510(k) cleared Class II devices (K151378 and K142895 respectively). The Lumipulse G HE4 assay is a chemiluminescent enzyme immunoassay (CLEIA) for the quantitative determination of HE4 antigen in human serum and plasma (lithium heparin or dipotassium EDTA) on the LUMIPULSE G System. The assay is to be used as an aid in monitoring recurrence or progressive disease in patients with epithelial ovarian cancer. Serial testing for patient HE4 assay values should be used in conjunction with other clinical methods used for monitoring ovarian cancer. Lumipulse G CA 125 II assay is a chemiluminescent enzyme immunoassay (CLEIA) for the quantitative determination of CA125 in human serum and plasma (sodium heparin, lithium heparin, or dipotassium EDTA) on the LUMIPULSE G System. The assay is to be used as an aid in monitoring recurrence or progressive disease in patients with ovarian cancer. Serial testing for patient CA125 assay values should be used in conjunction with other clinical methods used for monitoring ovarian cancer.

Lumipulse G ROMA scores (numerical score from 0.00 to 10.00) for both premenopausal and postmenopausal women are calculated using the Lumipulse G ROMA Calculator Tool to indicate a low likelihood or high likelihood for finding malignancy on surgery using the value of the 2 immunoassays (Lumipulse G HE4 and Lumipulse G CA 125 II).

AI/ML Overview

Here’s a summary of the acceptance criteria and the study proving the device meets them, based on the provided FDA 510(k) submission for Lumipulse G ROMA:

1. Table of Acceptance Criteria and Reported Device Performance

The submission does not explicitly state "acceptance criteria" in a quantitative manner for clinical performance in the way usually seen for AI/ML devices (e.g., minimum sensitivity or specificity targets). Instead, it demonstrates substantial equivalence to a predicate device and provides performance metrics (sensitivity, specificity, PPV, NPV) for direct disease detection and adjunctive use with Initial Cancer Risk Assessment (ICRA).

Given that the purpose of the submission is to demonstrate "substantial equivalence" to a predicate device, the implied acceptance criterion for clinical performance is that the Lumipulse G ROMA's performance should be comparable or non-inferior to the predicate device and demonstrate utility for its intended use. For analytical performance, the acceptance criteria are typically met by demonstrating acceptable precision, linearity, analytical specificity, and method comparison to the predicate. The clinical study results presented are the "reported device performance."

Category	Acceptance Criteria (Implied / Demonstrated)	Reported Device Performance (Lumipulse G ROMA)
Clinical Performance	Substantial Equivalence to Predicate (ROMA (HE4 EIA + ARCHITECT CA 125 II)) and Utility for Intended Use: To demonstrate aid in assessing high or low likelihood of malignancy in ovarian adnexal mass for pre/postmenopausal women, with acceptable sensitivity, specificity, PPV, and NPV.	For Stratification into High/Low Likelihood of Malignancy (EOC only): Premenopausal: Sensitivity 100.0% (9/9), Specificity 74.9% (167/223), PPV 13.8% (9/65), NPV 100.0% (167/167); Postmenopausal: Sensitivity 92.1% (35/38), Specificity 77.6% (111/143), PPV 52.2% (35/67), NPV 97.4% (111/114). Adjunctive Use with ICRA (All Malignancies & LMP): Combined Premenopausal & Postmenopausal: Sensitivity 88.1%, Specificity 67.5%, PPV 38.3%, NPV 96.1% (statistically significant improvement in NPV from 93.1% (ICRA alone) to 96.1% (Adjunctive)).
Method Comparison	Strong correlation with the predicate device for both premenopausal and postmenopausal women.	Premenopausal Women (n=53): Correlation Coefficient (r) = 0.9977, Intercept (-0.004), Slope (1.005); Postmenopausal Women (n=115): Correlation Coefficient (r) = 0.9953, Intercept (-0.103), Slope (0.999)
Matrix Comparison	Equivalence between serum and K2 EDTA plasma samples.	Premenopausal (n=86): y = 1.001(x) - 0.072; r = 0.9983; Postmenopausal (n=86): y = 1.004(x) - 0.058; r = 0.9988
Precision (Lot-to-Lot)	Acceptable %CV for ROMA scores across different lots.	Overall Total %CV for Premenopausal ROMA: 4.6% (Panel 1) to 0.0% (Panel 6); Overall Total %CV for Postmenopausal ROMA: 2.4% (Panel 1) to 0.1% (Panel 6)
Reproducibility (Site-to-Site)	Acceptable %CV for ROMA scores across different sites.	Overall Total %CV for Premenopausal ROMA: 8.1% (Panel 1) to 0.1% (Panel 6); Overall Total %CV for Postmenopausal ROMA: 5.2% (Panel 1) to 0.2% (Panel 6)
Analytical Specificity	Minimal interference from common endogenous interferents.	Mean Percent (%) Difference for all tested interferents (Free Bilirubin, Conjugated Bilirubin, Triglycerides, Hemoglobin, Total Protein, Immunoglobulin G, Biotin, HAMA, Rheumatoid Factor) was within a range of -2% to +1% for both pre- and post-menopausal ROMA scores.
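One way to read the method-comparison row is to evaluate each reported regression line at the clinical cutoff, since agreement near the cutoff is what determines whether a patient changes likelihood group. The sketch below assumes x is the predicate ROMA score and y the Lumipulse G ROMA score (the table does not say which is which), and uses the cutoffs 1.31 and 2.77 cited elsewhere in this summary; the function name is hypothetical.

```python
# (slope, intercept) as reported in the method-comparison row.
REGRESSIONS = {"pre": (1.005, -0.004), "post": (0.999, -0.103)}
# Clinical cutoffs cited in this summary for the ROMA score.
CUTOFFS = {"pre": 1.31, "post": 2.77}


def predicted_bias_at_cutoff(status):
    """Predicted y - x when x sits exactly at the clinical cutoff.

    Assumes x = predicate score and y = Lumipulse G ROMA score.
    """
    slope, intercept = REGRESSIONS[status]
    x = CUTOFFS[status]
    return slope * x + intercept - x


for status in ("pre", "post"):
    print(status, round(predicted_bias_at_cutoff(status), 4))
```

Under these assumptions the predicted offset is about +0.003 score units at the premenopausal cutoff and about -0.106 at the postmenopausal cutoff.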

2. Sample Sizes and Data Provenance for Test Set (Clinical Study)

Sample Size: A total of 450 women were evaluable in the clinical study test set.
- Premenopausal: 244
- Postmenopausal: 206
Data Provenance: The study was described as a prospective, multi-center, blinded clinical trial. The specific country of origin is not mentioned in the provided text.

3. Number of Experts and Qualifications for Ground Truth (Clinical Study)

Number of Experts: Not explicitly stated as a count of individual experts.
Qualifications: "An initial cancer risk assessment (ICRA) was completed by a non-gynecological oncologist". The specific years of experience or board certifications are not provided.
Ground Truth for Clinical Study: Histopathology reports collected after surgery were the definitive ground truth for malignancy.

4. Adjudication Method for the Test Set

The text describes the clinical trial as "blinded," implying that those interpreting the Lumipulse G ROMA results were blinded to the initial cancer risk assessment (ICRA) and histopathology, and vice-versa for the ICRA.
Adjudication Method: Not explicitly detailed beyond the "blinded" nature and the use of histopathology as the definitive truth. There is no mention of a 2+1 or 3+1 type of expert consensus for the initial clinical assessment.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

The study evaluated the adjunctive use of Lumipulse G ROMA with Initial Cancer Risk Assessment (ICRA). This is a form of comparative effectiveness study involving human readers (non-gynecological oncologists) with and without the device.
Effect Size (Improvement with AI vs. without AI assistance):
- The study showed a statistically significant improvement in the Negative Predictive Value (NPV) when Lumipulse G ROMA was used adjunctively with ICRA.
- NPV for classifying benign patients into the low likelihood group increased from 93.1% (ICRA alone) to 96.1% (Adjunctive). This represents an absolute increase of 3.0 percentage points in NPV.
- Other metrics for adjunctive use compared to ICRA alone:
  - Sensitivity increased from 72.6% to 88.1%
  - Specificity decreased from 84.2% to 67.5%
  - PPV decreased from 51.3% to 38.3%

6. Standalone (Algorithm Only) Performance

Yes, a standalone performance was done. The sections titled "Use of Lumipulse G ROMA for stratification into low likelihood and high likelihood groups for finding malignancy on surgery" and "The performance of Lumipulse G ROMA for stratification into low likelihood and high likelihood groups for premenopausal and postmenopausal women with epithelial ovarian cancer (EOC) only" directly present the performance of the Lumipulse G ROMA algorithm in isolation.
The results are presented for premenopausal and postmenopausal women separately, and for all cancer and LMP tumors combined.

7. Type of Ground Truth Used

For the clinical study, the definitive ground truth was histopathology reports collected after surgery. This is considered a high-quality, objective ground truth.

8. Sample Size for the Training Set

The submission does not explicitly mention a separate training set for the Lumipulse G ROMA algorithm itself.
The ROMA algorithm's equation (Predictive Index for premenopausal and postmenopausal women) and clinical cut-offs (1.31 and 2.77) are identical to the predicate device (ROMA (HE4 EIA + ARCHITECT CA 125 II) K103358). This suggests that the algorithm itself was likely developed and validated previously, and this submission focuses on the performance of the Lumipulse G HE4 and Lumipulse G CA125II assays within the established ROMA framework. The provided study serves as a clinical validation for the proposed device using these specific assays.

9. How the Ground Truth for the Training Set Was Established

Given that the algorithm and its cut-offs appear to be directly adopted from the predicate device and its previous development, the specific details of how the original training set for the ROMA algorithm's ground truth was established are not provided in this document.
For the assays themselves (Lumipulse G HE4 and CA125II), which are previously cleared devices, their calibration and standardization would have been established against reference materials, but this isn't a "training set" for the algorithm.

K Number

K151502

Device Name

ARCHITECT ROMA

Manufacturer

Fujirebio Diagnostics, Inc.

Date Cleared

2016-04-28

(329 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K093957, K042731

Intended Use

ARCHITECT Risk of Ovarian Malignancy Algorithm (ROMA) is a qualitative serum test that combines the results of ARCHITECT HE4, ARCHITECT CA 125 II and menopausal status into a numerical score.

ARCHITECT ROMA is intended to aid in assessing whether a premenopausal woman who presents with an ovarian adnexal mass is at high or low likelihood of finding malignancy on surgery. ARCHITECT ROMA is indicated for women who meet the following criteria: over age 18, ovarian adnexal mass present for which surgery is planned, and not yet referred to an oncologist. ARCHITECT ROMA must be interpreted in conjunction with an independent clinical and radiological assessment. The test is not intended as a screening or stand-alone diagnostic assay.

PRECAUTION: ARCHITECT ROMA should not be used without an independent clinical/radiological evaluation and is not intended to be a screening test or to determine whether a patient should proceed to surgery. Incorrect use of ARCHITECT ROMA carries the risk of unnecessary testing, surgery, and/or delayed diagnosis.

Device Description

ARCHITECT ROMA is a qualitative serum test that combines the results of 2 analytes, HE4 (ARCHITECT HE4) and CA125 (ARCHITECT CA 125 II) and menopausal status into a numerical score between 0.0 and 10.0. The premenopausal status must be based on ovarian function determined with information available from clinical evaluation and medical history.

The test system consists of the ARCHITECT HE4 assay, the ARCHITECT CA 125 II assay and the ARCHITECT i2000SR. The ARCHITECT i2000SR is capable of calculating the ROMA score. The immunoassays are performed according to the directions detailed in each product insert.

AI/ML Overview

This document describes the ARCHITECT ROMA, a qualitative serum test that combines results of ARCHITECT HE4, ARCHITECT CA 125 II, and menopausal status to provide a numerical score. It aims to assess the likelihood of malignancy in women with an ovarian adnexal mass for whom surgery is planned.

Here's an analysis of the acceptance criteria and the study that proves the device meets them:

1. Table of Acceptance Criteria and Reported Device Performance

The acceptance criteria are not explicitly laid out as distinct "criteria" with corresponding "acceptance values" in the provided document. Instead, the document presents clinical performance metrics, specifically sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for both standalone ROMA and its adjunctive use with Initial Cancer Risk Assessment (ICRA). The tables below summarize the reported device performance for these key metrics.

Standalone ARCHITECT ROMA Performance for Stratification into Likelihood of Finding Malignancy on Surgery

(Based on N=238 Premenopausal and N=184 Postmenopausal women, identifying Epithelial Ovarian Cancer, EOC)

Metric	Premenopausal Performance (95% CI)	Postmenopausal Performance (95% CI)
Sensitivity	100.0% (70.1, 99.2)	92.3% (79.7, 97.2)
Specificity	86.5% (81.4, 90.3)	85.5% (78.9, 90.3)
PPV	22.5% (12.3, 37.4)	63.2% (50.2, 74.4)
NPV	100.0% (98.1, 99.9)	97.6% (93.3, 99.2)
Prevalence	3.8%	21.2%

Adjunctive Use of ARCHITECT ROMA with ICRA Performance

(Based on N=85 Malignancies and N=374 No Malignancy, covering EOC, LMP, non-epithelial ovarian cancer, other gynecologic, and non-gynecologic cancers)

Metric	ICRA Performance (95% CI)	ARCHITECT ROMA Performance (95% CI)	Adjunctive Performance (95% CI)
Sensitivity	72.9% (62.7, 81.2)	80.0% (70.3, 87.1)	87.1% (78.3, 92.6)
Specificity	84.2% (80.2, 87.6)	86.1% (82.2, 89.2)	75.7% (71.1, 79.7)
PPV	51.2% (42.4, 60.0)	56.7% (47.7, 65.2)	44.8% (37.5, 52.5)
NPV	93.2% (90.0, 95.4)	95.0% (92.1, 96.8)	96.3% (93.4, 97.9)
Prevalence	18.5%	18.5%	18.5%
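The PPV and NPV rows in this table follow from the sensitivity, specificity, and prevalence rows through Bayes' rule, which makes it easy to sanity-check the table or to see how the predictive values would shift at a different prevalence. The sketch below uses the rounded percentages from the table, so small differences (about 0.1 percentage point) from the reported values are expected; the function and dictionary names are illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


PREVALENCE = 0.185
# (sensitivity, specificity) as rounded in the table above.
TABLE = {
    "ICRA": (0.729, 0.842),
    "ROMA": (0.800, 0.861),
    "adjunctive": (0.871, 0.757),
}

for name, (sens, spec) in TABLE.items():
    ppv, npv = predictive_values(sens, spec, PREVALENCE)
    print(f"{name}: PPV {ppv * 100:.1f}%, NPV {npv * 100:.1f}%")
```

Both predictive values depend on the 18.5% prevalence in this surgical cohort, so they should not be carried over to populations with a different mix of benign and malignant masses.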

2. Sample Size Used for the Test Set and Data Provenance

Sample Size for Clinical Study (Test Set):
- Total: 459 women
- Premenopausal: 250
- Postmenopausal: 209
- For the performance tables (sensitivity, specificity, etc.):
  - Premenopausal: 238
  - Postmenopausal: 184
  - Adjunctive use: 459 (85 malignancies, 374 no malignancy)
Data Provenance: The study was a "prospective, multi-center, blinded clinical trial." The country of origin is not explicitly stated, but the submission is to the US FDA, implying data collected for a US market, potentially including US sites.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

The document states that "The corresponding histopathology reports were collected after surgery" for each patient to determine the truth (benign or malignant). It does not specify the number of individual experts (e.g., pathologists) involved in establishing the histopathological ground truth or their specific qualifications (e.g., years of experience). However, histopathological classification is a standard and well-established method for confirming benign or malignant status in medical practice.

4. Adjudication Method for the Test Set

The document does not describe a formal adjudication method for discrepancies in ground truth or interpretation of results. The ground truth was established by "histopathology reports," which typically represent a definitive diagnosis.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, and the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance

No, a multi-reader, multi-case (MRMC) comparative effectiveness study focusing on human reader improvement with AI assistance was not explicitly detailed in this document. The study evaluated the adjunctive use of ARCHITECT ROMA with ICRA (Initial Cancer Risk Assessment, which is typically a clinical and radiological assessment performed by human experts).

The document states: "Adjunctive use ARCHITECT ROMA with ICRA produced a statistically significant improvement in the negative predictive value (NPV). The NPV for correctly classifying benign patients into the low likelihood group increased from 93.2 to 96.3%, making the adjunctive use of ARCHITECT ROMA with ICRA effective in ruling out malignancy."

While this shows an improved performance metric (NPV) when ARCHITECT ROMA is used in conjunction with ICRA, it does not quantify human reader performance with and without AI assistance in an MRMC setting. It rather compares the performance of ICRA alone versus ICRA plus ROMA.

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done

Yes, a standalone performance evaluation of ARCHITECT ROMA was conducted. The table "Stratification of 422 women... into low likelihood and high likelihood groups for finding malignancy on surgery using ARCHITECT ROMA" directly reports the sensitivity, specificity, PPV, and NPV of the ARCHITECT ROMA algorithm alone, without explicit human-in-the-loop assessment beyond the initial patient selection criteria.

7. The Type of Ground Truth Used

The primary ground truth used for the clinical study was histopathology reports collected after surgery, classifying ovarian adnexal masses as benign or malignant.

8. The Sample Size for the Training Set

The document does not explicitly state the sample size for the training set used to develop the ARCHITECT ROMA algorithm itself. It mentions that ARCHITECT HE4 and ARCHITECT CA 125 II are "previously cleared devices" and their analytical performance was validated separately. The clinical study described in this document serves as a validation or test set for the combined ARCHITECT ROMA algorithm, rather than a training set for its development. The algorithm's equations are provided, indicating it was developed and fixed prior to this validation study.

9. How the Ground Truth for the Training Set Was Established

Since the clinical study described is a validation study and not a development study for the algorithm, the document does not provide details on how the ground truth for an initial training set (if any, for the specific ROMA algorithm) was established. It's plausible the algorithm was developed using a separate dataset, or by leveraging data from the original ROMA™ (HE4 EIA + ARCHITECT CA 125 II) predicate device (K103358), which also used "histopathological findings" as ground truth in its own studies.

K Number

K150588

Device Name

OVA1 Next Generation

Manufacturer

VERMILLION, INC.

Date Cleared

2016-03-18

(375 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K081754

Intended Use

The OVA1 Next Generation test is a qualitative serum test that combines the results of five immunoassays into a single numeric result. It is indicated for women who meet the following criteria: over age 18, ovarian adnexal mass present for which surgery is planned, and not yet referred to an oncologist.

The OVA1 Next Generation test is an aid to further assess the likelihood that malignancy is present when the physician's independent clinical and radiological evaluation does not indicate malignancy. The test is not intended as a screening or stand-alone diagnostic assay.

Device Description

The OVA1 Next Generation (NG) test consists of software, instruments, assays and reagents. The software incorporates the results of serum biomarker concentrations from five immunoassays to calculate a single, unitless numeric result indicating a low or high risk of ovarian malignancy.

The assays used to generate the numeric result (OVA1 NG test result) are APO, CA 125 II, FSH, HE4 and TRF.

Biomarker values are determined using assays on the Roche cobas® 6000 system, which is a fully automated, software-controlled system for clinical chemistry and immunoassay analysis. The biomarker assays are run according to the manufacturer's instructions as detailed in the package insert for each reagent.

The OVA1 NG software (OvaCalc v4.0.0) contains a proprietary algorithm that utilizes the results (values) from the five biomarker assays, (APO, CA 125 II, FSH, HE4 and TRF). The assay values from the cobas 6000 system are either imported into OvaCalc through a .csv file or manually entered into the OvaCalc user interface to generate an OVA1 NG test result between 0.0 and 10.0. A low- or high-risk result is then determined by comparing the software-generated risk score to a single cutoff (low-risk result

AI/ML Overview

Here's an analysis of the acceptance criteria and study findings for the OVA1 Next Generation device, based on the provided text:

1. Table of Acceptance Criteria and Reported Device Performance

The document does not explicitly state pre-defined acceptance criteria for the OVA1 Next Generation device in terms of specific performance thresholds (e.g., "Sensitivity must be >X%"). Instead, the study focuses on demonstrating substantial equivalence to a predicate device (the original OVA1) and showing improvements in certain metrics.

However, based on the clinical performance evaluation and comparisons to the predicate, we can infer the primary goal was to at least maintain sensitivity while significantly improving specificity and positive predictive value. The "clinically small" and "clinically significant" definitions also act as implicit criteria for comparison with the predicate.

Here's a table summarizing the reported comparative performance:

Metric (vs. Predicate OVA1)	OVA1 Next Generation Performance (with PA)	OVA1 Next Generation Performance (Standalone)	Goal/Inferred Acceptance Criterion
Specificity (Overall)	Improved by ~14% (64.8% vs 50.9%)	Improved by ~16% (69.1% vs 53.6%)	Significantly improved specificity while maintaining sensitivity
Specificity (Postmenopausal)	Improved ~23% (60.9% vs 37.8%)	Improved ~24% (65.4% vs 41.0%)	Significantly improved specificity
Specificity (Premenopausal)	Improved ~8% (67.3% vs 59.2%)	Improved ~10% (71.4% vs 61.6%)	Significantly improved specificity
Sensitivity (Overall)	Differences "clinically small" (~ -2.17%) (93.5% vs 95.7%)	Differences "clinically small" (~ -1.09%) (91.3% vs 92.4%)	Maintain similar sensitivity
Sensitivity (Postmenopausal)	Differences "clinically small" (~ -1.64%) (95.1% vs 96.7%)	Identical (91.8% vs 91.8%)	Maintain similar sensitivity
Sensitivity (Premenopausal)	Differences "clinically small" (~ -3.23%) (90.3% vs 93.5%)	Differences "clinically small" (~ -3.23%) (90.3% vs 93.5%)	Maintain similar sensitivity
Positive Predictive Value (PPV) (Overall)	N/A (Only Standalone data provided for PPV comparison)	Improved by 9% (40.4% vs 31.4%)	Significantly improved PPV
Negative Predictive Value (NPV) (Overall)	Differences "substantially equivalent" (~0.35%) (97.7% vs 98.1%)	Differences "substantially equivalent" (~0.35%) (97.2% vs 96.8%)	Maintain similar NPV
Precision (%CV)	1.54% (Overall)	1.54% (Overall)	Better than or equivalent to predicate (Predicate was 4.09%)
Reproducibility (%CV)	1.63% (Overall)	1.63% (Overall)	Better than or equivalent to predicate (Predicate was 2.80%)

2. Sample Size Used for the Test Set and Data Provenance

Test Set Sample Size:
- Clinical Performance Evaluation: 493 evaluable subjects (from an initial 519 enrolled). Split into 276 premenopausal and 217 postmenopausal.
- Clinical Specificity - Healthy Women Study: 152 healthy women (68 premenopausal, 84 postmenopausal).
- Clinical Specificity - Other Cancers and Disease States: 401 samples from women with various non-ovarian cancers and benign conditions.
- Method Comparison (Archived Samples): 133 samples (28 primary ovarian malignancies, 105 benign ovarian conditions) for a direct comparison with the predicate.
Data Provenance: The primary clinical study used a banked sample set from a prospective, multi-site pivotal study of OVA1 – the OVA500 Study. The archived samples were used to conduct a side-by-side clinical validation for Substantial Equivalence purposes. The method comparison study (Table 10) also used archived samples collected from selected larger prospective studies, tested within one year of collection. The document does not specify the country of origin for the OVA500 study, but it is implied to be a US study given the FDA submission.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

The ground truth for the clinical performance evaluation was established by postoperative pathology diagnosis, which was recorded at each enrolling site and independently reviewed.
The document does not specify the number of experts or their explicit qualifications (e.g., "radiologist with 10 years of experience") for this pathology review. However, "postoperative pathology diagnosis" generally implies review by trained pathologists.

4. Adjudication Method for the Test Set

The document explicitly states that postoperative pathology diagnosis was "independently reviewed." However, the exact adjudication method (e.g., 2+1, 3+1, none) for discrepancies in pathology results is not detailed in the provided text.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done

No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not done in the traditional sense of evaluating multiple human readers' performance with and without AI assistance.
The study compared the performance of:

The OVA1 Next Generation device itself (standalone).
The original OVA1 device (predicate) itself (standalone).
A "dual assessment" combining Physician Assessment (PA) OR the device result (OVA1 Next Generation or original OVA1).

While Physician Assessment (PA) involves human readers (clinicians), the study design evaluates the addition of the device's score to the physician's assessment, rather than directly measuring improvement of human readers performing a task with AI assistance vs without.

Effect Size of Human Readers Improvement with AI vs. Without AI Assistance: Not directly measurable from the provided data in the context of an MRMC study. The "with PA" results show the combined performance, where the device acts as an "aid" to PA.

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was done

Yes, a standalone (algorithm only) performance evaluation was done.
Table 6 and Table 7 explicitly present "Standalone Specificity" and "Standalone Sensitivity" for the OVA1 Next Generation device without Physician Assessment (PA) in the risk calculation.

7. The Type of Ground Truth Used

The primary ground truth used for the clinical performance evaluation was pathology (postoperative pathology diagnosis), which was independently reviewed.

8. The Sample Size for the Training Set

The document does not provide the sample size for the training set for the OVA1 Next Generation algorithm. It describes the device's algorithm and its inputs but focuses on the validation study using a banked sample set without detailing the original training cohort.

9. How the Ground Truth for the Training Set Was Established

Since the sample size for the training set is not provided, the method for establishing its ground truth is also not detailed. It is implied that the algorithm was developed (and likely trained/tuned) using similar pathology-confirmed data, but specifics are absent in this document.

K Number

K103358

Device Name

ROMA (HE4 EIA + ARCHITECT CA 125 II)

Manufacturer

FUJIREBIO DIAGNOSTICS, INC.

Date Cleared

2011-09-01

(289 days)

Reference & Predicate Devices

K081754

Intended Use

For In Vitro Diagnostic Use Only.

The Risk of Ovarian Malignancy Algorithm (ROMA™) is a qualitative serum test that combines the results of HE4 EIA, ARCHITECT CA 125 II™ and menopausal status into a numerical score.

ROMA is intended to aid in assessing whether a premenopausal or postmenopausal woman who presents with an ovarian adnexal mass is at high or low likelihood of finding malignancy on surgery. ROMA is indicated for women who meet the following criteria: over age 18; ovarian adnexal mass present for which surgery is planned, and not yet referred to an oncologist. ROMA must be interpreted in conjunction with an independent clinical and radiological assessment. The test is not intended as a screening or stand-alone diagnostic assay.

PRECAUTION: ROMA (HE4 EIA + ARCHITECT CA 125 II) should not be used without an independent clinical/radiological evaluation and is not intended to be a screening test or to determine whether a patient should proceed to surgery. Incorrect use of ROMA (HE4 EIA + ARCHITECT CA 125 II) carries the risk of unnecessary testing, surgery, and/or delayed diagnosis.

Device Description

The Risk of Ovarian Malignancy Algorithm (ROMA™) is a qualitative serum test in the form of a mathematical function combining the results of HE4 EIA, ARCHITECT CA 125 II™ and menopausal status into a numerical score.

ROMA was developed in a training set using separate logistic regression equations for premenopausal and postmenopausal women:

Premenopausal woman:
Predictive Index (PI) = -12.0 + 2.38 * LN[HE4] + 0.0626 * LN[CA 125]

Postmenopausal woman:
Predictive Index (PI) = -8.09 + 1.04 * LN[HE4] + 0.732 * LN[CA 125]

ROMA = exp(PI) / [1 + exp(PI)] * 10

ROMA is used to stratify women into likelihood groups for finding cancer on surgery. In order to provide a specificity level of 75%, a cut point of ≥ 1.31 was used for premenopausal women and ≥ 2.77 was used for postmenopausal women who present with an ovarian adnexal mass. Women with ROMA results at or above these cut points are at high likelihood of finding malignancy on surgery.
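As an illustration only, the two regression equations and the cut points quoted above can be sketched in Python. The function names and the "pre"/"post" status labels are hypothetical, and the inputs are assumed to be the HE4 and CA 125 values in the units reported by the respective assays; this is a reading aid, not the cleared software.

```python
import math

# Coefficients (intercept, HE4 weight, CA 125 weight) as quoted in the
# device description; the "pre"/"post" keys are illustrative labels.
COEFFS = {
    "pre": (-12.0, 2.38, 0.0626),
    "post": (-8.09, 1.04, 0.732),
}

# Cut points quoted for a 75% specificity level.
CUT_POINTS = {"pre": 1.31, "post": 2.77}


def roma_score(he4, ca125, status):
    """Return the ROMA value on the 0-10 scale used in this summary."""
    intercept, w_he4, w_ca125 = COEFFS[status]
    # Predictive Index: linear combination of the natural logs of both markers.
    pi = intercept + w_he4 * math.log(he4) + w_ca125 * math.log(ca125)
    # Logistic transform, then scaling by 10 as in the quoted formula.
    return math.exp(pi) / (1.0 + math.exp(pi)) * 10.0


def roma_likelihood(he4, ca125, status):
    """Classify as 'high' when the score is at or above the cut point."""
    return "high" if roma_score(he4, ca125, status) >= CUT_POINTS[status] else "low"


if __name__ == "__main__":
    # Made-up marker values, used only to exercise the functions.
    print(round(roma_score(70.0, 35.0, "pre"), 2), roma_likelihood(70.0, 35.0, "pre"))
```

Because the logistic transform is monotonic, comparing the score with the cut point is equivalent to comparing the Predictive Index with a fixed threshold; the 0-10 scaling only affects how the result is reported.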

The test system consists of the assays, reagents, software and instrument used to obtain the ROMA result. The ROMA instructions for use are provided with the HE4 EIA Kit. The HE4 EIA and ARCHITECT CA 125 II are performed according to the manufacturers' directions detailed in each product insert. The immunoassays used in ROMA are:

HE4 EIA: The HE4 EIA is an enzyme immunometric assay for the quantitative determination of HE4 in human serum.
ARCHITECT CA 125 II: The ARCHITECT CA 125 II assay is a Chemiluminescent Microparticle Immunoassay (CMIA) for the quantitative determination of OC 125 defined antigen in human serum and plasma on the ARCHITECT i System.

AI/ML Overview

1. Table of Acceptance Criteria and Reported Device Performance

The document does not explicitly state "acceptance criteria" percentages. However, it does present performance metrics as a demonstration of the device's effectiveness at specific cut-points. The key performance metrics are Sensitivity and Specificity for Epithelial Ovarian Cancer (EOC) stratification, and Negative Predictive Value (NPV) for the adjunctive use of ROMA with Initial Cancer Risk Assessment (ICRA).

Here's a table based on the provided performance data for Epithelial Ovarian Cancer (EOC) only and Adjunctive Use (All cancers & LMP Tumors):

ROMA Performance for Stratification of Epithelial Ovarian Cancer (EOC) Only

Metric	Premenopausal (95% CI)	Postmenopausal (95% CI)
Sensitivity	100.0% (66.4% - 100%)	92.3% (79.1% - 98.4%)
Specificity	74.5% (68.3% - 80.2%)	76.8% (69.3% - 83.2%)

Adjunctive Use of ROMA with ICRA (All Malignancies)

Metric	ICRA Only (95% CI)	ROMA Only (95% CI)	Adjunctive (ROMA + ICRA) (95% CI)
Sensitivity	73.3% (63.1% - 81.4%)	82.6% (73.2% - 89.1%)	88.4% (79.9% - 93.5%)
Specificity	84.3% (80.2% - 87.6%)	75.5% (70.9% - 79.5%)	67.2% (62.3% - 71.8%)
Negative Predictive Value	93.2% (90.0% - 95.4%)	95.0% (91.9% - 96.9%)	96.2% (93.1% - 97.9%)

The document specifies ROMA cut-points to achieve a specificity level of 75%:

Premenopausal women: ROMA value ≥ 1.31 for high likelihood of finding malignancy (specificity 74.5% reported).
Postmenopausal women: ROMA value ≥ 2.77 for high likelihood of finding malignancy (specificity 76.8% reported).

2. Sample Size and Data Provenance

Test set sample size: 461 women were evaluable in the study. This was broken down into 240 premenopausal women and 221 postmenopausal women.
Data provenance: The data were obtained from a "prospective, multi-center, blinded clinical trial". The country of origin is not explicitly stated.

3. Number of Experts and Qualifications for Ground Truth

The document mentions that an "Initial cancer risk assessment (ICRA) was completed by a non-gynecological oncologist". It does not specify the number of non-gynecological oncologists, their years of experience, or their exact qualifications beyond their specialty.

4. Adjudication Method for the Test Set

The primary ground truth for malignancy was based on histopathology reports collected after surgery. There is no explicit mention of an adjudication method among multiple readers for the histopathology, implying a single official report was used.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No MRMC study is explicitly presented in the provided text. The document compares ROMA's performance to an "Initial Cancer Risk Assessment (ICRA)" but does not describe ICRA as a multi-reader assessment or compare human readers with and without AI assistance. It evaluates the "adjunctive use of ROMA with ICRA" as a combined approach versus ICRA alone and ROMA alone.

Effect size of human readers improving with AI vs. without AI: Not directly applicable as an MRMC study was not described. However, the NPV for classifying benign patients into the low likelihood group increased from 93.2% (ICRA only) to 96.2% (Adjunctive use with ROMA + ICRA), implying an improvement in ruling out cancer with the addition of ROMA. This is an improvement of 3.0 percentage points in NPV.

6. Standalone Performance

Yes, standalone performance for ROMA was done and is reported.

The "Use of ROMA for stratification into low likelihood and high likelihood groups for finding malignancy on surgery" section provides standalone performance metrics for ROMA (Sensitivity, Specificity, etc.) for premenopausal and postmenopausal women.
The "Adjunctive use of ROMA with Initial Cancer Risk Assessment (ICRA)" section also includes a "ROMA" column, which can be interpreted as standalone ROMA performance for comparison.

7. Type of Ground Truth Used

The ground truth used was histopathology reports collected after surgery. This is considered a definitive diagnostic method for determining malignancy.

8. Sample Size for the Training Set

ROMA's development is mentioned to have used a "training set" for separate logistic regression equations. However, the sample size for this training set is not provided in the document.

9. How the Ground Truth for the Training Set Was Established

The document states ROMA "was developed in a training set using separate logistic regression equations". While it implies the training set would also have used histopathology as ground truth, similar to the test set, the method for establishing ground truth for the training set is not explicitly detailed.

K Number

DEN090004

Device Name

OVA1 TEST

Manufacturer

VERMILLION

Date Cleared

2009-09-11

(51 days)

Reference & Predicate Devices

N/A

Intended Use

The OVA1™ Test is a qualitative serum test that combines the results of five immunoassays into a single numerical score. It is indicated for women who meet the following criteria: over age 18; ovarian adnexal mass present for which surgery is planned, and not yet referred to an oncologist. The OVA1™ Test is an aid to further assess the likelihood that malignancy is present when the physician's independent clinical and radiological evaluation does not indicate malignancy. The test is not intended as a screening or stand-alone diagnostic assay.

PRECAUTION: The OVA1™ Test should not be used without an independent clinical/radiological evaluation and is not intended to be a screening test or to determine whether a patient should proceed to surgery. Incorrect use of the OVA1™ Test carries the risk of unnecessary testing, surgery, and/or delayed diagnosis.

Device Description

The OVA1™ Test uses OvaCalc Software to incorporate the values for 5 analytes from separately run immunoassays (described below) into a single numerical score between 0.0 and 10.0.

The cleared test system consists of the software, instruments, assays and reagents used to obtain the OVA1™ Test result. The immunoassays and reagents are sold separately from the OvaCalc Software. Users are instructed to use only those lots identified by Vermillion. The immunoassays are performed according to the manufacturers' directions detailed in each product insert. The analytes and corresponding tests and calibrators used in the OVA1™ Test are:

Analyte	Device (Assay and Calibrator)	Instrument
CA 125	Elecsys CA 125 II; CA125 II CalSet	Roche Elecsys 2010
Prealbumin	N Antisera to Human Prealbumin and Retinol-binding Protein; N Protein Standard SL (human)	Siemens BN II
Apolipoprotein A-1	N-Antisera to Human Apolipoprotein A-1 and Apolipoprotein B; N Apolipoprotein Standard Serum (human)	Siemens BN II
β2-microglobulin	Human Beta-2 Microglobulin Latex Enhanced Nephelometric Kit (Binding Site)	Siemens BN II
Transferrin	N Antisera to Human Transferrin and Haptoglobin; N Protein Standard SL (human)	Siemens BN II

The user enters results of the five analytes manually into an Excel spreadsheet together with the headers needed by OvaCalc Software. There is no physical or electronic connection between the immunoassay devices and the OvaCalc Software. Using an algorithm and the values of these 5 analytes, the OvaCalc Software generates a single unit-less numerical score from 0.0 to 10.0.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study detailed in the provided document:

1. Table of Acceptance Criteria and Reported Device Performance

The document does not explicitly state pre-defined acceptance criteria in terms of target performance metrics (e.g., minimum sensitivity or specificity values). Instead, it presents the performance characteristics observed in the clinical validation study. The closest thing to acceptance criteria for the clinical performance seems to be the demonstration that the "True Positive Rate (TPR) exceeded the False Positive Rate (FPR)" with statistical significance. The other criteria relate to analytical performance, such as precision and stability, which are met through demonstrated performance.

Criterion Type	Acceptance Criteria (Implicit/Explicit)	Reported Device Performance
Analytical Performance
Precision (Total %CV)	Acceptable within-run, between-run, between-day, between-operator, and between-site %CV. (Explicitly stated 250 RU/mL caused significant interference; specimens with RF > 250 RU/mL are not appropriate for the test.
Clinical Performance
Statistical Informativeness	True Positive Rate (TPR) must exceed False Positive Rate (FPR) with statistical significance for combined data, pre-menopausal subjects, and post-menopausal subjects.	All combined data: TPR (87.5%) > FPR (49.2%); difference of 38.3% (95% CI: 26.5% to 47.8%) statistically significant. Pre-menopausal: TPR (80.8%) > FPR (43.2%); difference of 37.6% (95% CI: 16.7% to 52.2%) statistically significant. Post-menopausal: TPR (91.3%) > FPR (58.2%); difference of 33.1% (95% CI: 17.3% to 46.1%) statistically significant.
Adjunctive Information Value (Dual Assessment for Non-GO)	Dual assessment (Physician's pre-surgical assessment + OVA1™ Test) should provide additional information compared to physician's assessment alone, specifically by increasing sensitivity for malignancy and maintaining (or improving) NPV. The benefit of detecting additional true positive cases should outweigh the additional false positives for the intended use population.	Sensitivity: Increased from 72.2% (single assessment) to 91.7% (dual assessment). Specificity: Decreased from 82.7% to 41.6%. PPV: Decreased from 60.5% to 36.5%. NPV: Increased from 89.1% to 93.2%. (95% CI for the 4.1% increase in NPV was -0.5% to 8.7%, borderline statistical significance). Conclusion notes "sufficient benefit".
Adjunctive Information Value (Dual Assessment for GO)	Corroborative results to non-GO analysis regarding additional information provided by dual assessment.	Sensitivity: Increased from 77.5% (single assessment) to 98.9% (dual assessment). Specificity: Decreased from 74.7% to 25.9%. PPV: Decreased from 63.3% to 42.9%. NPV: Increased from 85.5% to 97.6% (95% CI for the 12.1% increase in NPV was 5.7% to 18.6%, statistically significant). Conclusion notes "corroborative, but not dispositive" for intended use.

2. Sample Sizes and Data Provenance

Test Set (Clinical Validation Study):

Total Enrolled: 743 patients.
Training Set (from enrollment): 146 subjects were set aside for training, with 21 not evaluable, leaving 125 for training.
Final Evaluable Test Set: 516 subjects/samples (after excluding training set and those with missing info/lack of sample).
- Non-GO Physician Evaluated Subset: 269 patients.
- GO Physician Evaluated Subset: 247 patients.
Data Provenance: Prospective, multicenter, double-blind clinical study. Samples were collected from 27 demographically mixed subject enrollment sites. A US setting is implied by the FDA submission context but is not stated explicitly in the provided text.

Training Set:

Training Set 1: 284 pre-operative serum samples from the University of Kentucky.
- Complete laboratory data for 274 samples (109 malignant, 175 benign).
Training Set 2: A randomly selected subset of 146 pre-operative serum samples collected under a clinical trial specimen repository.
- 21 not evaluable, leaving 125 samples (89 benign, 10 LMPs, 19 EOCs, 1 primary, 3 non-primary ovarian cancers, 3 other malignancies).

3. Number of Experts and Qualifications for Test Set Ground Truth

The document does not specify a "number of experts" used to establish ground truth in the sense of independent review of imaging or clinical data. Instead, the ground truth for malignancy status in the clinical validation study was established through histopathology results from tissue samples obtained during surgical intervention.

The clinical assessments made by physicians (non-GO and GO) were used for comparative purposes against the device and as part of the "dual assessment" scenario, but these were not explicitly designated as "expert ground truth" for the device's diagnostic accuracy. The ultimate ground truth for classification of benign vs. malignant was pathology.

4. Adjudication Method for the Test Set

The document does not describe an "adjudication method" in the typical sense of multiple expert readers reviewing cases and resolving disagreements.

The OVA1™ Test results were generated by an algorithm from five immunoassay values.
The clinical pre-surgical assessments were made by individual physicians (non-GO or GO).
The ground truth outcome was histopathology.

The comparison was between these individual components or combinations, not between adjudicated expert readings.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No, a multi-reader multi-case (MRMC) comparative effectiveness study, as typically understood for imaging devices where human readers interpret cases with and without AI assistance, was not performed.

This study compared:

The OVA1™ Test alone.
The physician's (non-GO or GO) pre-surgical assessment alone.
A "Dual Assessment" (physician's assessment OR OVA1™ Test positive).

The "effect size of how much human readers improve with AI vs without AI assistance" is not reported in the traditional MRMC sense. Instead, the document quantifies the change in clinical performance metrics (Sensitivity, Specificity, PPV, NPV) when the OVA1™ Test is combined with the physician's pre-surgical assessment, rather than directly measuring physician improvement while using AI.

For non-GO physicians, when using dual assessment:

Sensitivity for malignancy increased from 72.2% (single assessment) to 91.7% (dual assessment).
NPV increased from 89.1% to 93.2%.

For GO physicians, when using dual assessment:

Sensitivity for malignancy increased from 77.5% (single assessment) to 98.9% (dual assessment).
NPV increased from 85.5% to 97.6%.
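The "OR" rule behind the dual assessment can be illustrated with a minimal Python sketch. The ten cases below are invented for illustration and are not study data, and the helper names are hypothetical.

```python
def sensitivity(pred, truth):
    """Fraction of malignant cases that were called positive."""
    calls = [p for p, t in zip(pred, truth) if t]
    return sum(calls) / len(calls)


def specificity(pred, truth):
    """Fraction of benign cases that were called negative."""
    calls = [not p for p, t in zip(pred, truth) if not t]
    return sum(calls) / len(calls)


# Invented toy data (NOT from the study). True = malignant / positive call.
truth     = [True, True, True, True, False, False, False, False, False, False]
physician = [True, True, False, False, True, False, False, False, False, False]
device    = [True, False, True, False, False, True, True, False, False, False]

# Dual assessment: positive if either the physician or the test is positive.
dual = [p or d for p, d in zip(physician, device)]

for name, calls in [("physician", physician), ("device", device), ("dual", dual)]:
    print(name, sensitivity(calls, truth), specificity(calls, truth))
```

Because an OR rule can only add positive calls, sensitivity under dual assessment cannot be lower than under the physician's assessment alone, and specificity cannot be higher. That is the same direction of change as in the figures reported above.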

6. Standalone (Algorithm Only) Performance

Yes, a standalone performance assessment (algorithm only without human-in-the-loop performance) was done for the OVA1™ Test.

The "Performance Characteristics of the OVA1™ Test Alone" section directly presents its sensitivity, specificity, NPV, and PPV compared to histopathology for patients evaluated by non-GO physicians (and similarly for GO physicians, though these were deemed less relevant for the intended use population).

Standalone Performance (Non-GO Physician Population):

Sensitivity: 87.5% (63/72)
Specificity: 50.8% (100/197)
NPV: 91.7% (100/109)
PPV: 39.4% (63/160)
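The four percentages follow directly from the counts shown in parentheses. The short Python sketch below recomputes them; the variable names are illustrative, and the false-negative and false-positive counts are derived from the listed fractions rather than quoted from the document.

```python
# Counts taken from the fractions listed above (non-GO physician population).
TP, N_MALIGNANT = 63, 72   # sensitivity numerator and denominator
TN, N_BENIGN = 100, 197    # specificity numerator and denominator

# Derived counts (not quoted directly in the summary).
FN = N_MALIGNANT - TP      # malignant cases with a negative test
FP = N_BENIGN - TN         # benign cases with a positive test

sensitivity = TP / (TP + FN)   # true positive rate (TPR)
specificity = TN / (TN + FP)
npv = TN / (TN + FN)           # 100 / 109
ppv = TP / (TP + FP)           # 63 / 160
fpr = 1.0 - specificity        # false positive rate
tpr_minus_fpr = sensitivity - fpr

for label, value in [
    ("Sensitivity", sensitivity),
    ("Specificity", specificity),
    ("NPV", npv),
    ("PPV", ppv),
    ("FPR", fpr),
    ("TPR - FPR", tpr_minus_fpr),
]:
    print(f"{label}: {value * 100:.1f}%")
```

The derived false positive rate and the TPR minus FPR difference agree with the 49.2% and 38.3% figures given in the acceptance criteria table in section 1.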

7. Type of Ground Truth Used (for Clinical Studies)

The primary ground truth used for establishing clinical performance was histopathology results from tissue samples obtained during surgical intervention. Malignancy status was determined based on these reports.

8. Sample Size for the Training Set

The algorithm was derived using two independent training datasets:

Training Set 1: 284 pre-operative serum samples (274 evaluable: 109 malignant, 175 benign).
Training Set 2: 146 pre-operative serum samples (125 evaluable: 89 benign, 10 LMPs, 19 EOCs, 1 primary and 3 non-primary ovarian cancers, 3 other malignancies).

9. How the Ground Truth for the Training Set Was Established

The document states that the training sets consisted of "pre-operative serum samples" which were classified into categories like "benign diseases," "ovarian tumors of low malignant potential (LMP)," "epithelial ovarian cancers," etc. This classification was presumably based on histopathology obtained from the surgical specimens after mass removal, although the document does not say so explicitly.
