
AESKUSLIDES® ANCA is an indirect immunofluorescence assay utilizing human neutrophil granulocyte coated slides, fixed with Ethanol or Formalin, as a substrate for the qualitative and semi-quantitative determination of anti-neutrophil cytoplasmic autoantibodies (ANCA) in human serum by manual microscopy or with the HELIOS® AUTOMATED IFA SYSTEM.

This in vitro diagnostic assay is used as an aid for the diagnosis of ANCA-associated vasculitides (AAV) in conjunction with other clinical and laboratory findings. All suggested results obtained with the HELIOS AUTOMATED IFA SYSTEM must be confirmed by trained personnel.

Device Description

AESKUSLIDES ANCA is an indirect immunofluorescence assay utilizing human neutrophil granulocyte coated slides, fixed with Ethanol or Formalin, as a substrate for the qualitative and semi-quantitative determination of anti-neutrophil cytoplasmic autoantibodies (ANCA) in human serum by manual microscopy or with the HELIOS® AUTOMATED IFA SYSTEM.

This in vitro diagnostic assay is used as an aid for the diagnosis of ANCA-associated vasculitides (AAV) in conjunction with other clinical and laboratory findings. All suggested results obtained with the HELIOS AUTOMATED IFA SYSTEM must be confirmed by trained personnel.

Slides coated with human neutrophil granulocytes for autoantibody detection are fixed by one of two methods: ethanol (EtOH) fixation or formalin fixation. Ethanol fixation allows cell components to move through the cells after the fixation process. Formalin fixation cross-links cellular components, so they can no longer move and the patterns remain distinct. By processing serum on both Ethanol- and Formalin-fixed slides, the user can confirm whether the pattern is C-, P-, or A-ANCA, according to the table below.

AI/ML Overview

This document outlines the acceptance criteria and supporting studies for the AESKUSLIDES® ANCA Ethanol and AESKUSLIDES® ANCA Formalin devices, which are indirect immunofluorescence assays for the qualitative and semi-quantitative determination of anti-neutrophil cytoplasmic autoantibodies (ANCA).

1. Table of Acceptance Criteria and Reported Device Performance

The provided document details various performance studies with corresponding acceptance criteria and results. Below is a summary of these:

Study/Metric	Acceptance Criteria	Reported Device Performance (Overall/Range)
Serum Stability (Freeze/Thaw)	Pos/Neg/Overall Agreement: > 85%; All patterns found correctly; Pattern Agreement: > 85%; FI allowed to differ max +/- 1 from expected; FI Agreement: > 85%	Pos/Neg/Overall Agreement: 100% for both Ethanol and Formalin. All positive samples found positive, all negative found negative. Pattern Agreement: 100% for both Ethanol and Formalin. All patterns found as expected. FI Agreement: 100% for both Ethanol and Formalin. No deviations > +/-1 FI observed.
Long Term Serum Stability	Positive sera found positive, negative sera found negative throughout testing period. Correct patterns found. FI allowed to differ max +/- 1 from expected value at each time point.	All criteria fulfilled. Positive samples found positive, negative samples found negative. All patterns found correctly at all test time points. FI did not differ more than +/- 1 level from expected values.
Within-Lab Precision	Positive sera found positive, negative sera found negative. Reported FI allowed to differ max. +/- 1 level within study. HELIOS (% Positives of positive samples excl. borderline): > 80%; Reader Confirmation (% Positives of positive samples excl. borderline): > 90%; Manual (% Positives of positive samples excl. borderline): > 90%. Same criteria for % Negatives of negative samples.	ANCA Ethanol: HELIOS: % Positive 86.7-100%, % Negative 83.3-86.7%. User Confirmation: % Positive 96.7-100%, % Negative 93.3-100%. Manual: % Positive 90-100%, % Negative 100%. ANCA Formalin: HELIOS: % Positive 100%, % Negative 90-96.7%. User Confirmation: % Positive 100%, % Negative 90-100%. Manual: % Positive 100%, % Negative 91.7-100%. All acceptance criteria met. FI within +/- 1 level. Pattern consistent >95% for B/C, >85% for A.
Between-Lab Precision	Positive, Negative, Overall, Pattern, FI Agreements: Method A: ≥ 70%; Method B, C: ≥ 90% (borderline samples excluded in some calculations)	ANCA Ethanol: Overall-Between-Lab (Method A): 92.5% to 96.5%. Overall-Between-Lab (Method B): 98.5% to 100%. Overall-Between-Lab (Method C): 100%. ANCA Formalin: Overall-Between-Lab (Method A): 89.8% to 94.7%. Overall-Between-Lab (Method B): 95.8% to 98.6%. Overall-Between-Lab (Method C): 98.2% to 99.4%. Acceptance criteria met for all methods and sites, with an exception noted for Method A on Formalin (Site 3 Negative Agreement 61.7%, but justified).
Between-Operator Agreement	All agreements > 90% for Method B, C (borderline samples excluded for Ethanol). HELIOS not applicable for Between-Operator.	ANCA Ethanol: Overall-Between-Operator (Method B): 97.1% to 100%. Overall-Between-Operator (Method C): 100%. ANCA Formalin: Overall-Between-Operator (Method B): 93.5% to 99%. Overall-Between-Operator (Method C): 97.5% to 100%. All acceptance criteria met.
Single-Operator Agreement	All agreements > 90% for Method B, C (borderline samples excluded for Ethanol). HELIOS not applicable for Single-Operator.	ANCA Ethanol: Overall-Single-Operator (Method B): 96.7% to 100%. Overall-Single-Operator (Method C): 100%. ANCA Formalin: Overall-Single-Operator (Method B): 92.7% to 99%. Overall-Single-Operator (Method C): 97% to 100%. All acceptance criteria met.
Instrument Precision	All agreements > 70% for Method A (HELIOS)	ANCA Ethanol: Overall-Instrument to Instrument (Method A): 91.3% to 99.2%. ANCA Formalin: Overall-Instrument to Instrument (Method A): 89% to 90.7%. All acceptance criteria met, except for ANCA Formalin negative agreement at site 3 (61.7% instead of 70%), which was addressed.
Lot to Lot Precision	Positive, Negative, Overall, Total Pattern, Single Pattern (C/P/A), FI Agreements: > 85%	ANCA Ethanol (combined readers): Positive agreement: 96.3% to 100%. Negative agreement: 100%. Overall agreement: 96.9% to 100%. Pattern agreement: 100%. FI agreement: 97.8% to 100%. ANCA Formalin (combined readers): Positive agreement: 99.5% to 100%. Negative agreement: 96.2% to 99.0%. Overall agreement: 98.4% to 99.7%. Pattern agreement: 100%. FI agreement: 97.5% to 99.1%. All acceptance criteria met.
Carry Over	Pos/Neg/Overall Agreement: All positive sera found positive, all negative found negative. All patterns found correctly. FI allowed to differ max +/- 1 from expected value.	All samples fulfilled criteria. No carry over was observed from well to well. All positive samples identified as positive, all negative as negative. All patterns identified correctly.
Time Extension Study	Pos/Neg/Overall Agreement: All positive sera found positive, all negative found negative. FI allowed to differ max +/- 1 from expected value.	All acceptance criteria fulfilled. All positive identified as positive, all negative as negative. All patterns found as expected. FI did not deviate more than +/-1 level.
Interfering Substances	Pos/Neg/Overall Agreement: > 90%. Pattern Agreement: > 90%. FI allowed to differ max +/- 1 from expected value; FI Agreement: > 90%.	ANCA Ethanol: Positive Agreement 97-100%; Negative Agreement 100%; Overall Agreement 98-100%; Pattern Agreement 97-100%; FI Agreement 98-100%. ANCA Formalin: Positive Agreement 100%; Negative Agreement 92-100%; Overall Agreement 98-100%; Pattern Agreement 100%; FI Agreement 100%. All acceptance criteria met. No interference detected.
Accelerated Stability Report	Positive, Negative, Overall, Total Pattern, FI Agreements: > 85%	ANCA Ethanol (both readers): Positive agreement 89.3-96.4%; Negative agreement 92.9-100%; Overall agreement 90.6-96.9%; Pattern agreement 86.7-95.9%; FI agreement 95.5-98.2%. ANCA Formalin (both readers, borderline excl.): Positive agreement 91.3-97.5%; Negative agreement 100%; Overall agreement 95-98.6%; Pattern agreement 91.3-97.5%; FI agreement 96.4-98.6%. All acceptance criteria met. Claims shelf life 24+3 months for Ethanol, 18 months for Formalin.
Real Time Stability Report	Positive, Negative, Overall, Pattern, FI Agreements: > 85%	ANCA Ethanol (both readers): Positive agreement 94.8-100%; Negative agreement 96.7-100%; Overall agreement 95-100%; Pattern agreement 94.8-100%; FI agreement 99.2-100%. ANCA Formalin (both readers): Positive agreement 88-90.7%; Negative agreement 97.8-100%; Overall agreement 91.7-94.2%; Pattern agreement 88-90.7%; FI agreement 99.6-100%. All acceptance criteria met for 3 months (ongoing study).
In Use Stability Report	Positive, Negative, Overall, Total Pattern, FI Agreements: > 85%	ANCA Ethanol (both readers): Positive agreement 98.6-100%; Negative agreement 100%; Overall agreement 98.8-100%; Pattern agreement 94.3-99.3%; FI agreement 100%. ANCA Formalin (both readers, borderline excl.): Positive agreement 97.5-98.8%; Negative agreement 100%; Overall agreement 98.3-99.2%; Pattern agreement 97.5-98.8%; FI agreement 98.3-100%. All acceptance criteria met for 6 weeks. Claims In Use Stability of 6 weeks.
Transport Stability Report	Positive, Negative, Overall, Total Pattern, FI Agreements: > 85%	Performed by Accelerated Stability Report data, demonstrating resistance to 37°C for at least 2 weeks. All criteria fulfilled.
Method Comparison (vs. Predicate)	Diagnostic sensitivity & specificity for ANCA Ethanol higher than predicate. Diagnostic sensitivity & specificity for ANCA Formalin comparable to predicate. Positive, Negative, Overall Agreements acceptable (67.1%, 88.3%, 79.3% for Ethanol; 80.5%, 91.8%, 89.9% for Formalin).	ANCA Ethanol: Sensitivity 48.5% (new) vs. 36.4% (predicate); Specificity 69.3% (new) vs. 55.2% (predicate). PPV 35.8% vs 22.2%, NPV 79.3% vs 71.1%. Agreements: Positive 67.1%, Negative 88.3%, Overall 79.3%. ANCA Formalin: Sensitivity 50.0% (new) vs. 37.9% (predicate); Specificity 90.7% (new) vs. 91.5% (predicate). PPV 65.3% vs 61.0%, NPV 83.7% vs 80.7%. Agreements: Positive 80.5%, Negative 91.8%, Overall 89.9%. All stated criteria met. New device comparable or better.
Method Comparison (A, B, C)	Positive, Negative, Overall Agreements between different methods: > 85%. Positive, Negative, Overall, Pattern Agreements (for clinical study): > 80%. (For Method A: > 70%).	ANCA Ethanol (Combined Readers): Method C vs B: Positive 86.2-90.6%, Negative 97.6-99.5%, Overall 91.7-95.5%, Pattern 82.5-89.2%. Method B vs A: Positive 79-89.6%, Negative 98.3-99%, Overall 92.2-96.8%, Pattern 81-85.6%. Method C vs A: Positive 70.7-82.4%, Negative 94.5-99.1%, Overall 89.7-93%, Pattern 77.8-81.8%. ANCA Formalin (Combined Readers): Method C vs B: Positive 86.6-89.1%, Negative 90.8-97.6%, Overall 90.3-93.3%, Pattern 82.1-87.5%. Method B vs A: Positive 79.8-99%, Negative 77.1-95.9%, Overall 83.8-95.9%, Pattern 76.1-90.8%. Method C vs A: Positive 73.1-95.6%, Negative 71.8-95%, Overall 78.2-90.5%, Pattern 69.7-79.6%. All acceptance criteria met, with one pattern agreement (C vs A Formalin) slightly below (69.7%) but addressed.
Endpoint Titer Comparison	Percentage of samples that differ max +/- 1 titer level: ≥ 90%; Titer Agreement: ≥ 80%	ANCA Ethanol (All Readers Combined): Within-Lab (Method B): 95.1% within +/-1 titer level. Within-Lab (Method C): 95.4% within +/-1 titer level. Between-Lab (Method B): 82.3-93.3% Titer Agreement. Between-Lab (Method C): 87.0-98.3% Titer Agreement. ANCA Formalin (All Readers Combined): Within-Lab (Method B): 94.7% within +/-1 titer level. Within-Lab (Method C): 96.1% within +/-1 titer level. Between-Lab (Method B): 79-86.7% Titer Agreement. Between-Lab (Method C): 80.3-93.7% Titer Agreement. All acceptance criteria met, with one Method B site comparison (79%) for Formalin slightly below but addressed.
Expected Values/Reference Range	Low number of positive samples in healthy donors consistent with literature.	ANCA Ethanol: 6/150 (4%) and 3/150 (2%) positive results for Readers 1 and 2, respectively. ANCA Formalin: 6/150 (4%) and 4/150 (2.7%) positive results for Readers 1 and 2, respectively. Low numbers correlate well with literature.
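
Several rows in the table above use the criterion that fluorescence intensity (FI) may differ by at most +/- 1 level from the expected value. A minimal sketch of how such an FI agreement percentage could be computed is shown below. The function name, the 0-4 FI scale, and the sample values are assumptions made for illustration; the source document does not describe its calculation.

```python
def fi_agreement(expected, observed, tolerance=1):
    """Percentage of readings whose FI level lies within `tolerance`
    levels of the expected FI level (hypothetical helper)."""
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same length")
    if not expected:
        raise ValueError("no readings to compare")
    within = sum(1 for e, o in zip(expected, observed) if abs(e - o) <= tolerance)
    return 100.0 * within / len(expected)


# Illustrative FI levels on an assumed 0-4 scale (not data from the document).
expected_fi = [0, 2, 3, 4, 1]
observed_fi = [0, 3, 3, 2, 1]

# Four of the five readings differ by at most one level.
fi_pct = fi_agreement(expected_fi, observed_fi)
print(f"FI agreement: {fi_pct:.1f}%")
```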

Note on Borderline Samples: Several studies (e.g., Within-Lab Precision, Between-Lab Precision, Accelerated Stability, In Use Stability) explicitly mention the handling of "borderline" samples (very low positive samples that can be evaluated as negative). For certain calculations, results are presented both including and excluding these samples, with justification for lower agreement when included. This indicates a robust statistical approach for handling results near the decision threshold.
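
As a rough illustration of why agreement is reported both with and without borderline samples, the sketch below computes positive, negative, and overall agreement for a small set of readings. The `Reading` class, the `agreements` function, and all sample values are hypothetical; the document does not publish per-sample data or its exact formulas.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    expected: str             # reference call: "pos" or "neg"
    observed: str             # call under test: "pos" or "neg"
    borderline: bool = False  # very low positive sample near the cutoff


def agreements(readings, exclude_borderline=False):
    """Positive, negative and overall percent agreement (hypothetical helper)."""
    rows = [r for r in readings if not (exclude_borderline and r.borderline)]
    pos = [r for r in rows if r.expected == "pos"]
    neg = [r for r in rows if r.expected == "neg"]

    def pct(hits, total):
        return 100.0 * hits / total if total else None

    return {
        "positive": pct(sum(r.observed == "pos" for r in pos), len(pos)),
        "negative": pct(sum(r.observed == "neg" for r in neg), len(neg)),
        "overall": pct(sum(r.observed == r.expected for r in rows), len(rows)),
    }


# Illustrative data: 4 clear positives, 2 borderline positives, 4 negatives.
readings = (
    [Reading("pos", "pos") for _ in range(4)]
    + [Reading("pos", "neg", borderline=True), Reading("pos", "pos", borderline=True)]
    + [Reading("neg", "neg") for _ in range(4)]
)

with_borderline = agreements(readings)
without_borderline = agreements(readings, exclude_borderline=True)
print(with_borderline)
print(without_borderline)
```

In this made-up data set a single borderline sample read as negative lowers positive agreement when borderline samples are included, while the figures excluding them are unaffected.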

2. Sample Sizes Used for the Test Set and Data Provenance

Total Clinical Samples: 630 clinical samples were used for the Clinical Evaluation and Method Comparison studies.

Provenance:

510 clinical samples were sourced from 10 BioBanks in the US (BioChain, BioReclamationIVT, Bioserve, ConversantBio, Cureline, DiscoveryLifeSciences, iSpecimen, Precision for Medicine, ProMedDx, and Vitrologic). These samples were selected based on diagnosis to reflect important conditions for the study.
120 serum samples were sourced from a German University Hospital to complement rare but important diagnoses (70 Wegener's Granulomatosis, 25 MPA, 25 Churg-Strauss Syndrome).
Retrospective/Prospective: The document does not explicitly state whether the studies were retrospective or prospective. However, the nature of acquiring samples from biobanks and the use of de-identified diagnoses strongly suggests a retrospective data collection approach for the main clinical sample set.
Healthy Donor Samples: An additional panel of 150 sera from healthy donors was used for the Expected Values/Reference Range study: 100 from Germany and 50 from the US.

The document states that the diagnostic criteria for the samples were in agreement with diagnostic standards used in the U.S. and Germany (e.g., ACR criteria), and that the US sample set was selected to include different ethnic groups to reflect the US population. All samples were checked for purity, volume, and contamination and deemed suitable for the study.

3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications

The ground truth for the clinical sample set was established based on the "diagnosis criteria of the different samples [that] have been made in agreement with diagnostic standards used in the U.S and Germany." A written statement from different serum suppliers is available on request. This implies that the initial diagnosis (ground truth) was established by medical professionals (e.g., clinicians) at the originating institutions (biobanks, university hospital) based on clinical and laboratory findings, prior to their inclusion in this study. The document does not specify the number or specific qualifications of these initial diagnosing experts.

For the subsequent "reading" or "evaluation" of slides within the various performance studies (e.g., Within-Lab, Between-Lab, Method Comparison, Stability studies), two independent readers/experts were consistently used. The qualifications of these readers are generally referred to as "trained personnel" or "trained operator." For instance, the intended use statement explicitly says "All suggested results obtained with the HELIOS AUTOMATED IFA SYSTEM must be confirmed by trained personnel." and "The device is for use by a trained operator in a clinical laboratory setting." Specific details on years of experience or board certification (e.g., "radiologist with 10 years of experience") are not provided for these internal study readers.

4. Adjudication Method for the Test Set

For the "reading" or "evaluation" of slides in the various performance studies:

The studies consistently involved two independent readers.
The results of these two readers were often calculated and presented separately as well as combined.
There is no explicit mention of an "adjudication" process (e.g., a 2+1 or 3+1 method) where a third, senior expert would resolve discrepancies between the two initial readers to establish a final ground truth for the study. Instead, the analysis focuses on the agreement between the readers and their agreement with either the expected reference values (for analytical studies) or comparison methods (for clinical studies). The concept of "User Confirmation" (Method B) implies that human oversight is always required for the automated results, but not necessarily a formal adjudication of discordant human reads.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done

Yes, a multi-reader, multi-case (MRMC) comparative effectiveness study was done. This is evident in the "Method Comparison of Method A, B, and C and clinical study" section (pages 85-96). This study compared:

Method C (Manual): Manual processing and manual reading by two independent readers.
Method B (Reader Confirmation): Automated processing/imaging by HELIOS, with manual reading of digital images by two independent readers.
Method A (HELIOS): Automated processing/imaging by HELIOS, with automated positive/negative classification by the HELIOS Vasculitis Pattern Plus software.

The study was conducted at three different study sites (two US, one German) using the entire 630-sample set, with two independent readers at each site for manual and reader confirmation methods.

Effect Size (AI vs. Human-in-the-Loop):
The document does not present the effect size in terms of how much human readers improve with AI vs. without AI assistance. Instead, it compares the performance (agreements and diagnostic sensitivities/specificities) between:

Human reading of traditionally processed slides (Method C).
Human reading of AI-processed images (Method B).
AI-only classification (Method A).

The statement that "All suggested results obtained with the HELIOS AUTOMATED IFA SYSTEM must be confirmed by trained personnel" (intended use) and the acceptance criteria for Method A (lower agreement accepted for automated-only results, e.g., >70% compared to >90% for human reads) consistently emphasize that the AI is an aid that requires human confirmation. The data implicitly supports that humans (Methods B and C) perform better than the standalone AI (Method A) in certain aspects (higher agreement percentages, higher diagnostic performance metrics for Human vs HELIOS in Formalin especially). For example, in ANCA Formalin, Method C vs A pattern agreement was only 69.7% while C vs B was 82.1-87.5%, highlighting the current limitations of standalone AI pattern recognition and the value of human reading (even with automated imaging).

6. If a Standalone (i.e. algorithm only without human-in-the-loop performance) Was Done

Yes, standalone performance of the algorithm (HELIOS Vasculitis Pattern Plus software, referred to as Method A) was evaluated.

Method A processed slides automatically, acquired images, and performed automated reading/interpretation.
Its performance was compared against manual reading (Method C) and human reading of automated images (Method B) in the "Method Comparison of Method A, B, and C" study.
As noted in point 5, the acceptance criteria for Method A were lower (e.g., ≥ 70% agreement) compared to the human-in-the-loop methods (≥ 90%). The results show that Method A generally achieved these lower thresholds but performed less ideally for pattern recognition.
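
The different thresholds per method can be illustrated with a small check. This is only a sketch: the `THRESHOLDS` mapping and `meets_criterion` function are hypothetical, the comparison is assumed to be "greater than or equal" (the summary uses both "> 70%" and "≥ 70%"), and the three values are taken from the performance table in section 1.

```python
# Assumed minimum agreement (percent) per method, as summarized above.
THRESHOLDS = {"A": 70.0, "B": 90.0, "C": 90.0}


def meets_criterion(method, agreement_pct, thresholds=THRESHOLDS):
    """True if the agreement reaches the method's threshold (assumes >=)."""
    return agreement_pct >= thresholds[method]


# (criterion applied, description, reported agreement in percent)
checks = [
    ("A", "Formalin, Site 3 negative agreement", 61.7),
    ("A", "Formalin, C vs A pattern agreement (lower bound)", 69.7),
    ("B", "Ethanol, overall between-lab agreement (lower bound)", 98.5),
]

results = {label: meets_criterion(method, value) for method, label, value in checks}
for label, ok in results.items():
    print(f"{label}: {'meets' if ok else 'below'} threshold")
```

The two values flagged as below threshold are the same ones the summary describes as exceptions that were justified or addressed.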

7. The Type of Ground Truth Used

Several types of ground truth were used across the studies:

Clinical Diagnosis (Outcomes Data / Expert Consensus): For the diagnostic sensitivity and specificity calculations (Method comparison against predicate, and Method A, B, C comparison), the ground truth for patient samples was their established "diagnosis" (e.g., ANCA-associated vasculitis (AAV), other diseases like SLE, RA, etc.). This diagnosis was made in agreement with US and German diagnostic standards (e.g., ACR criteria), implying a form of expert consensus based on clinical and laboratory findings.
Expected Results/Reference Values (Expert Consensus): For analytical performance studies (e.g., Precision, Stability, Carry Over, Time Extension, Interfering Substances), ground truth was often defined as "expected results" or "expected values" for specific samples (e.g., positive/negative status, specific pattern, fluorescence intensity). These expected values were likely established by experienced operators / experts during the initial characterization of the control and study samples. The repeated use of "correctly found" implies agreement with a pre-established reference.
Negative Healthy Donor Panel: For the "Expected Values/Reference Range" study, healthy donor samples confirmed to be negative for ANCA were used to establish a reference range, implicitly serving as a negative ground truth.

There is no mention of pathology or direct biopsy results as ground truth, which is typical for diagnoses like vasculitis given the nature of ANCA testing.
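
When clinical diagnosis serves as ground truth, diagnostic sensitivity, specificity, PPV, and NPV follow from a 2x2 table of assay result against diagnosis. The sketch below uses the standard definitions with hypothetical counts; the document reports only the resulting percentages, not the underlying counts, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic metrics, returned as percentages.

    tp: diseased and assay positive     fn: diseased and assay negative
    fp: non-diseased, assay positive    tn: non-diseased, assay negative
    """
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
    }


# Hypothetical counts for illustration only (not taken from the document).
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```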

8. The Sample Size for the Training Set

The document does not provide a specific sample size for the "training set" of the HELIOS AUTOMATED IFA SYSTEM. The provided information focuses on the validation of the device, particularly the performance evaluation of the final device using various test sets. The software's pattern recognition uses "SVM (Support Vector Machine) technology," which implies a machine learning approach. However, details on the dataset used to train this SVM model are not disclosed in this document.

9. How the Ground Truth for the Training Set Was Established

Similarly, since the training set details are not provided, the method for establishing its ground truth is also not explicitly described. For machine learning models like SVMs used in pattern recognition, the training data would typically be images with associated labels (ground truth) that are a result of expert annotation or consensus. Given the context of manual reading by "trained personnel" and the need for "confirmation by trained personnel," it is highly probable that the ground truth for any training set would have been established by multiple expert pathologists or laboratory professionals specializing in indirect immunofluorescence interpretation, likely through a consensus or adjudication process. However, this is an inference based on industry practice and the provided context, not a direct statement in the document.


K Number

K172348

Device Name

AESKUSLIDES nDNA (Crithidia luciliae), AESKUSLIDES nDNA (Crithidia luciliae) Demo Kit, AESKUSLIDES nDNA (Crithidia luciliae) Bulk kit x5, AESKUSLIDES nDNA (Crithidia luciliae) Bulk kit x10

Manufacturer

Aesku Diagnostics GmbH & Co. KG

Date Cleared

2018-02-16

(197 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K153117, K880742

Predicate For

N/A

Intended Use

AESKUSLIDES® nDNA (Crithidia luciliae) is an indirect immunofluorescence assay utilizing Crithidia luciliae coated slides as a substrate for the qualitative and/or semi-quantitative determination of antibodies to native double stranded DNA (dsDNA) in human serum. This in vitro diagnostic assay is used as an aid for the diagnosis of Systemic Lupus Erythematosus (SLE) in conjunction with other clinical and laboratory findings. The assay can be processed manually and analyzed at the microscope or processed and analyzed with HELIOS® AUTOMATED IFA SYSTEM. All suggested results obtained with the HELIOS® AUTOMATED IFA SYSTEM must be confirmed by trained personnel.

Device Description

AESKUSLIDES® nDNA (Crithidia luciliae) is an indirect immunofluorescence assay utilizing Crithidia luciliae coated slides as a substrate for the qualitative and/or semiquantitative determination of antibodies to native double stranded DNA (dsDNA) in human serum.

Each kit contains (Quantity depends on product variant):

- Slides, each containing 10 wells coated with Crithidia luciliae cells
- 4.0 ml vial containing Fluorescein (FITC) labelled Anti-human Antibody IgG conjugate in a solution of BSA, ready for use
- 0.5 ml vial of positive control containing human serum (diluted), ready for use
- 0.5 ml vial of negative control containing diluted human serum, ready for use
- 8.0 ml vial of mounting medium containing a solution of glycerol and PBS, ready for use
- 70 ml bottle of sample buffer, containing BSA and PBS, ready for use
- 100 ml bottle of wash buffer, concentrated buffer 1:10 in distilled water, containing BSA, PBS.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study information for the AESKUSLIDES® nDNA (Crithidia luciliae) device, based on the provided document:

Acceptance Criteria and Device Performance Study for AESKUSLIDES® nDNA (Crithidia luciliae)

1. Table of Acceptance Criteria and Reported Device Performance

The document presents several studies with specific acceptance criteria. Below is a summary for the Between-Lab Precision Study and the Lot-to-Lot Precision Study, as these provide clear, quantitative acceptance criteria and corresponding results. Other studies, like Serum Stability and Carryover, also met their respective criteria (e.g., 100% agreement, no significant deviations).

Between-Lab Precision Study (Excluding Borderline Samples - Method B: Reader Confirmation)

Type of Agreement	Acceptance Criteria	Reported Device Performance (All Sites, excluding borderline)
Positive Agreement	> 90%	97.9% (CI: 96.9 - 98.5%)
Negative Agreement	> 90%	99.4% (CI: 98 - 99.8%)
Overall Agreement	> 90%	98.2% (CI: 97.4 - 98.8%)

Between-Lab Precision Study (Excluding Borderline Samples - Method C: Manual)

Type of Agreement	Acceptance Criteria	Reported Device Performance (All Sites, excluding borderline)
Positive Agreement	> 90%	99.5% (CI: 99 - 99.8%)
Negative Agreement	> 90%	99.7% (CI: 98.4 - 100%)
Overall Agreement	> 90%	99.6% (CI: 99.1 - 99.8%)
Fluorescence Intensity Agreement	> 90%	97.5% (CI: 96.6 - 98.1%)

Lot-to-Lot Precision Study (Combined Readers)

Type of Agreement	Acceptance Criteria	Reported Device Performance (Combined Readers)
Positive Agreement	> 90%	100% (CI: 99.3 - 100%)
Negative Agreement	> 90%	100% (CI: 96.9 - 100%)
Overall Agreement	> 90%	100% (CI: 99.4 - 100%)
Fluorescence Intensity Agreement	> 90%	100% (CI: 99.4 - 100%)
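
The tables above report each agreement with a confidence interval, but this summary does not state the method or confidence level used. As one common choice for proportions near 100%, the sketch below computes a Wilson score interval. The use of the Wilson method, the 95% level (z = 1.96), and the sample count are all assumptions made for illustration.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    Returns (lower, upper) as fractions between 0 and 1.
    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical example: 120 of 120 readings in agreement.
lower, upper = wilson_interval(120, 120)
print(f"100% (CI: {100 * lower:.1f} - {100 * upper:.1f}%)")
```

Unlike the simple normal approximation, the Wilson interval does not collapse to zero width when the observed agreement is exactly 100%, which is why it is often preferred for results like these.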

2. Sample Size Used for the Test Set and Data Provenance

The primary clinical evaluation and method comparison studies used 776 clinical samples.

Test Set Size: 776 samples.
Data Provenance:
- 746 samples were obtained from 10 US BioBanks (BioChain, BioReclamationIVT, Bioserve, ConversantBio, Cureline, DiscoveryLifeSciences, iSpecimen, Precision for Medicine, ProMedDx, and Vitrologic).
- 30 serum samples were from a German University Hospital (used to complement rare diagnoses like Vasculitis).
Retrospective/Prospective: The samples were collected from BioBanks, indicating they are retrospective samples with pre-existing diagnoses.
Sample Characteristics: The samples were selected to reflect various diagnoses relevant to the study (e.g., 297 SLE, 479 other diseases) and different ethnic groups in the US population (White, Black/Black African, Asian, Hispanic). Serum stability over multiple freeze-thaw cycles and long-term storage was also evaluated and deemed acceptable.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

The ground truth for the diagnosis of the clinical samples (used in the method comparison studies) was established based on diagnostic standards used in the U.S. and Germany, such as ACR criteria. A written statement from different serum suppliers confirmed this.

For evaluation of the device performance within the studies (e.g., precision studies, clinical study), the ground truth for individual results was typically established by:

Two independent readers.
Qualifications of experts: The document consistently refers to "trained personnel" and "trained readers" to perform manual microscopy and confirm automated results. Specific certifications or years of experience (e.g., "Radiologist with 10 years of experience") are not explicitly provided in this document.

4. Adjudication Method for the Test Set

For studies involving human readers:

Results were analyzed by two independent readers.
The document implies a consensus-based approach or separate reporting of each reader's results for evaluation, but a formal adjudication method (like "2+1" or "3+1") where a third reader resolves discrepancies is not explicitly stated. For instance, in the Between-Lab and Within-Lab Precision studies, results for two readers were calculated separately and then combined. In cases of discrepancy (e.g., sample S1 in Within-Lab precision for Method B, where Reader 1 found 30 Negatives and Reader 2 found 6 Negatives and 24 Positives), the combined percentage reflects this disagreement rather than a formal adjudication.
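
The sample S1 example can be worked through numerically. The sketch below assumes that "combined" means pooling the readings of both readers, which the document does not state explicitly; the counts are the ones quoted above.

```python
from collections import Counter

# Readings for sample S1 (Within-Lab precision, Method B) as quoted above.
reader_1 = Counter(neg=30, pos=0)
reader_2 = Counter(neg=6, pos=24)

# Assumption: combined results pool all readings from both readers.
combined = reader_1 + reader_2
total = sum(combined.values())

pct_negative = 100.0 * combined["neg"] / total
pct_positive = 100.0 * combined["pos"] / total
print(f"Combined: {pct_negative:.1f}% negative, {pct_positive:.1f}% positive")
```

Under this pooling assumption the combined figure simply averages out the disagreement between the two readers; no reading is overruled, which is consistent with the absence of a formal adjudication step.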

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

The document does not explicitly describe a formal MRMC comparative effectiveness study in the sense of comparing human reader performance with and without AI assistance to quantify an "effect size" of improvement.

Instead, it evaluates the agreement between:

Method C (Manual): Manual processing and manual reading by human readers.
Method B (Reader Confirmation): Automated processing and automated imaging, followed by manual reading of digital images by human readers.
Method A (HELIOS): Automated processing, automated imaging, and automated software interpretation (with required human confirmation).

The study compared the performance and agreement between these methods. It highlights that Method A (software-only interpretation) had lower agreement and sensitivity compared to human interpretation (Method B and C), underscoring the need for human confirmation. However, it does not quantify how much human readers improve when using AI as an assistance tool, but rather assesses the standalone performance of the AI component and the human element with and without automated processing.

6. Standalone (Algorithm Only) Performance

Yes, standalone performance was evaluated:

Method A (HELIOS): This method represents the algorithm's standalone performance, where the "HELIOS DNA Pattern Plus Software" performs positive/negative classification.
Results: For Method A (HELIOS suggestion, excluding borderline samples) in the Between-Lab Precision Study:
- Positive Agreement: 72.7% (CI: 69.1 - 76%), across all sites
- Negative Agreement: 90% (CI: 84.7 - 93.6%), across all sites
- Overall Agreement: 76.5% (CI: 73.5 - 79.3%), across all sites
  The document notes that acceptance criteria for Method A were met for negative and overall agreement, but positive agreement was sometimes below 70% for individual site comparisons, explaining that this was due to out-of-focus images that the software couldn't interpret as positive. This reinforces the requirement for human confirmation.

7. Type of Ground Truth Used (Clinical Samples)

For the 776 clinical samples used in the method comparison studies, the "ground truth" for patient diagnosis (e.g., SLE, other rheumatic diseases, autoimmune liver diseases, infections, leukemia) was established based on diagnostic standards used in the U.S. and Germany (e.g., ACR criteria). This implies a clinical diagnosis based on a combination of clinical findings and existing laboratory tests, rather than a single definitive gold standard like pathology or outcome data.

For validation within the performance studies (e.g., precision, stability), the "expected result" (positive/negative, fluorescence intensity) of specific control or serum samples was pre-defined or derived from manual readings by trained personnel, which can be considered an expert consensus type of ground truth for analytical performance.

8. Sample Size for the Training Set

The document does not provide information regarding the sample size used for training the HELIOS DNA Pattern Plus Software (Method A). It mainly focuses on the performance evaluation of the device in various settings.

9. How the Ground Truth for the Training Set Was Established

Since information on the training set sample size is absent, how its ground truth was established is also not specified in this document.
