510(k) Data Aggregation
(98 days)
Post-NSE
The GT Metabolic MagDI System is intended for use in the creation of side-to-side duodeno-ileal anastomoses in minimally invasive and laparoscopic surgery. Once wound strength is sufficient to maintain the anastomosis, the device is passed from the body. The effects of this device on weight loss were not studied.
The GT Metabolic MagDI System is intended for use in adult patients > 21 years.
The MagDI System is comprised of two (2) GT Metabolic DI Magnet ("Magnet") devices delivered sequentially with a minimally invasive GT Metabolic Delivery System ("Delivery System"). Class I magnetic surgical instruments (GT Metabolic Laparoscopic Positioning Device: FDA Listing #D512834) are used to position the Magnets to the target anastomosis locations in the duodenum and ileum and connect the two Magnets. The device provides a method for the creation of a round (oval/circular) compression anastomosis.
After a period of approximately 7-21 days, a compression-induced necrosis of the tissue between the Magnets occurs and the whole device, together with the necrosed tissue that was compressed by the Magnets, detaches, and is naturally expelled with the stool.
The MagDI System components (Magnets and Delivery System), as shown in Figures 1 and 2 below, are provided sterile and are for single use.
Acceptance Criteria and Device Performance Study for MagDI System
1. Table of Acceptance Criteria and Reported Device Performance
The clinical study's primary endpoint focused on feasibility and performance. The acceptance criteria were:
Protocol Feasibility/Performance Criteria | Acceptance Criteria (target) | Reported Device Performance |
---|---|---|
Placement of the device with ≥90% alignment of Magnets | Successfully placed in all subjects | 49 (100%) |
Passage of the device without invasive re-intervention | Passage without re-intervention | 49 (100%) |
Creation of a patent anastomosis confirmed radiologically | Confirmed radiologically | 49 (100%) |
Note: The document also details extensive pre-clinical (bench and animal) testing with their own acceptance criteria, all of which were reported as "Pass." For brevity, only the clinical performance acceptance criteria are included in this table.
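The table reports 100% (49/49) success for each clinical endpoint without a confidence interval. As a rough illustration (not taken from the document), the "rule of three" approximation for zero observed failures gives a lower bound on the true success rate:

```python
# Hedged sketch: with 0 failures observed in n trials, the "rule of three"
# approximates the upper 95% bound on the true failure rate as 3/n,
# i.e., a lower bound of 1 - 3/n on the true success rate.
def rule_of_three_lower_bound(successes: int, n: int) -> float:
    """Approximate lower 95% bound on the success rate when no failures occurred."""
    assert successes == n, "rule of three applies only when 0 failures were observed"
    return 1.0 - 3.0 / n

print(round(rule_of_three_lower_bound(49, 49), 3))  # -> 0.939
```

So even with a perfect 49/49 result, the data support a true success rate of at least roughly 94% at the 95% confidence level.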
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size: 49 subjects.
- Data Provenance: The data were collected in a multi-center, open-label, two-stage clinical study (MAGNET Study, GTM-001 / NCT05322122) conducted across four centers in:
- Belgium
- Canada
- Republic of Georgia
- Spain
The study included follow-up visits at 3, 6, 9, and 12 months, indicating prospective data collection from the start of the study; the report itself summarizes data as of database closure.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
The document does not explicitly state the number of experts or their specific qualifications (e.g., radiologist with X years of experience) used to establish all aspects of clinical ground truth for the test set. However, for radiological confirmation of patent anastomoses, it can be inferred that qualified radiologists were involved. For surgical assessments and adverse event adjudication, surgeons and other medical professionals at the study sites were involved.
4. Adjudication Method for the Test Set
The document does not explicitly describe a specific adjudication method like "2+1" or "3+1" for establishing ground truth for the test set outcomes. However, safety outcomes, specifically the relationship of adverse events to the study device and procedure, were classified as "Possible, Probable, Definite, or Indeterminate" by clinicians. Events assessed as "probable or definite" were categorized as "Related" for causality in the report. This implies an internal adjudication process based on clinical assessment at the study sites.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
No Multi-Reader Multi-Case (MRMC) comparative effectiveness study was mentioned for the MagDI System. The device's primary clinical study did not compare its performance against human readers with or without AI assistance. The study design focused on the feasibility and safety of the device itself.
6. Standalone (Algorithm Only) Performance Study
The MagDI System does not contain software, as explicitly stated in the document. Therefore, a standalone (algorithm only) performance study was not applicable and not performed. The device is a mechanical system.
7. Type of Ground Truth Used
The ground truth for the clinical effectiveness endpoints was primarily based on:
- Radiological confirmation: To confirm the creation of a patent anastomosis.
- Direct observation/Clinical assessment: For device placement, alignment, and natural passage.
- Clinical assessment/Medical records: For the incidence and severity of adverse events, hospital stay, and device expulsion time.
8. Sample Size for the Training Set
The clinical study (MAGNET Study) served as the primary data for evaluating the device's performance related to its indications for use. There is no mention of a separate "training set" in the context of machine learning, as the device does not employ AI/machine learning. The 49 subjects in the clinical study are effectively the "test set" for regulatory evaluation.
9. How Ground Truth for the Training Set Was Established
As the device does not use AI/machine learning and thus has no "training set" in that context, this question is not applicable. The clinical study data was collected and evaluated as described in point 7.
(265 days)
Post-NSE
The Sofia 2 SARS Antigen+ FIA is a lateral flow immunofluorescent sandwich assay that is used with the Sofia 2 instrument for the rapid, qualitative detection of SARS-CoV-2 nucleocapsid protein antigens directly in anterior nasal swab specimens from individuals with signs and symptoms of upper respiratory infection (i.e., symptomatic) when testing is started within 6 days of symptom onset. The test is intended for use as an aid in the diagnosis of SARS-CoV-2 infections (COVID-19) in symptomatic individuals when tested at least twice over three days with at least 48 hours between tests.
The test does not differentiate between SARS-CoV and SARS-CoV-2.
A negative test result is presumptive, and it is recommended these results be confirmed by a molecular SARS-CoV-2 assay. Negative results do not preclude SARS-CoV-2 infections and should not be used as the sole basis for treatment or other patient management decisions.
Positive results do not rule out co-infection with other respiratory pathogens.
Performance characteristics for SARS-CoV-2 were established during the 2021-2022 SARS-CoV-2 pandemic when SARS-CoV-2 Omicron was the predominant SARS-CoV-2 variant in circulation. When other SARS-CoV-2 virus variants emerge, performance characteristics may vary.
This test is intended for prescription use only and can be used in Point-of-Care settings.
The Sofia 2 SARS Antigen+FIA is based upon a lateral flow technology that employs immunofluorescence technology in a sandwich design that is used with Sofia 2 to detect nucleocapsid protein from the SARS-CoV-2 virus in human anterior nasal swab specimens.
The patient sample is placed in the Reagent Tube, during which time the virus particles in the sample are disrupted, exposing internal viral nucleoproteins. After disruption, the sample is dispensed into the Test Cassette sample well. The test strip is composed of the following biochemical components dried and immobilized onto the nitrocellulose membrane: 1) a sample pad that receives the specimen; 2) a label pad that contains fluorescent detection micro-particles coated with monoclonal antibodies specific for SARS-CoV-2 nucleocapsid antigen; 3) embedded monoclonal antibodies specific for SARS-CoV-2 nucleocapsid antigen that capture the antigen-microparticle complex at the test line location. The sample pad facilitates migration of the sample fluid across the nitrocellulose strip into the absorbent pad. The test strip also contains a desiccant that does not participate in the assay but serves as a stabilizing agent during storage.
Sample is applied to the sample well and migrates through the test strip, passing through the test and control lines. If SARS-CoV-2 viral antigen is present, it will be bound by the fluorescent microparticles in the label pad region, forming an antigen-microparticle complex. The test line is coated with monoclonal antibodies specific to SARS-CoV-2 nucleocapsid antigen and is intended to capture the antigen-microparticle complex. If SARS-CoV-2 viral antigen is not present, the fluorescent microparticles will not be trapped by the capture antibodies nor detected by Sofia 2.
The Sofia 2 SARS Antigen+FIA employs antibody-tagged microparticles dyed with a fluorescent compound, to be detected and read by the Sofia 2 reader instrument. The Sofia 2 analyzers automatically scan/image the test strip, collect and analyze the fluorescence data, and then calculate and report the result as either positive, negative, or invalid.
Additionally, the Sofia 2 Antigen+ FIA utilizes a reference line for the Sofia 2 reader (to locate the test line and negative control line) and a procedural control (to assess for sample presence and adequate sample flow). No colored lines will be visible in the test window of the fluorescent assay cassette, thereby preventing visual interpretation of the test results. The operator must use the Sofia 2 analyzer to obtain a test result.
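The instrument-read decision logic described above (procedural control check, then a threshold on the test-line signal) can be sketched as a toy classifier. This is illustrative only, not Quidel's actual algorithm; the cutoff values and signal names are hypothetical assumptions:

```python
# Toy sketch of instrument-read lateral-flow result logic.
# NOT the actual Sofia 2 algorithm; cutoffs and parameter names are invented.
def classify(test_signal: float, control_signal: float,
             cutoff: float = 1.0, control_min: float = 0.5) -> str:
    """Map fluorescence readings to positive / negative / invalid."""
    if control_signal < control_min:
        # Procedural control failed: sample presence/flow not confirmed.
        return "invalid"
    return "positive" if test_signal >= cutoff else "negative"

print(classify(1.8, 0.9))  # -> positive
print(classify(0.2, 0.9))  # -> negative
print(classify(1.8, 0.1))  # -> invalid (procedural control failed)
```

The key point mirrored here is that the result is computed entirely by the instrument; the operator never interprets the strip visually.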
The Sofia SARS Antigen FIA Control Swabs are intended to be used as quality control samples, representative of positive and negative test samples, to demonstrate that the reagents are functional and that the assay procedure is performed correctly.
Acceptance Criteria and Device Performance for Sofia 2 SARS Antigen+ FIA
The Sofia 2 SARS Antigen+ FIA is a qualitative lateral flow immunoassay designed for rapid detection of SARS-CoV-2 nucleocapsid protein antigens in anterior nasal swab specimens. The following details outline the acceptance criteria and the studies conducted to prove the device meets these criteria.
1. Acceptance Criteria and Reported Device Performance
Study Trait | Acceptance Criteria (Implicit from Study Design and Desired Performance) | Reported Device Performance |
---|---|---|
Precision/Repeatability (Intra-site) | High Negative samples (0.04 x LoD) should demonstrate expected negative agreement (e.g., >90%); Low Positive samples (1 x LoD) expected positive agreement (e.g., >95%); Moderate Positive samples (3 x LoD) high expected positive agreement (e.g., >95%); zero (or very few) invalid test results throughout the study. | Negative samples: 99.4% expected negative agreement (159/160); High Negative (0.04 x LoD): 95.0% (152/160); Low Positive (1 x LoD): 98.1% (157/160); Moderate Positive (3 x LoD): 99.4% (159/160); 0 invalid test results out of 640 replicates. |
Reproducibility (Inter-site) | High Negative samples (0.04 x LoD) should demonstrate reasonable negative agreement across sites and operators; Low Positive (1 x LoD) and Moderate Positive (3 x LoD) samples should demonstrate high positive agreement across sites and operators; zero (or very few) invalid test results. | Negative samples: 100.0% expected negative agreement (120/120); High Negative (0.04 x LoD): 55.0% (66/120) (note: lower than typical ideal scenarios, but likely deemed acceptable given the nature of a "high negative" near the detection limit); Low Positive (1 x LoD): 99.2% (119/120); Moderate Positive (3 x LoD): 99.2% (119/120); 0 invalid test results out of 480 samples. |
Analytical Specificity (Cross-reactivity & Interference) | No cross-reactivity with a defined panel of common respiratory pathogens (bacteria, viruses, fungus) at specified concentrations; no interference from common endogenous and exogenous substances found in nasal specimens at specified concentrations; 100% negative agreement in the absence of SARS-CoV-2; 100% positive agreement in its presence. | All 28 organisms/viruses tested showed 100.0% negative agreement (5/5 replicates) for cross-reactivity and 100.0% positive agreement (5/5 replicates) for interference; all 13 endogenous/exogenous substances tested showed 100.0% positive agreement (5/5 replicates) and 100.0% negative agreement (5/5 replicates). |
Limit of Detection (LoD) | The device should consistently detect SARS-CoV-2 at a specific low concentration (LoD) with high positivity (e.g., 95% or 100%) in confirmatory studies; negative clinical matrix should consistently produce negative results. | Confirmed LoD: 1.44 x 10^4 TCID50/mL; at the confirmed LoD: 100% positivity (20/20 replicates) across both tested lots; NCM (Negative Clinical Matrix): 0% positivity (0/5 replicates). |
High-dose Hook Effect | No false negative results should be observed at very high concentrations of SARS-CoV-2. | All spiked samples from 10x LoD up to the maximum virus concentration tested (highest concentration unspecified) were 100% positive (5/5 replicates each); no hook effect observed. |
Inclusivity | The device should detect clinically relevant SARS-CoV-2 strains/variants (e.g., Delta, Omicron BA.1, BA.2) with high positivity. | Heat-inactivated SARS-CoV-2 (isolate Italy-INMI1): 100.0% positivity (5/5 replicates) at 2.43E+05 TCID50/mL; Delta B.1.617.2: 100.0% (5/5) at 1.00E+04 TCID50/mL; Omicron BA.1: 100.0% (5/5) at 2.36E+04 TCID50/mL; Omicron BA.2: 100.0% (5/5) at 8.22E+03 TCID50/mL. |
Clinical Performance | Demonstrate acceptable Positive Percent Agreement (PPA) and Negative Percent Agreement (NPA) compared to a highly sensitive RT-PCR comparator, within defined confidence intervals. The specific thresholds would be pre-defined by regulatory guidelines (e.g., FDA's EUA templates or De Novo requirements for diagnostics); while not explicitly stated as a numerical criterion, the presented results suggest the performance met the agency's expectations for a Class II device. | PPA (Clinical Sensitivity): 89.0% (97/109; 95% CI: 81.7% to 93.6%); NPA (Clinical Specificity): 99.6% (470/472; 95% CI: 98.5% to 99.9%). |
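The reported confidence intervals are consistent with Wilson score intervals for a binomial proportion, although the document does not state which interval method was used. A sketch that reproduces them from the raw counts:

```python
import math

def wilson_ci(k: int, n: int, z: float = 1.96):
    """95% Wilson score interval for a binomial proportion k successes in n trials."""
    p = k / n
    denom = 1 + z**2 / n
    center = p + z**2 / (2 * n)
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center - half) / denom, (center + half) / denom

lo, hi = wilson_ci(97, 109)    # PPA: 97 of 109 PCR-positives detected
print(round(lo, 3), round(hi, 3))   # -> 0.817 0.936 (matches 81.7% - 93.6%)
lo, hi = wilson_ci(470, 472)   # NPA: 470 of 472 PCR-negatives correctly negative
print(round(lo, 3), round(hi, 3))   # -> 0.985 0.999 (matches 98.5% - 99.9%)
```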
2. Sample Size and Data Provenance
Test Set Sample Sizes:
- Precision/Repeatability: 640 replicates (160 replicates per analyte level/kit lot for 4 levels x 2 lots).
- Reproducibility: 480 replicates (120 replicates per analyte level x 4 levels).
- Analytical Specificity (Cross-reactivity & Interference):
- Cross-reactivity: 5 replicates per organism/virus (28 tested) = 140 replicates for negative agreement. 5 replicates per organism/virus + 2xLoD SARS-CoV-2 (25 tested, some "Not Tested" for interference) = 125 positive replicates for interference.
- Interfering Substances: 5 replicates per substance (13 tested) for both positive and negative agreement = 130 replicates.
- Limit of Detection (LoD):
- Preliminary LoD: 5 replicates per dilution (6 dilutions) x 2 lots = 60 replicates.
- Confirmatory LoD: 20 replicates at preliminary LoD concentration for each of 2 lots = 40 replicates.
- High-dose Hook Effect: 5 replicates per concentration (4 concentrations) = 20 replicates.
- Inclusivity: 5 replicates per strain/variant (4 tested) = 20 replicates.
- Clinical Study (Accuracy): 581 evaluable subjects.
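The replicate totals above follow directly from the listed study designs. As an arithmetic check (the precision breakdown is ambiguous in the text between 160 per level and 160 per level per lot, so this assumes 160 replicates per analyte level summed across both kit lots):

```python
# Sanity check of the replicate counts listed above (arithmetic only).
counts = {
    "precision": 4 * 160,                  # 4 analyte levels x 160 replicates
    "reproducibility": 4 * 120,            # 4 levels x 120 replicates
    "cross_reactivity_negative": 28 * 5,   # 28 organisms x 5 replicates
    "interference_positive": 25 * 5,       # 25 organisms spiked with 2xLoD SARS-CoV-2
    "interfering_substances": 13 * 5 * 2,  # 13 substances, positive + negative arms
    "lod_preliminary": 6 * 5 * 2,          # 6 dilutions x 5 replicates x 2 lots
    "lod_confirmatory": 20 * 2,            # 20 replicates x 2 lots
    "hook_effect": 4 * 5,                  # 4 concentrations x 5 replicates
    "inclusivity": 4 * 5,                  # 4 strains/variants x 5 replicates
}
for name, total in counts.items():
    print(name, total)
```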
Data Provenance:
- Country of Origin: Not explicitly stated for analytical studies, but given the FDA review, it is implicitly expected to be a regulated environment. For the clinical study, it was a multi-center study in a "CLIA-waived" setting, which refers to US clinical laboratories.
- Retrospective or Prospective:
- Analytical Performance Studies (Precision, Reproducibility, Analytical Specificity, LoD, Hook Effect, Inclusivity): These are typically prospective, lab-controlled experiments with contrived samples designed specifically for the study.
- Clinical Studies: "multi-center, prospective study conducted from August 2021 to November 2022."
3. Number of Experts and Qualifications for Ground Truth
- Analytical Studies: For analytical studies (Precision, Reproducibility, LoD, Cross-reactivity, Interference, Hook Effect, Inclusivity), the ground truth is established by the known concentrations of spiked analytes (e.g., heat-inactivated SARS-CoV-2, other microbes, interfering substances). This does not involve human experts establishing ground truth beyond standard laboratory technician expertise in preparing and measuring these concentrations.
- Clinical Study: The ground truth for the clinical study was established by a "highly sensitive Emergency Use Authorization (EUA) authorized RT-PCR comparator assay." This is a laboratory-based molecular test, often considered the gold standard for SARS-CoV-2 detection, and does not involve human experts in the conventional sense of image adjudication or clinical consensus. The RT-PCR results themselves serve as the ground truth.
4. Adjudication Method for the Test Set
- Analytical Studies: Not applicable. Ground truth is determined by precise laboratory methods and known concentrations.
- Clinical Study: Not applicable. The comparator RT-PCR assay is a definitive laboratory test; there is no mention of a human-based adjudication process for the RT-PCR results.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No, a MRMC comparative effectiveness study was not done.
- This device is an automated, instrument-read diagnostic assay (Sofia 2 SARS Antigen+ FIA). The results are generated by the instrument (Sofia 2 Analyzer) based on fluorescent signals and programmed algorithms, not by human interpretation of images or signals. Therefore, a human-in-the-loop study to assess how human readers improve with AI assistance is not relevant to this type of device. The study evaluates the standalone performance of the device compared to a reference method (RT-PCR).
6. Standalone Performance (Algorithm Only without Human-in-the-Loop)
- Yes, the performance data presented for the Sofia 2 SARS Antigen+ FIA are standalone performance, i.e., algorithm only without human-in-the-loop performance.
- The "Sofia 2 analyzers automatically scan/image the test strip, collect and analyze the fluorescence data, and then calculate and report the result as either positive, negative, or invalid." The operator only uses the analyzer to obtain the result; they do not visually interpret the test or make diagnostic decisions based on human observation of the test strip.
7. Type of Ground Truth Used
- Analytical Studies: The ground truth was based on known concentrations of heat-inactivated SARS-CoV-2 or other microorganisms/substances in controlled laboratory settings. This is a form of "spiked sample" ground truth.
- Clinical Study: The ground truth was established using a "highly sensitive Emergency Use Authorization (EUA) authorized RT-PCR comparator assay." This is a laboratory-based molecular diagnostic method, considered the gold standard for detecting SARS-CoV-2 RNA.
8. Sample Size for the Training Set
- No information is provided about a specific "training set" related to an AI/algorithm development in the context of machine learning. The algorithms referenced ("method-specific algorithms," "software specific cutoff," "specific algorithms") for the Sofia 2 analyzer relate to processing fluorescent signals and determining thresholds for positive/negative results, which are typically derived from extensive analytical characterization and optimization using laboratory-controlled samples, rather than a distinct "training set" in the machine learning sense often seen with image-based AI. The document implies these algorithms are developed and refined through the analytical performance studies (such as LoD, linearity, etc.) using various known concentrations.
9. How the Ground Truth for the Training Set Was Established
- As mentioned above, a traditional "training set" as understood in a machine learning context is not explicitly described. The "ground truth" for establishing the device's operational algorithms and cut-offs would have been determined through laboratory experiments with precisely known concentrations of analytes. This process involves:
- Identifying the Limit of Detection (LoD).
- Evaluating signal response across a range of concentrations.
- Using reference lots and known values during development to establish parameters for calculations and cut-offs.
- "Final cut-off values were further validated as part of the analytical and clinical studies." This suggests an iterative process of development, testing, and validation against a known ground truth (spiked samples for analytical studies, RT-PCR for clinical).
(311 days)
Post-NSE
The Tornier Pyrocarbon Humeral Head associated with the Tornier Flex Stem is indicated for use as a replacement of deficient humeral heads disabled by:
- Non-inflammatory degenerative joint diseases (osteoarthritis, avascular necrosis).
- Traumatic arthritis.
The Tornier Pyrocarbon Humeral Head Shoulder Prosthesis, combined with the Tornier Flex Humeral Stem, are to be used only in patients with an intact or reconstructable rotator cuff and if the native glenoid surface is intact or sufficient, where they are intended to increase mobility, stability, and relieve pain.
Note: The coated humeral stem is intended for cementless use. The noncoated humeral stem is for cemented use only.
The Tornier Pyrocarbon Humeral Head is a prescription use device comprised of a pyrolytic carbon (pyrocarbon) articulating surface and a cobalt chromium alloy double taper neck. The humeral head is provided to the end user pre-assembled to the double taper and is compacted onto 510(k)-cleared compatible humeral stems (K151293) for replacement of deficient humeral heads disabled by non-inflammatory arthritis or traumatic arthritis. The pyrocarbon articulating surface is made of a graphite substrate core coated with a layer of pyrolytic carbon deposited via chemical vapor deposition. The pyrocarbon articulating surface is pressed into the cobalt chromium alloy double taper neck during the manufacturing process, is provided as a singular construct to the end user, and is not intended to be disassembled by the end user. Compatible monoblock humeral stems are available in titanium plasma spray coated or uncoated versions. The humeral stems are designed with a female taper connection to accept the mating male taper connection of the pyrocarbon humeral heads.
The provided text describes the acceptance criteria and a clinical study that proves the Tornier Pyrocarbon Humeral Head device meets these criteria. However, it does not detail a study involving AI or human readers for diagnostic image analysis. Instead, the "study" referenced is a clinical trial evaluating the safety and effectiveness of a medical implant.
Therefore, many of the requested points related to AI model evaluation, such as "number of experts used to establish ground truth," "adjudication method," and "MRMC comparative effectiveness study," are not applicable to this document's content.
I will provide the information that is available in the document regarding the acceptance criteria and the clinical study of the implant.
Acceptance Criteria and Device Performance for Tornier Pyrocarbon Humeral Head (Hemiarthroplasty Implant)
The acceptance criteria for this medical device are primarily defined through bench testing (non-clinical performance) and clinical study endpoints (safety and effectiveness in patients).
1. Table of Acceptance Criteria and Reported Device Performance
A. Bench Testing (Non-Clinical Mechanical Performance)
Acceptance Criteria (Performance Criteria) | Reported Device Performance (Results) |
---|---|
Construct Fatigue Endurance: Required to survive 5 million cycles to pre-specified test parameters without any cracks, breakage, damage, or dissociation. | All tested implants survived 5 million cycles without any cracks, breakage, damage, or dissociation from the stem. |
Taper Disassembly Resistance - Axial Pull-off: No pre-determined acceptance criteria defined; results compared to another humeral head with the same intended use. | The minimum pull-off load for the Tornier Pyrocarbon Humeral Head exceeded the pull-off load of another humeral head. |
Taper Disassembly Resistance - Torque-off: Torsional resistance force between pyrocarbon articulating surface and CoCr double taper neck must exceed anticipated clinically relevant loading conditions including an appropriate factor of safety. | All samples met the pre-determined acceptance criteria for torsional resistance. |
Taper Disassembly Resistance - Lever-off: No pre-determined acceptance criteria defined; results compared to another humeral head with the same intended use. | The minimum lever-off load for the Tornier Pyrocarbon Humeral Head exceeded the lever-off load of another humeral head. |
Fretting and Corrosion Resistance: No pre-determined acceptance criteria defined; visual scoring, ion release analysis, and particulate analysis results compared to another humeral head with the same intended use. | Qualitative damage determined by visual scoring, ion release analysis, and particulate analysis demonstrated comparable performance to another humeral head with the same intended use. |
Humeral Head Burst Testing (Static Compression): A safety factor applied to the mean fatigue load to determine a minimum acceptance criteria for burst. (Safety factor derived from FDA guidance for ceramic hip systems). | All samples met the pre-determined acceptance criteria. |
Humeral Head Subcritical Crack Propagation: A safety factor applied to the mean fatigue load to determine a minimum acceptance criteria for burst. (Safety factor derived from FDA guidance for ceramic hip systems and ISO standards). | All samples met the pre-determined acceptance criteria. |
Third Body Wear: No pre-determined acceptance criteria defined; abrasive wear results compared to another humeral head with the same intended use. | Tornier Pyrocarbon Humeral Head demonstrated lesser surface roughening when exposed to an abrasive condition compared to another humeral head with the same intended use. Wear particulate analysis demonstrated wear particulates were consistent with wear particulates from other arthroplasty devices. |
Range of Motion (ROM): Flexion ≥ 90°, Extension ≥ 45°, Abduction ≥ 90°, Internal Rotation ≥ 90°, External Rotation ≥ 45° (per ASTM F1378 for shoulder prostheses). | All simulated constructs met the pre-determined acceptance criteria. |
Spring Impactor Testing: Performance of the instrument (e.g., spring stiffness and ability to impact the humeral head onto the stem) should not be impacted from repeated use, cleaning, or sterilization. | The spring impactor's performance was not impacted from extended cycles of simulated use, cleaning, or sterilization of the device. |
B. Clinical Performance (Primary Endpoint for Clinical Success at 24 Months)
Acceptance Criteria (Success Definition) | Reported Device Performance (Pyrocarbon Group) | Reported Device Performance (Control Group - for comparison) |
---|---|---|
A patient was considered a success if all of the following were met at 24 months: change in Constant score ≥ 17; no revision surgery; no radiographic evidence of system disassembly or fracture; no system-related serious adverse event. | Composite Clinical Success (CCS): Intent to Treat (ITT) 82.7%; Per Protocol (PP) 87.9%. Component success rates: free of revision 98.1% (154/157); Constant score improved 17+ points (among those with evaluable scores) 84.6% (121/143); free of disassembly or fracture 100.0% (157/157); free of device-related SAE 96.8% (152/157). | Composite Clinical Success (CCS): Intent to Treat (ITT) 66.8%; Per Protocol (PP) 63.1%. Component success rates: free of revision 94.7% (160/169); Constant score improved 17+ points (among those with evaluable scores) 73.1% (49/67); free of disassembly or fracture 100.0% (169/169); free of device-related SAE 94.7% (160/169). |
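As a quick arithmetic check (not from the document itself), the component success rates can be recomputed from the reported counts:

```python
# Recompute the component success rates from the reported numerator/denominator
# counts for the Pyrocarbon and Control groups.
def pct(k: int, n: int) -> float:
    """Percentage k/n rounded to one decimal place."""
    return round(100 * k / n, 1)

pyrocarbon = {
    "free_of_revision": pct(154, 157),         # 98.1
    "constant_score_improved": pct(121, 143),  # 84.6
    "free_of_disassembly": pct(157, 157),      # 100.0
    "free_of_device_sae": pct(152, 157),       # 96.8
}
control = {
    "free_of_revision": pct(160, 169),         # 94.7
    "constant_score_improved": pct(49, 67),    # 73.1
    "free_of_disassembly": pct(169, 169),      # 100.0
    "free_of_device_sae": pct(160, 169),       # 94.7
}
print(pyrocarbon)
print(control)
```

All recomputed values agree with the percentages reported in the table.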
2. Sample Size and Data Provenance for the Clinical Test Set
- Sample Size (Test Set):
- Pyrocarbon (Investigational) Group: 157 subjects enrolled.
- Control Group: 169 subjects selected after Propensity Score (PS) matching from a historical dataset.
- Data Provenance:
- Pyrocarbon Group: Prospective, multi-center, single-arm investigational study (IDE G140202 - Pyrocarbon IDE Study). Data collected from 18 sites within the US.
- Control Group: Retrospective, derived from the Aequalis Post-Market Outcomes Study dataset. The exact country of origin for the Aequalis dataset is not explicitly stated, but given context with US-based clinical trials, it is likely also primarily US data or from similar western healthcare systems.
3. Number of Experts and Qualifications for Ground Truth
This question is not applicable as the document describes a clinical trial for a medical implant, not an AI model requiring human expert labeling of data. The "ground truth" for the clinical study is the patient's actual clinical outcome, measured through direct observation (e.g., revision surgery, radiographic findings) and patient-reported outcomes (e.g., Constant score changes).
4. Adjudication Method for the Test Set
This question is not applicable in the context of diagnostic performance evaluation for an AI model. For the clinical study of the implant:
- The primary endpoint was a composite outcome, objectively defined.
- "Unanticipated Adverse Device Effects" were determined by an independent medical monitor.
- Clinical data collection and evaluation would have followed standard clinical trial protocols, typically involving investigators at sites and a data monitoring committee. Explicit "adjudication" in the sense of resolving disagreements among multiple human readers of image data is not relevant here.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done
No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not done. This type of study is specific to evaluating the diagnostic performance of AI or other tools when used by human readers (e.g., radiologists interpreting images). The study described is a clinical trial comparing a new implant to a historical control.
6. If a Standalone (Algorithm Only) Performance Study was done
No, this question is not applicable as the document is about a physical medical implant, not an AI algorithm. Its "standalone performance" is demonstrated through bench testing (mechanical performance, biocompatibility, sterility) rather than diagnostic accuracy.
7. The Type of Ground Truth Used
For the clinical study:
- Clinical Outcomes Data: This includes hard endpoints such as occurrence of revision surgery, radiographic evidence of system disassembly or fracture, and presence of system-related serious adverse events.
- Patient-Reported Outcomes (PROs): These are quantitative measures of patient experience and function, such as the Constant score, ASES score, SANE, EQ-5D, and VAS pain scale. Improvement in these scores contributes to the definition of "success."
For the bench testing:
- Direct Measurement/Observation: Mechanical properties are empirically measured (e.g., force to cause disassembly, visual inspection for cracks, measured ROM).
- Comparative Data: For some tests without absolute acceptance criteria (e.g., taper disassembly, fretting/corrosion, third-body wear), performance was compared to another cleared humeral head with the same intended use.
8. The Sample Size for the Training Set
This question is not applicable as the document describes a physical medical device (implant) and its clinical evaluation, not an AI model that requires a "training set" of data in the machine learning sense. The "training" for this device would refer to its manufacturing process and quality control, and the "data" is the clinical and bench testing data.
9. How the Ground Truth for the Training Set was Established
This question is not applicable for the reasons stated above.
(578 days)
Post-NSE
Kerasave is indicated for storage of human corneas at 2-8°C for up to 14 days. It is intended for prescription (Rx) use by physicians or highly skilled personnel, such as Eye Bank operators.
Kerasave is a buffered corneal storage medium that provides basic nutrients for cell maintenance during storage of donor corneas at 2-8°C for up to 14 days at physiological pH. The device also includes antimicrobial agents: the antibiotics are dissolved in the solution, while the antifungal agent is formulated as a tablet for stability reasons and constitutes an integral part of the device; it must be dissolved in the medium prior to use.
Here's a summary of the acceptance criteria and the study proving the device meets those criteria, based on the provided text:
Acceptance Criteria and Device Performance for Kerasave
1. Table of Acceptance Criteria and Reported Device Performance
The provided document defines "Special Controls" which serve as the acceptance criteria for the Kerasave device. These are primarily evaluated through non-clinical performance testing.
Acceptance Criterion | Reported Device Performance |
---|---|
Non-clinical performance testing must demonstrate the device performs as intended under anticipated conditions of use. | Corneal Endothelial Cell Layer Preservation: Kerasave was non-inferior to Optisol-GS (FDA-cleared comparator) for Endothelial Cell Density (ECD) and Hexagonality (HEX) at Day 14. An exploratory analysis also showed non-inferiority for Coefficient of Variation (%CV) after removing an outlier. (Tables 4, 5, 7) |
(i) Following performance characteristics of the cornea following storage must be demonstrated: | |
(A) Endothelial cell density | Non-inferior to Optisol-GS at Day 14 (Table 4). |
(B) Endothelial cell morphology (pleomorphism: Hexagonality (HEX) and polymegathism: %CV) | HEX: Non-inferior to Optisol-GS at Day 14 (Table 5). %CV: Failed to statistically clear non-inferiority initially, but exploratory analysis (excluding outlier) showed non-inferiority at Day 14 (Table 6, 7). |
(C) Corneal transparency | No statistical difference in clarity scores between Kerasave and Optisol-GS at Day 14. However, Kerasave showed increased opacity (score 1 to 2) in 7/27 corneas, compared to 2/27 for Optisol-GS. Increased edema and folds/striae were observed in these corneas. (Table 9, and related text) |
(D) Central corneal thickness | CCT changes were similar between Kerasave and Optisol-GS arms, with no statistical difference (Table 8). |
(ii) Antimicrobial activity of the device must be demonstrated at the initial and maximum labeled storage time. | Demonstrated effective inhibition of bacterial growth (streptomycin and gentamicin) and fungal growth (Amphotericin B). Amphotericin B reduced Candida sp. population by 5-log and Fusarium sp. population by ~1-log. (Antimicrobial Evaluation section) |
(iii) Characterization of all preservatives, including antifungals, must include: | |
(A) Characterization of impurities, heavy metal analysis, concentration, and dissolution | Amphotericin B in Kerasave was identified as equivalent to the reference substance. Impurity profile met USP/EP limits. Heavy metals met acceptance ( |
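The non-inferiority comparisons above (ECD, HEX, %CV) turn on whether a confidence bound for the Kerasave-minus-Optisol-GS difference stays inside a prespecified margin. The summary gives neither the margin nor the CI method, so the sketch below uses a one-sided normal-approximation bound with entirely hypothetical numbers; it illustrates the decision rule, not the study's actual analysis.

```python
import math

def noninferiority(mean_test, sd_test, n_test,
                   mean_ref, sd_ref, n_ref,
                   margin, z=1.645):
    """One-sided 95% lower bound on (test - ref) via a normal
    approximation; non-inferior if the bound exceeds -margin.
    The margin and all inputs here are hypothetical."""
    se = math.sqrt(sd_test**2 / n_test + sd_ref**2 / n_ref)
    lower = (mean_test - mean_ref) - z * se
    return lower, lower > -margin

# Hypothetical ECD example (cells/mm^2); not the study's actual data.
lower, ok = noninferiority(2600, 300, 27, 2650, 300, 27, margin=200)
print(round(lower, 1), ok)
```

With these made-up inputs the lower bound (about -184) stays above the -200 margin, so non-inferiority would be declared.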
(4266 days)
Post-NSE
(324 days)
Post-NSE
Natural Cycles is a stand-alone software application, intended for women 18 years and older, to monitor their fertility. Natural Cycles can be used for preventing a pregnancy (contraception) or planning a pregnancy (conception).
Natural Cycles is an over-the-counter web and mobile-based standalone software application that monitors a woman's menstrual cycle using information entered by the user and informs the user about her past, current and future fertility status. The following information is entered into the application by the user:
- daily basal body temperature (BBT) measurements
- information about the user's menstruation cycle (i.e., start date, number of days)
- optional ovulation or pregnancy test results
A proprietary algorithm evaluates the data and returns the user's fertility status.
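The proprietary algorithm itself is not disclosed in the summary. As a toy illustration of the general class of BBT-based ovulation-detection rules it belongs to, the sketch below implements the classic symptothermal "three-over-six" temperature-shift rule; the function, threshold, and data are hypothetical, and this is explicitly not Natural Cycles' method.

```python
# Toy "three-over-six" BBT rule: flag a sustained temperature shift when
# three consecutive readings exceed the maximum of the preceding six by
# at least 0.2 degC. NOT the proprietary Natural Cycles algorithm.

def detect_temp_shift(bbt, window=6, shift=0.2):
    """Return the index of the first of three readings confirming a
    post-ovulatory shift, or None if no shift is found."""
    for i in range(window, len(bbt) - 2):
        baseline = max(bbt[i - window:i])
        if all(t >= baseline + shift for t in bbt[i:i + 3]):
            return i
    return None

# Pre-ovulatory temps ~36.4 degC, post-ovulatory rise to ~36.8 degC
cycle = [36.4, 36.5, 36.4, 36.3, 36.4, 36.5, 36.8, 36.8, 36.9, 36.9]
print(detect_temp_shift(cycle))  # -> 6
```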
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
Acceptance Criteria (Special Control) | Reported Device Performance |
---|---|
1. Clinical performance testing must demonstrate the contraceptive effectiveness of the software in the intended use population. | |
Specificity related to unintended pregnancy rate. | Clinical Study Results (v.3 algorithm, Sept 2017 - Apr 2018): |
- Method Failure Rate: 0.6 per 100 woman-years. This means 0.6 out of 100 women using the application for one year get pregnant due to the application incorrectly displaying a green day when the woman is fertile.
- Perfect Use Pearl Index: 1 per 100 woman-years. This includes method failures and failures of a chosen contraceptive method on red days.
- Typical Use Pearl Index: 6.5 per 100 woman-years (95% CI: 5.9-7.1). This accounts for all possible reasons for pregnancy, including user behavior (e.g., unprotected intercourse on red days, failure of contraceptive method used on red days, and method failure).
Subgroup Analysis (Typical Use PI):
- Recent Hormonal Birth Control use (within 60 days): 8.6 (7.2-10.0)
- No Hormonal Birth Control use (within 12 months): 5.0 (4.3-5.7)
The study enrolled 15,570 women for a total exposure of 7,353 woman-years. 475 pregnancies were observed (584 worst-case). The "Fraction of Days that were Green" was 48.8%. |
| 2. Human factors performance evaluation must be provided to demonstrate that the intended users can self-identify that they are in the intended use population and can correctly use the application, based solely on reading the directions for use for contraception. | A usability study was conducted with (b) (4) users. The study confirmed:
- 98.9% of users were within the intended age range (18-45).
- Analysis of sexual activity on red days: 29% of women had sex on red days. Of these, 49% used condoms, 25% withdrawal, 9% abstention, etc. Only 4% used no protection and took the risk.
- When asked why no protection was used on red days, responses showed that a high percentage understood the directive (e.g., trying to conceive, mistakenly confirming withdrawal, IUD in place, sex not penetrative). Only 2% stated they didn't know red meant fertile.
- Comparison of pregnancy rates between "Prevent Mode" (contraception) users and "Plan Mode" (conception) users demonstrated that users understand the labeling and behave accordingly (low pregnancy rate in Prevent Mode, high in Plan Mode).
- The study was conducted OUS but deemed generalizable to the US population due to similar education levels, user ages, temperature variation, and cycle lengths. |
| 3. Software verification, validation, and hazard analysis must be performed... a. A cybersecurity vulnerability and management process... b. A description of the technical parameters of the software, including the algorithm... | Software Documentation: Major level of concern, with submitted documentation including Software/Firmware Description, Device Hazard Analysis, Software Requirement Specifications, Architecture Design Chart, Software Design Specifications, Traceability, Software Development Environment Description, Revision Level History, and Unresolved Anomalies. - Risk Analysis: Comprehensive, addressing hazards, causes, severity, and control methods.
- Verification & Validation: Acceptable protocols for unit, integration, and system levels provided.
- Cybersecurity: Addressed data confidentiality, integrity, availability, DoS attacks, and malware with controls and evidence of performance.
- Technical Parameters/Algorithm: Full characterization provided, including description of the algorithm that analyzes BBT and menstrual cycle data to detect ovulation and determine fertility status. |
| 4. Labeling must include specific warnings, instructions, and a summary of clinical validation. | Labeling via Instructions for Use manual (downloadable and in-app): - Warnings/Precautions: Included (e.g., no contraceptive method is 100% effective, use another form of contraception on specified days, factors affecting accuracy, cannot protect against STIs).
- Hardware/OS Requirements: Not explicitly detailed in the provided text but implied as part of software documentation.
- Instructions: Identifies and explains how to use, including required user inputs and interpreting outputs.
- Clinical Summary: Provides a summary of the clinical validation study and results, including effectiveness and comparison to other methods. |
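The Pearl Index figures in the table follow directly from the reported counts: pregnancies per 100 woman-years of exposure. A minimal computation using the summary's reported 475 observed (584 worst-case) pregnancies over 7,353 woman-years:

```python
def pearl_index(pregnancies, woman_years):
    """Pearl Index: pregnancies per 100 woman-years of exposure."""
    return 100 * pregnancies / woman_years

# Counts reported in the summary above
typical = pearl_index(475, 7353)      # observed pregnancies
worst_case = pearl_index(584, 7353)   # assuming unknown outcomes were pregnancies
print(round(typical, 1), round(worst_case, 1))  # -> 6.5 7.9
```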
2. Sample size used for the test set and the data provenance
- Sample Size for Test Set: 15,570 women
- Data Provenance:
- Country of Origin: 37 countries outside of the United States (OUS), with the majority from Sweden.
- Retrospective or Prospective: Prospective. Women were followed prospectively from September 1, 2017, to April 30, 2018. A retrospective analysis was also conducted to validate ovulation identification.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
The provided text does not specify the number of experts or their qualifications for establishing ground truth for the test set's clinical outcomes (pregnancies).
4. Adjudication method for the test set
The provided text does not specify an explicit adjudication method for the test set. Pregnancies were detected via "pregnancy tests, via email follow-up or via the algorithm." The "worst case" pregnancy count was calculated by assuming pregnancy in women who left early with unknown status or where data indicated possible pregnancy without confirmation. This suggests a blend of user reporting and algorithmic inference for pregnancy detection, but not a formal expert adjudication process for each case.
5. If a multi reader multi case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
No, a multi-reader multi-case (MRMC) comparative effectiveness study was not explicitly described. The study evaluates the device's standalone effectiveness as a contraceptive, which women use independently. It doesn't assess how human healthcare providers improve their diagnostic or decision-making ability with or without the AI's assistance.
6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done
Yes, a standalone study was done. The entire clinical study described (with the Pearl Index calculations) focuses on the Natural Cycles application's performance as a "stand-alone software application" for contraception. Women interact directly with the app, entering data, and the app provides fertility status. The effectiveness rates (method failure, perfect use, typical use Pearl Index) directly reflect the algorithm's performance in real-world use without a healthcare provider actively interpreting the algorithm's output for the user's daily contraceptive decisions.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)
The ground truth for the primary outcome (contraceptive effectiveness) was outcomes data: specifically, pregnancy detection. This was identified "via pregnancy tests, via email follow-up or via the algorithm."
For the retrospective analysis on ovulation identification: the ground truth was "ovulation day was correctly identified whether using temperature plus LH test results or just temperature alone." This implies comparison to either a gold standard of combined physiological markers (temperature + LH tests) or an internal standard from the algorithm itself for validation.
8. The sample size for the training set
The provided text does not explicitly state the sample size for the training set used to develop the Natural Cycles algorithm. It mentions that Natural Cycles "utilized real-world data to evaluate the effectiveness of the current version of the algorithm (v.3)," referring to the 15,570 women study as the evaluation of the algorithm, rather than its training.
9. How the ground truth for the training set was established
The provided text does not explicitly describe how the ground truth for the training set was established. It states: "Natural Cycles has provided a full characterization of the technical parameters of the software, including a description of the algorithm that analyzes the patient's basal body temperature and menstrual cycle data to detect the day of ovulation and, by accounting for various sources of uncertainty, to determine the fertility status." This implies a biologically-based ground truth related to ovulation and fertile windows, likely established through extensive physiological research and potentially validated against various methods (e.g., hormone levels, ultrasound, BBT, LH tests) over time. However, the specific methodology for the training data ground truth is not detailed.
(59 days)
Post-NSE
The Vitamin D 200M Assay for the Topaz System is intended for in vitro diagnostic use in the quantitative determination of total 25-hydroxyvitamin D (25-OH-D) through the measurement of 25-hydroxyvitamin D3 (25-OH-D3) and 25-hydroxyvitamin D2 (25-OH-D2) in human serum using LC-MS/MS technology by a trained laboratory professional in a clinical laboratory. The Assay is intended for use with the Topaz System. The Vitamin D 200M Assay for the Topaz System is intended to be used in conjunction with other clinical or laboratory data to assist the clinician in making individual patient management decisions in an adult population in the assessment of vitamin D sufficiency.
The Vitamin D 200M Assay for the Topaz System employs LC-MS/MS technology (Topaz System) in conjunction with reagents and sample preparation components to extract, separate by chromatography, detect, and quantify total Vitamin D in human serum.
The Topaz System is a liquid chromatography-tandem mass spectrometry (LC-MS/MS) system intended for use in a clinical laboratory environment to identify inorganic or organic compounds in human specimens by ionizing the compound under investigation and separating the resulting ions by means of an electrical and magnetic field according to their mass.
Each assay kit contains enough material for 1000 tests, and includes reagents, sample extraction and purification components, calibrators, controls and the Vitamin D 200M Assay specific file which contains all necessary parameters to process the assay.
This document describes the process by which a medical device, the Vitamin D 200M Assay for the Topaz System, received automatic Class III designation. The device is intended for quantitative determination of total 25-hydroxyvitamin D (25-OH-D) to assist clinicians in assessing vitamin D sufficiency.
Here's a breakdown of the acceptance criteria and the study proving the device meets them:
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria are derived from the special controls defined in 21 CFR 862.1840 for Total 25-hydroxyvitamin D Mass Spectrometry Test Systems. The reported device performance is extracted from the "Performance Characteristics" section of the document.
Acceptance Criteria (from 21 CFR 862.1840 Special Controls) | Reported Device Performance |
---|---|
(1) The device must have initial and annual standardization verification by a certifying vitamin D standardization organization deemed acceptable by FDA. | Traceability: The assigned 25-hydroxyvitamin D value of the Vitamin D 200M Assay for the Topaz System is certified with the CDC Vitamin D Standardization-Certification Program (VDSCP). (Source: CDC website link provided, last accessed May 18, 2017). |
(2) The 21 CFR 809.10(b) compliant labeling must include detailed descriptions of performance testing conducted to evaluate precision, accuracy, linearity, interference, including the following: | Detailed descriptions are provided in the "L. Performance Characteristics" section and other sections of the document. |
(i) Performance testing of device precision must, at a minimum, use intended sample type with Vitamin D concentrations at medically relevant decision points. At least one sample in the precision studies must be an unmodified patient sample. This testing must evaluate repeatability and reproducibility using a protocol from an FDA-recognized standard. | Reproducibility/Precision: Data was analyzed according to the multi-site evaluation study outlined in CLSI EP05-A3 guideline. The study was performed at three sites. Five serum samples were assayed on five calendar days, with one run per day and five replicates per sample. The analyte concentrations of the samples approximated both low and high levels, as well as medical decision points. The sixth sample was a native patient sample (unmodified). |
Sample | Mean (ng/mL) | Repeatability SD | Repeatability %CV | Within-Laboratory SD | Within-Laboratory %CV | Reproducibility SD | Reproducibility %CV |
---|---|---|---|---|---|---|---|
1 | 14.9 | 0.57 | 3.8% | 1.05 | 7.0% | 1.10 | 7.3% |
2 | 13.7 | 0.65 | 4.7% | 0.80 | 5.8% | 0.80 | 5.8% |
3 | 31.0 | 1.28 | 4.1% | 2.03 | 6.5% | 2.03 | 6.5% |
4 | 67.5 | 3.58 | 5.3% | 3.99 | 5.9% | 5.85 | 8.7% |
5 | 100 | 6.37 | 6.3% | 6.46 | 6.4% | 10.4 | 10.4% |
Native Patient Sample | 28.4 | 1.44 | 5.1% | 2.04 | 7.2% | 2.10 | 7.4% |
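The repeatability SD and %CV columns above come from a CLSI EP05-A3 analysis. The sketch below shows only the simplest piece of that analysis, pooled within-run SD and %CV; the replicate data are hypothetical and the full EP05-A3 variance-component model (day and site effects) is omitted.

```python
import statistics

def repeatability_cv(runs):
    """Pooled within-run SD and %CV across runs of replicate results.
    A simplified, single-factor version of a CLSI EP05-style analysis."""
    # Equal replicate counts per run, so the pooled variance is the
    # mean of the per-run sample variances.
    pooled_var = statistics.mean(statistics.variance(r) for r in runs)
    sd = pooled_var ** 0.5
    grand_mean = statistics.mean(x for r in runs for x in r)
    return sd, 100 * sd / grand_mean

# Hypothetical replicate data (ng/mL), two runs of five replicates
runs = [[14.5, 15.1, 14.8, 15.3, 14.9], [14.7, 15.0, 15.2, 14.6, 15.1]]
sd, cv = repeatability_cv(runs)
print(round(sd, 2), round(cv, 1))
```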
(ii) Performance testing of device accuracy must include a minimum of 115 serum or plasma samples that span the measuring interval of the device and compare results of the new device to results of a reference method or a legally marketed standardized mass spectrometry based vitamin D assay. The results must be described in the 21 CFR 809.10(b)(12) compliant labeling of the device. | Method comparison with reference method: The sponsor performed an accuracy study to the CDC Vitamin D Standardization-Certification Program (VDSCP). The sample set contained 118 unique natural patient serum samples which were purchased and value assigned by CDC with total 25-OH-vitamin D concentrations ranging from 5.6 ng/mL to 133 ng/mL. Twelve specimens were contrived (10.2%). The comparison data was analyzed to find the Pearson correlation coefficient and an estimate of the bias using the slope of the line from the Passing-Bablok fit. |
Passing-Bablok regression results | |
---|---|
n | 118 |
Slope | 1.008 |
Intercept | -0.3949 |
Correlation Coefficient | 0.991 |
Range (ng/mL) | 5.6 – 133 ng/mL |
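Passing-Bablok regression, used for the method comparison above, estimates the slope as a (shifted) median of all pairwise slopes, which makes it robust to outliers. The sketch below shows the simpler Theil-Sen variant with hypothetical data; true Passing-Bablok additionally offsets the median by the count of slopes below -1 and derives rank-based confidence intervals, both omitted here.

```python
from itertools import combinations
from statistics import median

def robust_fit(x, y):
    """Median-of-pairwise-slopes fit (Theil-Sen sketch). Passing-Bablok
    also shifts the median by the number of slopes < -1; that
    correction is omitted in this simplified version."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
              if x2 != x1]
    b = median(slopes)
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return b, a

# Hypothetical comparison data: new method vs. reference, slight bias
x = [5.6, 20.0, 40.0, 60.0, 90.0, 133.0]
y = [5.2, 20.5, 40.8, 60.1, 91.0, 133.5]
slope, intercept = robust_fit(x, y)
print(round(slope, 3), round(intercept, 2))
```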
(iii) Interference from vitamin D analogs and metabolites including vitamin D2, vitamin D3, 1-hydroxyvitamin D2, 1-hydroxyvitamin D3, 3-Epi-25-Hydroxyvitamin D2, 3-Epi-25-Hydroxyvitamin D3, 1,25-Dihydroxyvitamin D2, 1,25-Dihydroxyvitamin D3, 3-Epi-1,25-Dihydroxyvitamin D2, and 3-Epi-1,25-Dihydroxyvitamin D3, 25, 26-Dihydroxyvitamin-D3, 24 (R), 25-dihydroxyvitamin-D3, 23 (R), 25-dihydroxyvitamin-D3 must be described in the 21 CFR 809.10(b)(7) compliant labeling of the device. | Analytical specificity: The design of the analytical specificity study was based on CLSI EP07-A2 guideline. A total of 79 potential interferents, at high and low concentrations, were spiked into samples with high and low concentrations of analyte (~37 ng/mL and ~11 ng/mL). Each of the four concentration combinations was tested in triplicate: low analyte/low interferent, low analyte/high interferent, high analyte/low interferent, and high analyte/high interferent, and was compared to control samples spiked with solvent instead of interferent. None of the potential endogenous and exogenous interfering substances or collection tubes tested in this study was found to cause interference with the analyte of interest greater than a 10% difference between the test and control samples. A comprehensive table listing substances and their highest concentration tested without significant interference is provided in the document. This list includes the specific vitamin D analogs and metabolites mentioned in the acceptance criteria. |
(3) The 21 CFR 809.10(b) compliant labeling must be supported by a reference range study representative of the performance of the device. The study must be conducted using samples collected from apparently healthy male and female adults at least 21 years of age and older from at least 3 distinct climatic regions within the United States of America in different weather seasons. The ethnic, racial, and gender background of this study population must be representative of the US population demographics. | Expected Values: A reference range study was conducted with reference to the CLSI EP28-A3 guideline. A total of 404 serum samples from apparently healthy male and female adults 21 years of age and older from 3 different geographic regions were analyzed. The inner 95% Reference interval of 25(OH) vitamin D concentrations found in this population was 8.6 to 49 ng/mL. (Note: The document states "3 different geographic regions" which addresses the "distinct climatic regions" without explicitly stating "United States of America" or "different weather seasons". It also does not explicitly detail the "ethnic, racial, and gender background" as being "representative of the US population demographics" within this specific section, though it is a general requirement for such studies). |
(4) The results of the device as provided in the 21 CFR 809.10(b) compliant labeling and any test report generated must be reported as only total 25-hydroxyvitamin D. | Indications For Use: The assay is intended for in vitro diagnostic use in the quantitative determination of total 25-hydroxyvitamin D (25-OH-D) through the measurement of 25-hydroxyvitamin D3 (25-OH-D3) and 25-hydroxyvitamin D2 (25-OH-D2). Test Principle: "Total Vitamin D concentrations is calculated as a total of the concentrations of 25-hydroxyvitamin D3 and 25-hydroxyvitamin D2." This directly supports reporting only total 25-hydroxyvitamin D. |
Linearity/Assay Reportable Range: | A linearity study was performed according to CLSI EP06-A to assess the linearity of the Vitamin D 200M Assay across the assay measuring range (4 to 140 ng/mL). The results of the linear regression analyses were: y = 0.9974x + 1.1737 R2 = 0.9984. The linearity study data support the claimed measuring range of 4.0 to 140 ng/mL. |
Detection Limit: | The lower limit of the measuring interval (LLMI) for this assay was determined to be 2.9 ng/mL for total 25-OH-Vitamin D. This was determined as the lowest concentration of analyte that achieved both the bias and precision goals ( |
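The linearity result above (y = 0.9974x + 1.1737, R² = 0.9984) is an ordinary least-squares fit of measured against expected concentration, per CLSI EP06-A. A minimal sketch with a hypothetical dilution series spanning the 4-140 ng/mL measuring range:

```python
from statistics import mean

def ols(x, y):
    """Ordinary least squares: slope, intercept, and R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2
                 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical dilution series across the claimed 4-140 ng/mL range
expected = [4, 20, 40, 70, 100, 140]
measured = [4.8, 20.9, 41.2, 70.5, 101.0, 140.6]
slope, intercept, r2 = ols(expected, measured)
print(round(slope, 4), round(intercept, 2), round(r2, 4))
```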
(103 days)
Post-NSE
Amplichek II is intended for use as an external assayed quality control material to monitor the performance of in vitro laboratory nucleic acid testing procedures for the qualitative detection of Methicillin Resistant Staphylococcus aureus, Methicillin Sensitive Staphylococcus aureus, Clostridium difficile and Vancomycin-resistant Enterococci performed on Cepheid GeneXpert Systems. This product is not intended to replace manufacturer controls provided with the device. This product is only for use with assays and instruments listed in the Representative Results Chart in this labeling.
Amplichek II (Assayed Microbiology Control) is manufactured at three levels, Levels 1, 2 and 3, for each analyte indicated in the package insert. Individual analyte values are listed in the package insert and are specific for the instrument system or method utilized. Each control is prepared in liquid form in a buffer solution with preservatives including 5-chloro-2-methyl-2H-isothiazol-3-one at a concentration of 0.1%, stabilizers and added preparations of purified intact microorganisms grown in microbial culture. Source materials are chemically treated and processed to inactivate infectious agents.
The Amplichek II is an assayed quality control material for clinical microbiology assays. Its purpose is to monitor the performance of in vitro laboratory nucleic acid testing procedures for the qualitative detection of Methicillin-Resistant Staphylococcus aureus (MRSA), Methicillin-Sensitive Staphylococcus aureus (MSSA), Clostridium difficile (Cdiff), and Vancomycin-resistant Enterococci (VRE) on Cepheid GeneXpert Systems.
Here's an analysis of the acceptance criteria and the study that proves the device meets them:
1. Table of Acceptance Criteria and Reported Device Performance
The primary acceptance criteria for the Amplichek II, as demonstrated by the reproducibility studies, is the percent agreement with the expected result for each analyte at various levels (Negative, Level 1, Level 2, Level 3).
Acceptance Criteria (Implicit) | Reported Device Performance (Reproducibility Studies) |
---|---|
High percent agreement (ideally 100%) for expected positive and negative results across different lots, operators, days, and sites. | Study 1 (Single Lab): 100% agreement for all analytes (SPA, SCCmec, mecA, Binary Toxin, Toxin B, TcdC, Van A) across all Amplichek II levels (Negative, Level 1, Level 2, Level 3) for both product lots (Lot #1 and Lot #2), resulting in 100% total agreement by samples. (e.g., 80/80 agreements) |
Consistency in Cycle Threshold (Ct) values and low Coefficient of Variation (CV%) across different lots, operators, days, and sites for positive controls. | Study 2 (Three Labs): Ct Mean and Ct %CV provided for positive controls across all analytes and levels for both Lot #1 and Lot #2. The Ct %CV values are generally low, mostly ranging from 1.4% to 9.1%, indicating good consistency. Note that Negative and some Level 1 controls had "N/A" for Ct values as they were expected to be negative. |
Stability of the material until the expiration date when stored correctly. | An accelerated stability study was performed to establish shelf-life stability claims. Real-time stability studies are ongoing to support product claims. The protocols and acceptance criteria for these studies were reviewed and found acceptable. |
The labeling is sufficient and satisfies regulatory requirements. | The labeling was deemed sufficient and satisfies the requirements of 21 CFR Parts 801 and 809 and the special controls for this device type. |
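Percent agreement, the primary acceptance metric in the table, is simply the fraction of qualitative results matching the expected call for the control material. A minimal sketch with hypothetical calls:

```python
def percent_agreement(results, expected):
    """Fraction of qualitative results matching the expected call, as %."""
    matches = sum(r == expected for r in results)
    return 100 * matches / len(results)

# e.g., 80 of 80 'POS' calls for an analyte at a positive control level
calls = ["POS"] * 80
print(percent_agreement(calls, "POS"))  # -> 100.0
```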
2. Sample Size Used for the Test Set and Data Provenance
The document describes two reproducibility studies acting as the test sets:
- Study 1 (Single Lab):
- Sample Size: For each gene/analyte at each Amplichek II level (Negative, Level 1, Level 2, Level 3), there were 40 tests per lot (2 replicates per run x 2 runs per day x 10 days). With two lots, this totals 80 tests per gene/analyte/level. For instance, for the Cepheid Xpert SA Nasal Complete (MSSA/MRSA), there were 80 tests for SPA, 80 for SCCmec, and 80 for mecA at the Negative level, and similarly for Level 1, Level 2, and Level 3.
- Data Provenance: The data was generated through prospective testing at one laboratory site in the United States (implied by the FDA De Novo application and regulatory context). The study incorporated different operators and different days.
- Study 2 (Three Labs):
- Sample Size: For each gene/analyte at each Amplichek II level, there were 9 tests per lot (different operators x 3 different days). With two lots, this totals 18 tests per gene/analyte/level.
- Data Provenance: The data was generated through prospective testing at three laboratory sites in the United States (implied by the FDA De Novo application and regulatory context). The study incorporated different operators and different days.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
The Amplichek II is a quality control material intended to monitor the performance of in vitro laboratory nucleic acid testing. The "ground truth" for the test results (i.e., whether a control should be positive or negative for a specific analyte and its approximate Ct value range) is inherent to the design and composition of the Amplichek II material itself. It is not established by human experts in the way clinical diagnostic ground truth might be (e.g., through pathologist review or clinical outcomes).
Instead, the expected results (positive/negative and Ct range) are predetermined by Bio-Rad Laboratories based on their characterization of the control material's content and concentration of purified intact microorganisms. The studies then verify that the Cepheid GeneXpert Systems accurately detect these known compositions.
Therefore, no external experts were used to establish the ground truth for the test set; the ground truth is defined by the manufacturer's formulation.
4. Adjudication Method for the Test Set
Since the ground truth is defined by the known composition of the control material, and the output is a qualitative (positive/negative) or semi-quantitative (Ct value) result from a machine, there is no adjudication method involving human experts for the test set. The device performance is a direct comparison of the instrument's output against the expected result of the control material.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not done. This type of study (comparing human readers with and without AI assistance on various cases) is typically applied to diagnostic imaging interpretation or other scenarios where human interpretation is the primary method being evaluated.
The Amplichek II is a quality control material for an automated molecular diagnostic system (Cepheid GeneXpert Systems). Its function is to verify the correct operation of these automated systems, not to assist human readers in interpreting complex cases. Therefore, the concept of human readers improving with or without AI assistance does not apply here.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
The device itself, Amplichek II, is a physical quality control material, not an algorithm. However, the performance of the Cepheid GeneXpert system when using the Amplichek II control material is effectively a standalone performance evaluation of the GeneXpert system's ability to correctly process and interpret the known control substance. The studies assessed the Cepheid GeneXpert system's output (positive/negative and Ct values) without human intervention in the interpretation of the control material results. The Amplichek II simply serves as the standardized input.
Given that the Amplichek II is designed for automated molecular diagnostic systems, the evaluation of its performance (meaning how effectively it allows monitoring of an automated system) inherently relied on the standalone performance of the GeneXpert system.
7. The Type of Ground Truth Used
The ground truth used is based on the known, pre-defined composition and concentration of microorganisms within the Amplichek II control material. Bio-Rad Laboratories engineers and develops the control material to contain specific analytes (e.g., MRSA, MSSA, Cdiff, VRE) at defined levels (Negative, Level 1, Level 2, Level 3).
Therefore, the expected result for each test (e.g., "Positive" for SCCmec at Level 2, "Negative" for Toxin B at Level 1) is a known characteristic of the manufactured control. The studies verify that the Cepheid GeneXpert Systems produce results that agree with these known characteristics.
8. The Sample Size for the Training Set
The document does not explicitly describe a separate training set for the Amplichek II or the Cepheid GeneXpert System in the context of this submission.
The Amplichek II is a quality control material, not an algorithm that requires a training set. The Cepheid GeneXpert Systems are the devices being monitored by the Amplichek II. The GeneXpert systems themselves would have undergone extensive validation and potentially "training" during their development, but this information is not detailed in the Amplichek II's de novo submission.
The studies described are performance evaluations of the control material and how well it helps monitor the performance of the GeneXpert systems.
9. How the Ground Truth for the Training Set Was Established
As noted above, there is no explicit "training set" for the Amplichek II control material itself. The ground truth for the performance evaluation (test set) is the known, manufactured composition of the control material. This is established through the Bio-Rad Laboratories' internal manufacturing processes, formulation, and characterization of the specific microorganisms and their target concentrations within each level of the Amplichek II product.
(115 days)
Post-NSE
NOVA View® Automated Fluorescence Microscope is an automated system consisting of a fluorescence microscope and software that acquires, stores and displays digital images of stained indirect immunofluorescent slides. It is intended as an aid in the detection and classification of certain antibodies by indirect immunofluorescence technology. The device can only be used with cleared or approved in vitro diagnostic assays that are indicated for use with the device. A trained operator must confirm results generated with the device.
NOVA View® is an automated fluorescence microscope. The instrument does not process samples. The instrument acquires digital images of representative areas of indirect immunofluorescent slides.
Hardware components:
- PC and monitor
- Keyboard and mouse
- Microscope
- Microscope control unit
- Slide stage
- LED illumination units
- Handheld LED display unit
- Camera
- Two fans
- Printer (optional)
- UPS (optional) or surge protector
- Handheld barcode scanner (optional)
1. Acceptance Criteria and Reported Device Performance
The device under evaluation is the NOVA View® Automated Fluorescence Microscope and its performance for the NOVA Lite® DAPI ANA Kit, as presented in the accuracy and reproducibility studies. The acceptance criteria are implicitly derived from the comparisons to manual reading (the reference standard) and digital reading by a human operator, with targets often expressed as agreement percentages or consistent classification and pattern recognition. The accuracy study assesses sensitivity and specificity, while the reproducibility study examines agreement within and between sites and operators.
Here's a summary of the reported device performance against these implicit acceptance criteria, focusing on key metrics from the provided text:
Acceptance Criteria Category | Specific Metric | Acceptance Criteria (Implicit from context) | Reported Device Performance |
---|---|---|---|
Accuracy (Detection) | Agreement between NOVA View® and Manual reading for Positive/Negative classification | High agreement (e.g., >80-90%) for positive and negative classifications, indicating that the NOVA View® system's automated calls align well with human expert interpretation. | Site #1: Positive Agreement: 88.3% (82.5-92.7), Negative Agreement: 90.4% (86.4-93.5), Total Agreement: 89.6% (86.5-92.3). |
Site #2: Positive Agreement: 80.5% (74.2-85.9), Negative Agreement: 96.3% (93.4-98.2), Total Agreement: 89.8% (86.7-92.4). | |||
Site #3: Positive Agreement: 86.1% (80.7-90.5), Negative Agreement: 87.8% (83.1-91.6), Total Agreement: 87.0% (83.6-90.0). | |||
Sensitivity for specific disease conditions (e.g., SLE) | The device's performance (NOVA View® and Digital Read) should be comparable to or ideally better than Manual Read. | Site #1: Manual: 72.0% (SLE), 62.9% (CTD+AIL); Digital: 80.0% (SLE), 69.9% (CTD+AIL); NOVA View®: 80.0% (SLE), 69.4% (CTD+AIL). | |
Site #2: Manual: 70.7% (SLE), 65.6% (CTD+AIL); Digital: 73.3% (SLE), 62.9% (CTD+AIL); NOVA View®: 72.0% (SLE), 62.9% (CTD+AIL). | |||
Site #3: Manual: 82.7% (SLE), 71.0% (CTD+AIL); Digital: 81.3% (SLE), 69.4% (CTD+AIL); NOVA View®: 82.7% (SLE), 72.0% (CTD+AIL). | |||
Specificity (excluding healthy subjects) | The device's performance (NOVA View® and Digital Read) should be comparable to or ideally better than Manual Read. | Site #1: Manual: 74.1%; Digital: 72.4%; NOVA View®: 75.3%. | |
Site #2: Manual: 67.2%; Digital: 75.3%; NOVA View®: 77.0%. | |||
Site #3: Manual: 67.2%; Digital: 71.3%; NOVA View®: 69.0%. | |||
Accuracy (Pattern Recognition) | Agreement between NOVA View® and Manual for Pattern Identification | High agreement (e.g., >70% for definitive patterns) for pattern recognition, indicating that the automated system can accurately classify patterns as interpreted by human experts. | Accuracy Study: Site #1: 76.0%; Site #2: 86.3%; Site #3: 72.7%. |
Reproducibility Study: Site #1: 78.9%; Site #2: 83.3%; Site #3: 80.4%. | |||
Precision/Reproducibility | Repeatability (internal consistency) - Positive/Negative Classification | High consistency (e.g., >95%) for samples not near the cut-off. | For samples away from the cut-off, NOVA View® output showed 100% positive or negative classification. For samples near the cut-off, variability was observed. |
Repeatability (internal consistency) - Pattern Consistency | 100% consistency for pattern determination in positive samples. | Pattern determination was consistent for 100% of replicates for positive samples (digital image reading and manual reading). NOVA View® pattern classification was correct for >80% of cases (excluding unrecognized). | |
Within-Site Reproducibility (Operator and Method Agreement) | High total agreement (e.g., >90-95%) between operators and different reading methods within a site. | Site #1: NOVA View® vs Manual: 99.2%; Digital vs Manual: 99.2%; Digital vs NOVA View®: 100.0%. | |
Site #2: NOVA View® vs Manual: 96.7%; Digital vs Manual: 95.8%; Digital vs NOVA View®: 95.8%. | |||
Site #3: NOVA View® vs Manual: 96.7%; Digital vs Manual: 96.7%; Digital vs NOVA View®: 98.3%. | |||
Between-Site Reproducibility (Method Agreement across sites) | High overall agreement (e.g., >90-95%) across different sites for all reading methods. | Manual: Site #1 vs #2: 90.7%; Site #1 vs #3: 85.7%; Site #2 vs #3: 87.3%. | |
Digital: Site #1 vs #2: 92.0%; Site #1 vs #3: 93.1%; Site #2 vs #3: 92.0%. | |||
NOVA View®: Site #1 vs #2: 92.7%; Site #1 vs #3: 89.6%; Site #2 vs #3: 87.9%. | |||
Single Well Titer (SWT) | SWT accuracy compared to Manual and Digital endpoints | High agreement, with estimated titer within ±1 or ±2 dilution steps. | SWT results were within ±2 dilution steps from manual endpoint for 96% (48/50) of samples and from digital endpoint for 98% (49/50) of samples in the initial validation. In the clinical study, it was within ±2 dilution steps for all 20 samples at all three locations. |
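The positive, negative, and total agreement figures reported above all derive from a 2×2 concordance table between a candidate reading method and the reference (manual) reading. As a rough sketch (with illustrative counts, not figures from the study), the computation looks like this:

```python
def agreement_metrics(tp, fn, fp, tn):
    """Percent agreement between a candidate method (e.g., automated reading)
    and a reference method (e.g., manual microscopy) from a 2x2 table.

    tp: both positive; fn: reference positive, candidate negative;
    fp: reference negative, candidate positive; tn: both negative.
    Returns (positive, negative, total) percent agreement.
    """
    ppa = 100.0 * tp / (tp + fn)                    # positive percent agreement
    npa = 100.0 * tn / (tn + fp)                    # negative percent agreement
    total = 100.0 * (tp + tn) / (tp + fn + fp + tn) # total percent agreement
    return ppa, npa, total

# Illustrative counts only (not from the submission):
ppa, npa, total = agreement_metrics(tp=90, fn=10, fp=10, tn=90)
assert (ppa, npa, total) == (90.0, 90.0, 90.0)
```

The confidence intervals quoted alongside the agreement percentages (e.g., 88.3% (82.5–92.7)) would typically be exact binomial or Wilson score intervals on these proportions.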
2. Sample Sizes and Data Provenance
- Test Set (Accuracy Study): 463 clinically characterized samples.
- Data Provenance: The document does not state whether the study was retrospective or prospective. Testing was conducted at three locations: one internal (Site 1) and two external (Sites 2 and 3). The countries of origin of the data are not explicitly stated, but the mention of "U.S. sites" in special controls (2)(ii)(B) suggests primary relevance to the US context.
- Test Set (Reproducibility Study): 120 samples per location; per the provenance description below, the same 120-sample cohort was tested at each of the three sites.
- Data Provenance: Conducted at Inova Diagnostics (internal; Site#1) and two external sites (Sites #2 and #3). The same cohort of samples was processed at each location.
- Test Set (Repeatability Study 1): 13 samples (3 negative, 10 positive), tested in triplicate across 10 runs (30 data points per sample).
- Test Set (Repeatability Study 2): 22 samples (20 borderline/cut-off, 2 high intensity), tested in triplicate across 10 runs (30 data points per sample).
- Test Set (Repeatability Study 3): 8 samples, tested in triplicate or duplicate across 5 runs (10-15 data points per sample).
- SWT Validation Study 1: 50 ANA positive samples.
- SWT Validation Study 2: 20 ANA positive samples tested at each of the three locations (60 test events in total).
3. Number of Experts and Qualifications for Ground Truth
- Accuracy Study, Reproducibility Study, and SWT Validation: For "Manual" reading (the reference standard), "trained human operators" performed the interpretations. For "Digital" reading, "trained human operators" interpreted the software-generated images, blinded to automated results.
- Qualifications: The document consistently refers to "trained operators" and "trained human operators." Specific professional qualifications (e.g., "radiologist with 10 years of experience") are not explicitly provided, but the context implies experienced clinical laboratory personnel proficient in indirect immunofluorescence microscopy.
4. Adjudication Method for the Test Set
- Accuracy Study: Not explicitly stated as a formal adjudication. The comparison was described as a "three-way method comparison of NOVA View® automated software-driven result (NOVA View®) compared to the Digital image reading... by a trained operator who was blinded to the automated result (Digital) and compared to the reference standard of conventional IIF manual microscopy (Manual)." The "Manual" reading served as the key reference. Clinical truth for sensitivity/specificity was determined independently from the three reading methods.
- Reproducibility Study: No formal adjudication process is detailed between the different reading methods or operators. Agreement simply refers to concordance between the specified interpretations. For between-operator agreement, multiple operators at each site interpreted the same digital images.
- SWT Validation Studies: The "manual endpoint" determined by an operator using a traditional microscope served as a primary reference for comparison with the SWT application's endpoint.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Yes, a form of MRMC was conducted for reproducibility. The reproducibility study involved multiple operators (referred to as "Operator #1" and "Operator #2" at each site) interpreting the same digital images and comparing their results.
- Effect Size of Human Readers with AI vs. without AI: The document does not provide a direct "effect size" in terms of how much human readers improve with AI assistance versus without. Instead, it measures the agreement between manual reading (without AI assistance, as the traditional method) and digital image reading (human-in-the-loop with AI-provided images).
- For example, in the Accuracy Study, comparing "Digital vs. Manual" total agreement was: Site #1 (91.4%), Site #2 (92.2%), Site #3 (92.2%). This indicates a high level of agreement between human interpretation of digital images (AI-assisted display) and manual microscopy, suggesting that the digital images are comparable to traditional microscopy.
- Furthermore, "Between Operator Agreement" for digital image reading showed very high agreement (e.g., 99.2% for Site #1 Op #1 vs. Site #1 Op #2), indicating consistency among human readers using the digital system.
6. Standalone (Algorithm Only) Performance
- Yes, a standalone performance was done for various aspects.
- The "NOVA View®" results explicitly refer to "results obtained with the NOVA View® Automated Fluorescence Microscope, such as Light Intensity Units (LIU), positive/negative classification and pattern information without operator interpretation." This represents the algorithm's standalone output before human review.
- The accuracy and reproducibility tables compare "NOVA View®" (standalone algorithm) directly against "Manual" reading (reference standard) and "Digital" reading (human-in-the-loop).
7. Type of Ground Truth Used
- Expert Consensus / Clinical Diagnosis / Reference Standard:
- For the Accuracy Study, the "Manual" reading by trained operators using a traditional fluorescence microscope served as the primary reference standard for comparing the digital and automated methods. Additionally, clinical sensitivity and specificity were determined by comparing the results from all three methods (Manual, Digital, NOVA View®) to a "clinical truth" derived from a "cohort of clinically characterized samples." This clinical truth would likely be established through a combination of clinical criteria and other diagnostic tests, representing a form of expert consensus or outcomes data.
- For the Reproducibility/Repeatability Studies, the "Manual" reading served as the reference standard for evaluating consistency.
- For the SWT Validation, the "manual endpoint" titer determined by trained operators using traditional microscopy was the reference.
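ANA endpoint titers follow a two-fold serial dilution series (e.g., 1:40, 1:80, 1:160, ...), so the "within ±2 dilution steps" acceptance metric used for SWT validation is a comparison on a log2 scale. Assuming a standard two-fold series, the check can be sketched as:

```python
import math

def dilution_step_difference(estimated_titer, reference_titer):
    """Number of two-fold dilution steps between two ANA titers.

    Titers are given as the reciprocal of the dilution (e.g., 80 for 1:80).
    Assumes a standard two-fold serial dilution series (1:40, 1:80, 1:160, ...).
    """
    return round(math.log2(estimated_titer) - math.log2(reference_titer))

def within_tolerance(estimated_titer, reference_titer, steps=2):
    """True if the estimated titer is within +/- `steps` dilution steps
    of the reference (manual or digital) endpoint titer."""
    return abs(dilution_step_difference(estimated_titer, reference_titer)) <= steps

assert dilution_step_difference(160, 80) == 1   # one step above the reference
assert within_tolerance(320, 80)                # 2 steps: within tolerance
assert not within_tolerance(640, 80)            # 3 steps: outside tolerance
```

This is a sketch of the comparison logic only; the document does not describe the SWT algorithm's internal titer-estimation method beyond its use of a predetermined dilution curve.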
8. Sample Size for the Training Set
- The document does not explicitly state the sample size used for training the NOVA View® algorithm. The studies described are performance evaluations (test sets) rather than detailing the algorithm's development or training data.
- For the Single Well Titer (SWT) function, it states that "The NOVA View® SWT function was established [using] 38 ANA positive samples," which could be considered a form of "calibration" or establishment data for that specific algorithm feature, rather than a general training set for the primary classification.
9. How the Ground Truth for the Training Set Was Established
- Since the training set size is not provided, the method for establishing its ground truth is also not detailed in this document.
- For the SWT function's establishment data (38 ANA positive samples), the text implies that the "software application automatically performs the calculations based on the predetermined dilution curve, the LIU produced by the sample, and the pattern of the ANA." This suggests these 38 samples were used to define or fine-tune this "predetermined dilution curve" and pattern-based calculations, likely referencing expert-determined ANA patterns and traditional titration results for those samples.
(591 days)
Post-NSE
The VasoPrep Surgical Marking Pen is intended for use prior to or during the harvesting and preparation of vein grafts used in bypass surgery. The pen is used to demarcate selected sites and orientation of the graft.
The VasoPrep Surgical Marking Pen is a single patient use sterile prescription use only marker intended for use on veins prior to or during Coronary-Assisted Bypass Graft (CABG) surgery. The marker (Figure 1) consists of a pen body, barrel, wick and cap with a wide chisel style applicator tip for delivery of ink to mark internal tissue. The formulation is non-toxic as used and is comprised of an ink material (FD&C Blue Dye #1) compounded into a carrier material (i.e., solvent). The wide chisel tip can deliver either a thin line of ink for precise marks or can be rotated 90° to deliver a wide stripe of ink.
This document describes a De Novo classification request for the VasoPrep Surgical Marking Pen, an internal tissue marker. The studies presented are non-clinical/bench studies, therefore, the concept of AI performance, ground truth establishment by experts, adjudication methods, multi-reader multi-case studies, and a standalone algorithm performance are not applicable.
1. Table of Acceptance Criteria and the Reported Device Performance:
The acceptance criteria and reported device performance are presented across several tables in the provided text. For clarity, they are consolidated and summarized below:
Test Category | Test Name | Purpose | Acceptance Criteria | Reported Device Performance |
---|---|---|---|---|
Biocompatibility | Cytotoxicity (ISO 10993-5) | To test and evaluate the cytotoxicity of the marker and ink formulation. | Non-cytotoxic | Non-cytotoxic |
Sensitization - Maximization Method (guinea pig) (ISO 10993-10) | To test and evaluate the potential for the marker and ink formulation to cause delayed contact sensitization. | No evidence of causing delayed dermal contact sensitization | No evidence of causing delayed dermal contact sensitization | |
USP Intracutaneous Reactivity (ISO 10993-10) | To test and evaluate the potential for the marker and ink formulation to cause local dermal irritant effects. | Nonirritant | Nonirritant | |
USP Systemic Toxicity | To test and evaluate the acute system toxicity of the marker and ink formulation. | No indications of systemic toxicity | No indications of systemic toxicity | |
Hemocompatibility (In vitro hemolysis) | To test and evaluate the hemocompatibility of the marker and ink formulation. | No significant hemolysis | No significant hemolysis | |
Material Mediated Pyrogenicity (ISO 10993-9) | To test and evaluate the pyrogenicity of the marker and ink formulation. | Non-pyrogenic | Non-pyrogenic | |
Shelf Life/Sterility/Packaging | Sterility Testing (ISO 11137) | To test and evaluate the sterility of the marker and ink formulation. | To confirm that the gamma radiation sterilization dose is adequate. Devices must achieve a sterility assurance level (SAL) of at least 10^-6. | Meets Acceptance Criteria |
Packaging Integrity (ASTM F1886/F1886M; ASTM F1929; ASTM F88/88M) | To test and evaluate the marking ability after undergoing accelerated aging and mechanical stress. | The packaging must pass the Visual Seal Examinations; Dye Leak Test; and Peel Test. | Meets Acceptance Criteria | |
Performance Testing – Bench | Internal Tissue Marking Ability | To test and evaluate device ability to mark human saphenous veins (HSV). | The marker shall provide a visible mark on wet or dry tissue that is 1-3 mm wide and up to 90 cm long with a single swipe. The mark shall remain visible on tissue for at least 4 hours. | Meets Acceptance Criteria (The marker provided a visible mark on wet or dry tissue that is 1-3 mm wide and up to 90 cm long with a single swipe. The mark remained visible on tissue for at least 4 hours.) |
Effect of Dye on Human Vein Tissue | To test and evaluate for patency effects caused by the ink on HSV. | Ex-vivo exposure to ink shall have no detrimental effect on the viability, smooth muscle contractility and endothelial-dependent relaxation of human saphenous vein grafts. | Meets Acceptance Criteria (The VasoPrep Surgical Marking Pen met all design requirements for compatibility and functional use.) | |
Effect of Dye on Animal Vein Tissue | Preliminary dosing experiments to test and evaluate demarcation ability of the ink on porcine saphenous veins, at a dose that will have no detrimental effect on further HSV testing. | Application shall demonstrate ink demarcation ability on porcine saphenous veins at an amount that would have no detrimental effect on the viability, smooth muscle contractility and endothelial-dependent relaxation of human saphenous vein grafts. | Meets Acceptance Criteria | |
Product Stability | Functionality (ASTM F1980) | To test and evaluate the marking ability after undergoing real time aging and shipping stress. | The marker shall provide a visible mark on wet or dry tissue that is 1-3 mm wide and up to 90 cm long with a single swipe. The mark shall remain visible on tissue for at least 4 hours. | Meets Acceptance Criteria (sterilized aged marking pens functioned as designed) |
Cytotoxicity (ASTM F1980; ISO 10993-5) | To test and evaluate the cytotoxicity of the marker and ink formulation after undergoing accelerated aging. | The marker and contents shall be non-cytotoxic. | Meets Acceptance Criteria (ink is non-cytotoxic) |
2. Sample size used for the test set and the data provenance (e.g. country of origin of the data, retrospective or prospective):
- Human Saphenous Vein (HSV) segments: Used for the "Internal Tissue Marking Ability" and "Effect of Dye on Human Vein Tissue" studies. The exact number of segments is not specified; "segments" implies a limited number per test. The origin of the human tissue (e.g., country) is not specified. These are ex-vivo experiments performed prospectively on previously harvested tissue.
- Porcine Saphenous Vein: Used for "Effect of Dye on Animal Vein Tissue" and product stability testing. The exact number of veins or samples is not specified. The origin is not specified. These are ex-vivo or animal model experiments.
- Animals for Biocompatibility (Guinea Pig): Used for Sensitization testing. A "guinea pig method" is mentioned, implying a standard number of animals for such a test (typically a small cohort).
- Sterilized and Aged Markers: Four markers were tested for product stability functionality, and two real-time aged markers were used for demarcation effectiveness and fluid volume.
- Toxicological Assessment Literature Studies: Canine, porcine, and murine models were used in literature studies for lifetime toxicity/carcinogenicity of FD&C Blue #1 dye. The number of animals in these literature studies is not specified in this document.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience):
Not applicable. This is a non-clinical/bench study. There is no mention of experts establishing a "ground truth" in the context of human interpretation or diagnostic accuracy. Performance criteria are based on objective physical, chemical, and biological measurements.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:
Not applicable. This is a non-clinical/bench study. Adjudication methods are typically used in studies involving subjective human interpretation, which is not the case here.
5. If a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance:
Not applicable. This is a non-clinical/bench study for a physical device (marking pen), not an AI-powered diagnostic tool.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:
Not applicable. This is a non-clinical/bench study for a physical device, not an algorithm.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
The "ground truth" in these studies is based on objective scientific measurements and established standards for biocompatibility (e.g., non-cytotoxic, non-irritant), sterility (sterility assurance level), packaging integrity (visual, dye leak, peel tests), and performance characteristics (visible mark, no detrimental effect on tissue viability). It's a scientific and technical "ground truth" established through laboratory testing.
8. The sample size for the training set:
Not applicable. There is no AI or machine learning component, therefore no training set.
9. How the ground truth for the training set was established:
Not applicable. There is no AI or machine learning component, therefore no training set or ground truth for it. Any "training" or optimization of the pen's design or ink formulation would have been part of the product development process, not a formal training set as understood in AI/ML contexts.