This Firpo_readme20230118.txt file was generated on 20230118 by Matthew A. Firpo, and edited by Brandon Patterson

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset

   Data and Code for: Firpo et al., A multi-analyte serum biomarker panel for early detection of pancreatic adenocarcinoma.

2. Author Information

   Principal Investigator Contact Information
      Name: Matthew A. Firpo
      Institution: University of Utah
      Address: 30 N 1900 E, Salt Lake City, UT 84132
      Email: matt.firpo@hsc.utah.edu

   Associate or Co-investigator Contact Information
      Name:
      Institution:
      Address:
      Email:

   Alternate Contact Information
      Name:
      Institution:
      Address:
      Email:

3. Date of data collection (single date, range, approximate date): 20050101-20190131

4. Geographic location of data collection (where was data collected?): Salt Lake City, Utah

5. Information about funding sources that supported the collection of the data:

   Supported by research grants from the National Institutes of Health (CA115225, CA151650, CA155586, CA196403, CA200468 and P30CA042014 to the Huntsman Cancer Institute for support of core facilities), grants from the Huntsman Cancer Institute Gastrointestinal Cancer Research Program and through support from the Huntsman Cancer Foundation.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: N/A

2. Links to publications that cite or use the data:

   medRxiv 2022.03.03.22271867; doi: https://doi.org/10.1101/2022.03.03.22271867
   [JCO CCI publication, when available]

3. Links to other publicly accessible locations of the data: N/A

4. Links/relationships to ancillary data sets: N/A

5. Was data derived from another source? If yes, list source(s): No

6. Recommended citation for the data: [JCO CCI publication, when available]

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: Firpo et al. JCO CCI Data 1_6.txt
      Short description: Final raw data used for analysis in text.
   B. Filename: Firpo et al. JCO CCI R code.txt
      Short description: R code used in text.

   C. Filename:
      Short description:

2. Relationship between files:

   The R code gives the methodological steps of the machine learning analysis reported in the text. The data file can be imported into R by replacing "FinalRawData1_6.txt" in the code with "Firpo et al. JCO CCI Data 1_6.txt". Running the code will generate an analysis similar to the one reported in the text.

3. Additional related data collected that was not included in the current data package: N/A

4. Are there multiple versions of the dataset? No

   If yes, list versions:
      Name of file that was updated:
         i. Why was the file updated?
         ii. When was the file updated?
      Name of file that was updated:
         i. Why was the file updated?
         ii. When was the file updated?

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

   Specific methodology detailed in medRxiv 2022.03.03.22271867; doi: https://doi.org/10.1101/2022.03.03.22271867 and [JCO CCI publication, when available]

2. Methods for processing the data: described in text

3. Instrument- or software-specific information needed to interpret the data:

   Analysis originally performed using R version 3.6.1 (R Core Team, 2019. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria).

4. Standards and calibration information, if appropriate: described in text

5. Environmental/experimental conditions: described in text

6. Describe any quality-assurance procedures performed on the data: described in text

7. People involved with sample collection, processing, analysis and/or submission:

   Matthew A. Firpo, Kenneth M. Boucher, Josh Bleicher, Gayatri D. Khanderao,
   Alessandra Rosati, Katherine E. Poruk, Sama Kamal, Liberato Marzullo,
   Margot De Marco, Antonia Falco, Armando Genovese, Jessica M. Adler,
   Vincenzo De Laurenzi, Douglas G. Adler, Kajsa E. Affolter,
   Ignacio Garrido-Laguna, Courtney L.
   Scaife, M. Caterina Turco, Sean J. Mulvihill

----------------------------------------------------------------
DATA-SPECIFIC INFORMATION FOR: Firpo et al. JCO CCI Data 1_6.txt
----------------------------------------------------------------

1. Number of variables: 40

2. Number of cases/rows: 1023

3. Variable List

   Variable    Description                                                             Units
   --------    -----------                                                             -----
   Split       indicates whether data was used for training, test, or validation
   Set         indicates in what set the sample assay was performed
   Temp_ID     deidentified sample identifier
   IT_ID       deidentified sample identifier
   Dx1         granular diagnostic classification
   Dx          binary diagnostic classification
   Stage       disease stage for cases
   Age         subject age at sample collection
   Gender      subject sex
   ALCAM       analyte, ELISA level                                                    ng/ul
   ANG         analyte, ELISA level                                                    ng/ul
   AXL         analyte, ELISA level                                                    ng/ul
   BAG3        analyte, ELISA level                                                    ng/ul
   BSG         analyte, ELISA level                                                    ng/ul
   CA19.9      analyte, ELISA level                                                    U/ul
   CEA         analyte, ELISA level                                                    ng/ul
   CEACAM1     analyte, ELISA level                                                    ng/ul
   COL18A1     analyte, ELISA level                                                    ng/ul
   EPCAM       analyte, ELISA level                                                    ng/ul
   HA          analyte, ELISA level                                                    ng/ul
   HP          analyte, ELISA level                                                    ug/ul
   ICAM1       analyte, ELISA level                                                    ng/ul
   IGFBP2      analyte, ELISA level                                                    ng/ul
   IGFBP4      analyte, ELISA level                                                    ng/ul
   LCN2        analyte, ELISA level                                                    ng/ul
   LRG1        analyte, ELISA level                                                    ng/ul
   MMP2        analyte, ELISA level                                                    ng/ul
   MMP7        analyte, ELISA level                                                    ng/ul
   MMP9        analyte, ELISA level                                                    ng/ul
   MSLN        analyte, ELISA level                                                    ng/ul
   PARK7       analyte, ELISA level                                                    ng/ul
   PPBP        analyte, ELISA level                                                    ng/ul
   PRG4        analyte, ELISA level                                                    ug/ul
   SPARCL1     analyte, ELISA level                                                    ng/ul
   OPN         analyte, ELISA level                                                    ng/ul
   TGFBI       analyte, ELISA level                                                    ng/ul
   THBS1       analyte, ELISA level                                                    ng/ul
   TIMP1       analyte, ELISA level                                                    ng/ul
   TNFRSF1A    analyte, ELISA level                                                    ng/ul
   VEGFC       analyte, ELISA level                                                    ng/ul

4. Missing data codes:

   Code/symbol    Definition
   Code/symbol    Definition

5. Specialized formats or other abbreviations used: N/A
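
The documented layout (40 variables, 1023 rows) can be checked after download. The following is a minimal Python sketch and is not part of the deposited R code. It assumes the data file is tab-delimited text with a single header row whose column names match the variable list exactly; neither assumption is confirmed by this README. The names EXPECTED_VARIABLES, EXPECTED_ROWS, and check_table are hypothetical helpers introduced only for illustration.

```python
import csv
import io

# Variable names copied from the variable list in this README.
EXPECTED_VARIABLES = [
    "Split", "Set", "Temp_ID", "IT_ID", "Dx1", "Dx", "Stage", "Age", "Gender",
    "ALCAM", "ANG", "AXL", "BAG3", "BSG", "CA19.9", "CEA", "CEACAM1",
    "COL18A1", "EPCAM", "HA", "HP", "ICAM1", "IGFBP2", "IGFBP4", "LCN2",
    "LRG1", "MMP2", "MMP7", "MMP9", "MSLN", "PARK7", "PPBP", "PRG4",
    "SPARCL1", "OPN", "TGFBI", "THBS1", "TIMP1", "TNFRSF1A", "VEGFC",
]

# "Number of cases/rows" as stated in this README.
EXPECTED_ROWS = 1023


def check_table(handle, delimiter="\t"):
    """Compare a delimited text table against the documented layout.

    Assumes the first row is a header (an assumption, not stated in the
    README). Returns a tuple (missing, unexpected, n_rows):
      missing    - documented variables absent from the header
      unexpected - header columns not in the documented list
      n_rows     - number of non-empty data rows after the header
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    n_rows = sum(1 for row in reader if row)
    missing = [name for name in EXPECTED_VARIABLES if name not in header]
    unexpected = [name for name in header if name not in EXPECTED_VARIABLES]
    return missing, unexpected, n_rows


# Small in-memory demonstration with a synthetic one-row table.
demo = io.StringIO(
    "\t".join(EXPECTED_VARIABLES) + "\n" + "\t".join(["0"] * 40) + "\n"
)
demo_missing, demo_unexpected, demo_rows = check_table(demo)
```

To check the real file, open "Firpo et al. JCO CCI Data 1_6.txt" with open(path, newline="") and pass the handle to check_table; an empty missing list and a row count equal to EXPECTED_ROWS would agree with the description above. If the file uses a different delimiter or spells a column differently (for example CA19.9), adjust the delimiter argument or the name list accordingly.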