Unidentified Emissions Liens List
urn:nasa:pds:compil-comet:unid-emis::1.0

Certification status: Not certified. Delta review is sufficient. Overview document is essential.

Data Provider Liens
===================

Tilden Pre-Review

file: collection.xml
--> citation needs to be updated. It looks like a pds3 ds_desc.
--> Is there an editor that should be listed for this collection?

Mike Kelley Review

data/
--> The heliocentric distance is a relevant parameter that is missing from the files.

Documentation:
--> Please define Reference code (tell the user that this should be ignored), Information bulletin, etc.
--> Please describe the method for the literature search and the criteria used to determine if a line is real and unidentified. Include target selection and that molecular bands are also included (e.g., Wyckoff et al. 1999). This is important for understanding the data completeness and level of fidelity. For example, the PDS archived atlas of Cochran et al. 2002 lists 4055 unidentified lines for comet 122P/de Vico between 3830 and 10192, but this data set lists 242 lines between 3862 and 4187. Without documentation a user cannot determine the origin of the discrepancy. The references searched span a wide range of time. Documentation would also help the user understand if the unidentified lines are simply those that were unidentified at the time, or if any currently understood lines have been removed from the lists.
--> Why is SNR N/A for most? Signal-to-noise ratio should always be applicable in this context. If it is unknown, state that instead, and describe in the documentation why some are unknown.
--> Similar to SNR: Why are some resolving powers == N/A? Resolving power should always be applicable to these spectra. If it is unknown, state that instead, and describe in the documentation why they are unknown. Can no estimate be made? R~1000 and ~100,000 are very different and indicate the difference between an unidentified line and an unidentified band.

Lori Feaga Review

file: collection.xml
What is the Data Set Overview saying about format in the collection.xml?
"There are two different formats used for the tables. The first format is to separate all information into two tables; data and observation. This is done when the observation information cannot be paired with specific data. The second format is to have a single table when the information can be paired with the data."
--> Only saw one format for the tables, one table per comet, tables each have one column of wavelengths of unidentified lines.

files: 23pbrorsen_metcalf_1, 109pswift_tuttle_1, 153pikeya_zhang
3 of the label files (comets 23P, 109P, and 153P) have a but do not have a . They have instead. Why not if they have start times and exposure lengths given in the reference papers (23P and 109P in particular)?
--> Add stop times to these 3 labels.

file: 21pgiacobini_zinner_1
and likely incorrect for 21P because reference is dated 1992, but start_date_time is 2001.
--> Correct these values

file: 122pde_vico_1
122P/deVico not only has a paper, but also a PDS3 data set that should/could be referenced
--> Add reference

data/
How are the unidentified lines down selected? 122P/deVico has 4055 unidentified lines. Only 242 lines are included in this collection.
--> If there is a selection rule or criteria, please state in documentation

Chen EN Notes

*.csv
--> Floating point numbers should have a decimal. Some values in this file don't. Section 4B.1.2 of the Standards says:
    f indicates a floating point number in the format [-]ddd.ddd
--> Many values do not have trailing 0's, e.g. 4482.5 for %7.2f. Section 4B.1.2 of the Standards says about the precision:
    For real numbers, it indicates the number of digits to the right of the decimal point.

SBN Liens
=========

Tilden Pre-Review

file: collection.xml
--> Use Capital/Title case for the entry
--> version in the <title> is not required or necessary.
--> perhaps rename <title> to be "Catalog of Unidentified Cometary Emission Lines"

file: collection_compil-comet_data_inventory.csv
--> Suggest renaming this file to inventory.csv. Remember to update the collection.xml file as well.

Mike Kelley Review

files: *.xml
--> Some of the line lists are in the UV and near-IR. Should the collection also include these wavelength ranges in the Primary_Result_Summary? If so, then the individual labels should include their own wavelength_range to facilitate searching.
--> Are PDS4 units case sensitive? angstrom vs. Angstrom
--> Are XML entities OK? & found in the labels (within literature references).

Lori Review

Files: *.xml
--> In <File_Area_Observational> area, the header <object_length> and table <offset> are the same values in 19pborrelly_1.xml and 9 bytes different in 109pswift_tuttle_1.xml. Should be different by 1 in both cases??

File: 153pikeya_zhang
--> For comet cross id consistency, all periodic comets are "_1" except for 153P. Add the "_1" to 153P files

File: 19pborrelly_1.xml
--> In reference_text "Poceedings" should be "Proceedings"

File: 153pikeya_zhang.xml
--> In reference_text "De Santes" should be "de Sanctis"

Chen EN Notes

Files: *.xml
--> In the XML Prolog and Root Tag (the top of a label), use https instead of http for URLs (not namespaces) of NASA sites, e.g. change
    <?xml-model href="http://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1900.sch"
    to
    <?xml-model href="https://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1900.sch"
    and
    xsi:schemaLocation=" http://pds.nasa.gov/pds4/pds/v1 http://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1900.xsd
    to
    xsi:schemaLocation=" http://pds.nasa.gov/pds4/pds/v1 https://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1900.xsd
--> EN recommends (SBN has always disagreed) explicitly declaring xmlns:pds="http://pds.nasa.gov/pds4/pds/v1"
--> EN recommends including context and schema collections for a fully self-describing bundle.
--> The offset for Table_Delimited is low by 1, e.g. for 122pde_vico_1.xml,
    <offset unit="byte">869</offset>
    should be
    <offset unit="byte">870</offset>
    The header's object_length should be the same, e.g. 870 for 122pde_vico_1.xml.
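
Illustrative check for the offset lien above (a minimal sketch, not an EN or SBN tool). It assumes the header is the single first line of the .csv file, that its length includes the line terminator (LF or CR LF), and that byte offsets are counted from 0, so the header's object_length and the table's offset should be the same number. The function names header_length and check_label are hypothetical, and the label values are passed in by hand rather than parsed from the XML.

```python
from typing import List


def header_length(data: bytes) -> int:
    """Byte length of the first line, including its line terminator.

    Assumption: the header occupies exactly one line of the file.
    """
    end = data.find(b"\n")
    if end == -1:
        raise ValueError("no line terminator found; cannot locate end of header")
    # For CR LF files the LF is the last terminator byte, so end + 1 covers both.
    return end + 1


def check_label(data: bytes, header_object_length: int, table_offset: int) -> List[str]:
    """Compare label values (typed in by hand) against the data file bytes."""
    expected = header_length(data)
    problems = []
    if header_object_length != expected:
        problems.append(
            f"header object_length is {header_object_length}, expected {expected}"
        )
    if table_offset != expected:
        problems.append(f"table offset is {table_offset}, expected {expected}")
    return problems


if __name__ == "__main__":
    # Made-up sample bytes, not taken from any file in the collection.
    sample = b"wavelength\r\n4482.50\r\n"
    for message in check_label(sample, header_object_length=12, table_offset=11):
        print(message)
```

For a real file, the bytes would come from something like open(path, "rb").read(), with the two integers copied from the corresponding label.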