6. OBIS DATA QUALITY ASSURANCE

[DRAFT Working document 17 November 2012]

OBIS NODES MANUAL FOR IMPROVED QUALITY MANAGEMENT, STANDARDS AND BEST PRACTICES (TAXONOMIC AND GEOGRAPHICAL QC/DATA VALIDATION PROCEDURES)

The preparation of a node handbook describing iOBIS/Node operations was listed as a high priority at OBIS-SG-1. As a result, a document was prepared by Leen Vandepitte, Francisco Hernandez (EurOBIS), Mary Kennedy (OBIS Canada) and Bruno Danis (AntOBIS). This draft was circulated to all node managers for comments in April 2012 and was posted online as part of the background documents for the Second OBIS Technical meeting in June 2012:
http://www.iode.org/index.php?option=com_oe&task=viewDocumentRecord&docID=9174

The aim is to make the final document dynamic and to promote its regular revision. This manual should be seen as a guide for the OBIS community, created by the OBIS community. The intent is that the document results from the combined effort of all node managers, so that there is a shared understanding of how data for OBIS should be handled and each node follows the same procedures and guidelines before submitting data to OBIS, ensuring the best possible data quality. Suggestions for additions and changes to this document can be made at any time.

The document is meant to be a starting point towards a standardized methodology for nodes to process submitted datasets. Its objective is to provide an overview of the best practices expected from the different node managers and to reach a common understanding of the level of data quality that the OBIS community upholds. This manual describes a set of norms that should be followed as much as possible.
Feedback from the SG on the draft version will be incorporated before the manual is officially released to the general public.

An overview of the manual created by the Data Quality Working Group was presented by Leen Vandepitte. Each of the manual's five parts was described, SG members were encouraged to ask questions at the end of each part, and a list of actions for each part was compiled.

Part 1 of the document deals with the metadata associated with a dataset and stresses that the importance of good metadata should not be underestimated. Metadata can capture information that does not fit within the OBIS Schema but is nonetheless important to the users of the data. Metadata should include a description of the dataset with links to associated documentation that will facilitate proper interpretation of the data. It should also help online users decide whether they wish to download the dataset and determine its fitness for use for different purposes. Terms of use agreements promote proper citation of datasets. The only means OBIS has of providing its users with this information is the metadata, so it is imperative that these pages contain the basic required content. It should also be noted that it is not just the data provider that should be cited: credit should also be attributed to the various organizations in the data flow path, especially if the source dataset has been enhanced at any of the stages along the way to its final home in OBIS or GBIF.

The presentation provided a few statistics on the metadata content currently in OBIS, pointing out that many existing datasets are not associated with any citation, contact or abstract information. The document describes a number of best practices for dealing with metadata, including how metadata can be completed when the data custodian provides very little information. The OBIS database contains tables that house the metadata information.
These tables should be adapted to an adopted standard. Currently a variety of systems are being used by different nodes, and the introduction of the IPT will add yet another method. Perhaps this topic will be covered by another working group?

Part 2 of the document focuses on the data itself. Several fields of the OBIS Schema are discussed in detail: what kind of information belongs in these fields, and how the data should be formatted to arrive at a standardized dataset before it is submitted to OBIS. Different scenarios are dealt with, all based on actual situations that node managers have already encountered. For each field, a number of best practices are formulated to help node managers deal accurately with the submitted data.

OBIS has three required fields: the scientific name and the geographic position (latitude and longitude). One of the QC procedures recommended in the manual is to map the scientific names in a dataset to a standard such as the World Register of Marine Species (WoRMS). Various tools to facilitate this process were described. There are many issues related to datasets that include records whose scientific names cannot easily be mapped to an accepted name. Similar issues exist with geographic positions. Are there regional standards or gazetteers for place names? How should we deal with the precision of a sampling location?

There is a need to create controlled vocabularies for several OBIS Schema fields, such as life history stage and gender (sex). The first step will be to review existing content and then propose a set of terms and associated definitions. This vocabulary development should be carried out in collaboration with GE-BICH.

Feedback from the group on the subject of data was solicited. What improvements or revisions should be made to the manual? Have other nodes come across unusual or problematic cases that they wish to discuss?

Part 3 of the document lists a number of possible statistics that can be run on the node data.
These are examples from EurOBIS and AntOBIS/SCAR-MarBIN, but they could be extended to all nodes, making possible, for example, uniform reporting to OBIS and the scientific community. Running these statistics at regular intervals will help in visualizing the progress of each node and can also help in identifying gaps. Several types of statistics currently cannot be generated. Examples include:

  - It is not possible to determine the percentage of data associated with university data providers as opposed to government data providers. However, if an institution code table were created and each institute were assigned to a group, this type of statistic could easily be generated.
  - It is possible to determine the percentage composition of content according to the classification hierarchy (kingdom/phylum/etc.), but not the percentage of groundfish, plankton or other such groups.

The SG was asked to provide examples of types of statistics that would be useful to their communities.

Part 4 of the document lists a number of possible improvements and changes that could be made to the OBIS Schema so that it can better deal with certain data and information. These are merely suggestions based on the experiences of node managers and should be discussed with the wider OBIS community. The intention is to update this section regularly, as node managers come up with further issues or as issues are tackled and solutions are formulated.

Part 5 of the document lists some problematic situations in which it is not clear which procedure should be followed to add the data to the OBIS Schema. Each of the items listed here should be open for discussion with the OBIS community in order to reach a consensus on how to deal with it. As with Part 4, this will be a very dynamic part: more issues will be added, and after discussion and agreement most of these issues will fit under Part 1 or Part 2.
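
The provider-group statistic mentioned under Part 3 could be produced along the following lines once an institution code table exists. This is only a minimal sketch: the table contents, the group labels and the `institution_code` field name are hypothetical and are not taken from the manual or the OBIS Schema.

```python
from collections import Counter

# Hypothetical institution code table: institution code -> provider group.
INSTITUTION_GROUPS = {
    "UNIV-A": "university",
    "UNIV-B": "university",
    "GOV-A": "government",
}


def percentage_by_group(records, groups):
    """Return the percentage of records per provider group.

    records: iterable of dicts with an 'institution_code' key (assumed name).
    Codes that are missing from the table are counted as 'unassigned'.
    """
    counts = Counter(
        groups.get(record["institution_code"], "unassigned") for record in records
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: 100.0 * n / total for group, n in counts.items()}
```

Counting unmatched codes under a separate "unassigned" label, rather than dropping them, keeps gaps in the institution table visible in the resulting report.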
At the end of the presentation, the SG members were asked to provide comments, and a list of action items related to the node manual was compiled.
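
As an illustration of the Part 2 discussion of the three required fields, a node might screen records before submission roughly as follows. This is a sketch under stated assumptions, not a procedure from the manual: the field names `scientific_name`, `latitude` and `longitude` are hypothetical, and coordinates are assumed to be in decimal degrees.

```python
def check_required_fields(record):
    """Return a list of problems found in one record (empty list if none)."""
    problems = []

    # Field names are assumptions made for this sketch.
    name = (record.get("scientific_name") or "").strip()
    if not name:
        problems.append("missing scientific name")

    # Decimal degrees: latitude within [-90, 90], longitude within [-180, 180].
    for field, limit in (("latitude", 90.0), ("longitude", 180.0)):
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"missing {field}")
            continue
        try:
            number = float(value)
        except (TypeError, ValueError):
            problems.append(f"{field} is not numeric")
            continue
        if not -limit <= number <= limit:
            problems.append(f"{field} out of range")

    return problems
```

A check like this only confirms that the required values are present and plausible; matching names against a standard such as WoRMS and assessing positional precision remain separate steps.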