What is E2EDM?



  • Vision statement

    • Pilot should demonstrate real-time access to, and fusion of, data
      • at operational time scale
      • across multiple disciplines
      • preferably non-traditional variables
      • from multiple source formats
      • from multiple providers in different geographic regions
      • of utility to some user group
    • The pilot should demonstrate the full range of processes including data
      discovery, access, and visualisation
    • It should use pre-existing components where possible and be achievable with
      modest incremental effort

  • High-level functionality

    The following functionality is envisaged:

    1. A user can enter the system, either via a web browser or a dedicated client, and request data of a single or multiple types, from a distributed set of sources, over a single (or possibly multiple) space-time region(s)
    2. Data appropriate to the user’s request will be automatically sourced from wherever they reside, and returned to the requesting machine (which may be the user’s machine, or an intermediate portal providing value-added services)
    3. Tools will exist (again either on a dedicated client, or on an intermediate portal) to fuse the aggregated data in real time to produce a newly created data product of value to the user.
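
    The fusion step in point 3 can be illustrated with a minimal sketch. It assumes that the data returned from each source have already been reduced to simple point observations; the `Observation` fields, the `fuse` function and the one-degree grid are hypothetical choices made for illustration, not part of the E2EDM design.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Observation:
    """One value returned by one data source (hypothetical structure)."""
    source: str      # provider name
    parameter: str   # e.g. "temperature"
    lat: float
    lon: float
    day: str         # ISO 8601 date, e.g. "2004-07-01"
    value: float


def fuse(observations, cell_deg=1.0):
    """Average values per (parameter, grid cell, day) across all sources.

    Assumption: a simple mean on a regular lat/lon grid stands in for
    whatever fusion method the pilot would actually adopt.
    """
    bins = defaultdict(list)
    for ob in observations:
        cell = (math.floor(ob.lat / cell_deg), math.floor(ob.lon / cell_deg))
        bins[(ob.parameter, cell, ob.day)].append(ob)
    return {
        key: {
            "mean": mean(o.value for o in obs),
            "sources": sorted({o.source for o in obs}),
        }
        for key, obs in bins.items()
    }


if __name__ == "__main__":
    returned = [
        Observation("provider_a", "temperature", 10.2, 20.7, "2004-07-01", 20.0),
        Observation("provider_b", "temperature", 10.8, 20.1, "2004-07-01", 22.0),
        Observation("provider_b", "salinity", 10.8, 20.1, "2004-07-01", 35.0),
    ]
    for key, product in fuse(returned).items():
        print(key, product)
```

    Recording the contributing sources alongside each fused value keeps the provenance of the new data product visible to the user.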

  • Conceptual components required

    The pilot E2EDM system requires the following components:

    1. Data sources, with data of potential interest to the system, and the technological means for such data to be accessed

    2. A master list of such sources - which could be generated as a virtual list by querying one or multiple sources, or reside as an independent entity

    3. "System search" metadata for each source, which describes at a high level, in a machine-readable structured way, at least the following:

      • Data class - according to an agreed semantic model yet to be defined (e.g. satellite data, in situ oceanographic data, biological data ...)
      • Parameter list (according to agreed semantic model)
      • Overall space, time footprint (according to ISO metadata standard)
      • Location of, and access protocol for remote requests to connect to the data

    4. For complex data providers, e.g. sources of data on multiple parameters with discontinuous distributions in time and/or space, more detailed search metadata describing the individual space-time footprints of every parameter (e.g. different biological species distributions)

    5. One or more "request brokers" capable of querying first the search metadata, then the relevant data sources, to retrieve data relevant to the user’s request. (Such a "request broker" could either be client software installed on the user’s machine, or a dedicated portal to which the user connects via a standard web browser)

    6. One or more user interfaces which permit the user to formulate an appropriate request

    7. One or more applications capable of generating real-time data products from the data returned as a result of the distributed query

    8. Relevant software and hardware to connect the various components of the system, and
    9. Relevant data and metadata models to ensure that requests can be formulated by the request broker, and responded to, in a consistent manner.

    Commentary: The above list attempts to identify the components which will be required, but makes no final decision as to whether they may exist as real or distributed entities, or where they should reside. For example, the system search metadata described in point (3) has an obvious overlap with the conventional thematic metadata directories (GCMD, EDMED, MEDI, others) and could conceptually reside there in "distributed" form. Alternatively, it could reside in a separate "registry" more directly under the control of the "owners" of the distributed JCOMM system. One could even start with one model and migrate to another over time.
    Similarly, the more detailed search metadata described in (4) could reside in an intermediate registry or cache, or be generated on demand from the data sources in real time, or simply be ignored for the purpose of the pilot project.
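
    To make components (3) and (5) more concrete, the sketch below shows one possible shape for a "system search" metadata record and the first step a request broker would take: selecting the sources whose parameter list and overall footprint match a request. All class names, field names and the example endpoints are hypothetical; the agreed semantic model and the ISO-based footprint encoding are still to be defined.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Footprint:
    """Overall space-time extent (hypothetical simplification of an ISO extent)."""
    west: float
    south: float
    east: float
    north: float
    start: date
    end: date

    def overlaps(self, other):
        # Assumption: boxes do not cross the antimeridian.
        return (
            self.west <= other.east and other.west <= self.east
            and self.south <= other.north and other.south <= self.north
            and self.start <= other.end and other.start <= self.end
        )


@dataclass(frozen=True)
class SearchRecord:
    """The four metadata items listed under component (3)."""
    source_id: str
    data_class: str        # e.g. "in situ oceanographic data"
    parameters: frozenset  # parameter names from the agreed model
    footprint: Footprint
    endpoint: str          # location of the data (placeholder URL)
    protocol: str          # access protocol label


def find_sources(registry, parameter, region):
    """Broker step 1: query the search metadata, not the data themselves."""
    return [
        rec for rec in registry
        if parameter in rec.parameters and rec.footprint.overlaps(region)
    ]


if __name__ == "__main__":
    registry = [
        SearchRecord(
            "src-1", "in situ oceanographic data",
            frozenset({"temperature", "salinity"}),
            Footprint(-30, 30, 10, 60, date(1990, 1, 1), date(2004, 6, 30)),
            "http://example.org/src-1", "hypothetical-protocol",
        ),
        SearchRecord(
            "src-2", "satellite data", frozenset({"ocean colour"}),
            Footprint(-180, -90, 180, 90, date(1997, 9, 1), date(2004, 6, 30)),
            "http://example.org/src-2", "hypothetical-protocol",
        ),
    ]
    request = Footprint(-20, 40, 0, 50, date(2004, 1, 1), date(2004, 3, 31))
    print([rec.source_id for rec in find_sources(registry, "salinity", request)])
```

    The same matching logic applies whether the registry is a dedicated entity or a virtual list assembled from existing metadata directories; only the way `registry` is populated would differ.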

  • Proposed methodology

    1. Agree on a single high-level architecture for the prototype system. Questions to be decided here will include:

      • Will the "master list" of accessible data providers described above in (2) be generated on demand from another source (e.g. GCMD, or a distributed metadata query), or maintained as a separate entity for the purposes of this pilot project?
      • Will the "system search metadata" required for this project reside in such metadata directories along with the thematic metadata, or in a dedicated registry?
      • Will more detailed information on specific space-time footprints by individual parameter, as described above in (4), need to be stored, or is it sufficient to generate such information via real-time requests to the data sources?
      • Will the "request broker" described above in (5) comprise client software installed on users' machines, a single portal (or replicated portals) providing such functions and accessed via a user's web browser, or both?

      Comment: Such decisions should include an analysis of the strengths and weaknesses of existing architectures of similar systems, e.g. DODS/NVODS, OBIS, others.

      Duration: July - December 2004

    2. Identify a limited but challenging suite of parameters to be accessible via the pilot system. For example, such parameters might include one to a few oceanographic in situ measurements (temperature, salinity); satellite imagery (e.g. ocean colour); and marine meteorological and biological observations (e.g. accessible via OBIS or an independent source)
      Estimated duration: 1-3 months
      Target date: 2004

    3. Identify a set of data providers who are agreeable to becoming initial test "JCOMM E2EDM data sources" for the purpose of this pilot project. (Target: a suite of between 5 and 10 data sources, representing a range of themes and geographic locations of potential value to a JCOMM user).
      Estimated duration: 1-3 months
      Target date: 2004

    4. Construct semantic data models for:

      • Required "system search" metadata
      • The syntax for "system search metadata" requests and responses
      • The syntax for data requests and responses
      Comment: Point 4a (the required "system search" metadata) should be considered with reference to existing metadata standards (e.g. ISO 19115), the JCOMM community profile of that standard (under development via a parallel metadata pilot project), and any additional fields specifically required for this project
      Estimated duration: 3 months
      Target date: 2004
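
    As an illustration of what a data request syntax might need to pin down, the following sketch encodes and validates a request message. JSON is used here purely as an assumption for brevity; the project does not specify an encoding, and the field names (`parameters`, `bbox`, `time_range`) and both functions are hypothetical.

```python
import json

# Hypothetical minimal field set for a data request message.
REQUIRED_FIELDS = {"parameters", "bbox", "time_range"}


def encode_request(parameters, bbox, time_range):
    """Serialise a request; bbox is (west, south, east, north) in degrees,
    time_range is a (start, end) pair of ISO 8601 date strings."""
    message = {
        "parameters": sorted(parameters),
        "bbox": list(bbox),
        "time_range": list(time_range),
    }
    return json.dumps(message, sort_keys=True)


def decode_request(text):
    """Parse a request and reject messages that break the agreed syntax."""
    message = json.loads(text)
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    west, south, east, north = message["bbox"]
    if not (-180 <= west <= east <= 180 and -90 <= south <= north <= 90):
        raise ValueError("bbox out of range or inverted")
    start, end = message["time_range"]
    if start > end:  # ISO 8601 dates of equal format compare correctly as strings
        raise ValueError("time_range start is after end")
    return message


if __name__ == "__main__":
    wire = encode_request(
        {"temperature", "salinity"}, (-20, 40, 0, 50), ("2004-01-01", "2004-03-31")
    )
    print(wire)
    print(decode_request(wire)["parameters"])
```

    Validating on receipt means that every request broker and data source applies the same rules, which is the consistency the shared models are meant to guarantee.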

    5. Investigate the viability of using existing software components (e.g. OpenDAP, DiGIR) for interfacing with providers' data systems using the schemas and syntaxes developed under (4) above, and test-install them in at least 2 locations to discover and overcome any problems
      Estimated duration: 3 months
      Target date: September 2004

    6. Develop a specification for the user interface (e.g. web interface, personal client software, automated system), construct and refine prototype
      Estimated duration: 3 months initial, plus ongoing refinement
      Target date: 2004, plus ongoing refinement

    7. Develop a specification for the real-time data fusion and visualisation software, construct and refine prototype, or investigate currently available products - e.g. OceanDataView
      Estimated duration: 3 months initial, plus ongoing refinement
      Target date: 2004, plus ongoing refinement

    8. Connect the various components, assess performance of the system, and refine as necessary
      Estimated duration: 3 months
      Target date: December 2004

    9. Consider ongoing maintenance and management requirements of the system – e.g. automating repetitive functions, refreshing registry content as required, manual oversight of system functionality, method for extending range of either parameters covered or data sources, method for generating and disseminating system metrics
      Estimated duration: 3 months
      Target date: December 2004

    10. Document the system in its "version 1" form, communicate results to JCOMM and/or other interested parties
      Estimated duration: 1-3 months
      Target date: March 2005

  • Deliverables

    1. Working prototype system, demonstrating the achievement of the project objective
    2. Written report to JCOMM
    3. Presentation of results at a significant technical workshop or scientific conference

  • Linkages with other proposed Pilot Projects

    1. Requires the output of the JCOMM metadata model work, for use and/or possible extension in this project
    2. If GCMD etc. are adopted as the repository for the "system search metadata", may require the distributed search mechanism which is also a planned output from this project

©National Oceanographic Data Centre of Russia, RIHMI-WDC
