

Tuesday, January 7
 

11:00am EST

Analytic Centers for Air Quality
The Analytic Center Framework (ACF) is a concept to support scientific investigations with a harmonized collection of data from a wide range of sources and vantage points, tools, and computational resources. Four recent NASA AIST competitive awards are focused on either ACFs or components which could feed into AQ ACFs. Previous projects have developed tools and improved the accessibility and usability of data for air quality analysis, and have tried to address issues related to inconsistent metadata, uncertainty quantification, interoperability among tools and computing resources, and visualization to aid scientific investigation or applications. The format for this meeting will be a series of brief presentations by invited speakers followed by a discussion, generally following the panel model. How to Prepare for this Session: A link to a set of pre-read materials will be provided.

View Recording: https://youtu.be/fy4eoOfSbpo.

Takeaways
  • Is there enough interest to start an Air Quality cluster? Yes!
  • Technologists and scientists should both be involved in the cluster to ensure usability through stakeholder engagement


Speakers
ML

Mike Little

ESTO, NASA
Computational Technology to support scientific investigations


Tuesday January 7, 2020 11:00am - 12:30pm EST
Glen Echo
  Glen Echo, Working Session

11:00am EST

Interoperability of geospatial data with STAC
SpatioTemporal Asset Catalog (STAC) is an emerging specification of a common metadata model for geospatial data, and a way to make data catalogs indexable and searchable. We have already seen STAC adopted for both public data and commercial data. Catalogs exist for several AWS Public Datasets, Landsat Collection 2 data will be published along with STAC metadata, and communities like Pangeo are using STAC to organize data repositories in a scalable way. Commercial companies like Planet and Digital Globe are starting to publish STAC metadata for some of their catalogs. Session talks may cover overviews of the STAC specification, software projects utilizing STAC, and use cases of STAC in organizations. How to Prepare for this Session: See https://stacspec.org/.
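
As a concrete illustration of the common metadata model, here is a minimal STAC Item sketched as plain JSON. The member names follow the STAC spec, but the scene id, geometry, and asset URL are invented placeholders, not real catalog entries.

```python
# Minimal sketch of a STAC Item as a plain JSON document.
# Field names follow the STAC spec; id, bbox, and asset href are made up.
import json
from datetime import datetime, timezone

item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene-20200107",          # hypothetical scene id
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[-77.2, 38.9], [-77.0, 38.9],
                         [-77.0, 39.1], [-77.2, 39.1], [-77.2, 38.9]]],
    },
    "bbox": [-77.2, 38.9, -77.0, 39.1],
    "properties": {
        "datetime": datetime(2020, 1, 7, tzinfo=timezone.utc).isoformat(),
    },
    "assets": {
        "thumbnail": {"href": "https://example.com/thumb.png",  # placeholder
                      "type": "image/png"},
    },
    "links": [],
}

# Every STAC Item carries these members, which is what lets crawlers
# index heterogeneous catalogs uniformly.
REQUIRED = {"type", "stac_version", "id", "geometry", "bbox",
            "properties", "assets", "links"}
assert REQUIRED <= item.keys()
print(json.dumps(item, indent=2))
```

Because Items are ordinary JSON, a catalog is just a tree of such documents linked by `links`, which is what makes static hosting (e.g., on object storage) workable.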

View Recording: https://youtu.be/BdZbJLQSNFE.

Takeaways


Speakers
avatar for Dan Pilone

Dan Pilone

Chief Technologist, Element 84
Dan Pilone is CEO/CTO of Element 84 and oversees the architecture, design, and development of Element 84's projects including supporting NASA, the USGS, Stanford University School of Medicine, and commercial clients. He has supported NASA's Earth Observing System for nearly 13 years... Read More →
avatar for Aimee Barciauskas

Aimee Barciauskas

Data engineer, Development Seed
MH

Matthew Hanson

Element 84
STAC


Tuesday January 7, 2020 11:00am - 12:30pm EST
White Flint
  White Flint, Breakout

2:00pm EST

COPDESS: Facilitating a Fair Publishing Workflow Ecosystem
COPDESS, the Coalition for Publishing Data in the Earth and Space Sciences (https://copdess.org/), was established in October 2014 as a platform for Earth and Space Science publishers and data repositories to jointly define, implement, and promote common policies and procedures for the publication and citation of data and other research results (e.g., samples, software, etc.) across Earth Science journals. In late 2018, COPDESS became a cluster of ESIP to give the initiative the needed sustainability to support a long-term FAIR publishing workflow ecosystem and be a springboard to pursue future enhancements of it.

In 2017, with funding from the Arnold Foundation, the ‘Enabling FAIR Data Project’ (https://copdess.org/enabling-fair-data-project/) moved mountains towards implementing the policies and standards that connect researchers, publishers, and data repositories in their desire to accelerate scientific discovery through open and FAIR data. Implementation of the new FAIR policies has advanced rapidly across Earth, Space, and Environmental journals, but supporting infrastructure, guidelines, and training for researchers, publishers, and data repositories has yet to catch up. The primary challenges are:
  • Repositories struggle to keep up with the demands of researchers, who want to be able to instantly deposit data and obtain a DOI, without considering the data quality/data ingest requirements and review procedures of individual repositories - producing a situation where data publication is inconsistent in quality and content.
  • Many publishers who have signed the Commitment Statement for FAIR Data (https://copdess.org/enabling-fair-data-project/commitment-statement-in-the-earth-space-and-environmental-sciences/) agree with it at a high, conceptual level. However, many journal editors and reviewers lack clarity on how to validate that datasets, which underpin scholarly publications, conform with the Commitment Statement.
  • Researchers experience confusion, and in some cases barriers to publication of their papers whilst they try and meet the requirements of the commitment statement. Clarity of requirements, timelines, and criteria for selecting repositories are needed to minimize the barriers to the joint publication of papers and associated data.

Funders have a role to play, in that they need to allow for the time and resources required to curate data and ensure compliance, particularly with respect to the assignment of valid DOIs. Funders can also begin to reward those researchers who do take the effort to properly manage and make their data available, in a similar way to how they reward scholarly publications and citation of those publications.

The goal of this session is to start a conversation on developing an integrated publishing workflow ecosystem that seamlessly integrates researchers, repositories, publishers, and funders. Perspectives from all viewpoints will be presented.

Notes document: https://docs.google.com/document/d/12M0F6mcUZSn2GdBN-Id__smXhYxbLzKDrAViPAgnH6w/edit?usp=sharing

Presentations:

View Recording: https://youtu.be/x6a1QRNbifQ

Takeaways
  • COPDESS has moved to ESIP as a cluster to ensure the sustainability of the project to address the publishing & citation of research data



Speakers
avatar for Karl Benedict

Karl Benedict

Director of Research Data Services & Information Technology, University of New Mexico
Since 1986 I have had parallel careers in Information Technology, Data Management and Analysis, and Archaeology. Since 1993 when I arrived at UNM I have worked as a Graduate Student in Anthropology, Research Scientist, Research Faculty, Applied Research Center Director, and currently... Read More →
avatar for Kerstin Lehnert

Kerstin Lehnert

President, IGSN e.V.
Kerstin Lehnert is Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of EarthChem, the System for Earth Sample Registration, and the Astromaterials Data System. Kerstin holds a Ph.D in Petrology from the University of Freiburg in... Read More →
avatar for Lesley Wyborn

Lesley Wyborn

Adjunct Fellow, Australian National University


Tuesday January 7, 2020 2:00pm - 3:30pm EST
Salon A-C
  Salon A-C, Breakout

2:00pm EST

Current Data that are available on the Cloud
NASA, NOAA and USGS are in the process of moving data onto the cloud. While they have discussed what types of services are available and future plans for what data can be found, it is not completely clear what datasets users can currently access. This session will go over what datasets are currently in the cloud and what data to expect in the near future. This way, as users transition to the cloud for their compute, they will also know what data are available to them there. There will also be presentations from AWS. Speakers:
Katie Baynes - NASA/EOSDIS
Jon O'Neil - NOAA
Jeff de La Beaujardiere - NCAR
Kristi Kliene - USGS/EROS
Joe Flasher - AWS

Presentations: See attached.

View Recording: https://youtu.be/yssgXB7iaxw

Takeaways
  • Petabyte-scale data are being moved into the cloud. This is concentrated in AWS, Google Cloud, and Microsoft, depending on the agency and dataset
  • Some concern around partnerships with companies (AWS most discussed) in terms of long term relationships, moving data etc. and how those things might impact access or data use
  • Need to make clear the authoritative source of the data, who is stewarding it, and any modifications done when copying to cloud. Users should exercise due diligence in selecting and using data.



Speakers
JO

Jon O'Neil

Director, NOAA Big Data Program, NOAA
avatar for Joe Flasher

Joe Flasher

Open Geospatial Data Lead, Amazon Web Services
Joe Flasher is the Open Geospatial Data Lead at Amazon Web Services helping organizations most effectively make data available for analysis in the cloud. The AWS open data program has democratized access to petabytes of data, including satellite imagery, genomic data, and data used... Read More →
avatar for Christopher Lynnes

Christopher Lynnes

Systems Architect, NASA/EOSDIS, NASA/GSFC
Christopher Lynnes is currently System Architect for NASA’s Earth Observing System Data and Information System, known as EOSDIS. He has been working on EOSDIS since 1992, over which time he has worked multiple generations of data archive systems, search engines and interfaces, science... Read More →
avatar for Jessica Hausman

Jessica Hausman

Data Engineer, PO.DAAC JPL
avatar for Jeff de La Beaujardière

Jeff de La Beaujardière

Director, Information Systems Division, NCAR
Big data, cloud computing, object storage, data management.
avatar for Dave Meyer

Dave Meyer

GES DISC manager, NASA


Tuesday January 7, 2020 2:00pm - 3:30pm EST
White Flint
  White Flint, Breakout
 
Wednesday, January 8
 

11:00am EST

Software Sustainability, Discovery and Accreditation
It is commonly understood that software is essential to research, in data collection, curation, analysis, and understanding, and it is also a critical element within any research infrastructure. This session will address two related software issues: 1) sustainability, and 2) discovery and accreditation.

Because scientific software is an instance of a software stack containing problem-specific software, discipline-specific tools, general tools and middleware, and infrastructural software, changes within the stack can cause the overall software to collapse and stop working. As time goes on, increasing work is needed to compensate for these problems; this is what we refer to as sustainability. Issues in which we are interested include incentives that encourage sustainability activities, business models for sustainability (including public-private partnership), software design that can reduce the sustainability burden, and metrics to measure sustainability (perhaps tied to the on-going process of defining FAIR software).

The second issue, discovery and accreditation, asks how we enable users to discover and access trustworthy and fit-for-purpose software to undertake science processing on the compute infrastructures to which they have access. And how do we ensure that publications cite the exact version of software that was used, and that the responsible authors are properly credited?

This session will include a number of short talks, and at least two breakouts in parallel, one about the sustainability of software, and a second about discovery of sustainable and viable solutions.

Potential speakers who want to talk about an aspect of software sustainability, discovery, or accreditation should contact the session organizers.

Agenda/slides:
Presentations: See above

View Recording:
https://youtu.be/nsxjOC04JxQ

Key takeaways:

1. Funding agencies spend a large amount of money on software, but don't always know this because it's not something that they track.

Open-source software is growing very quickly:
  • 2001: 208K SourceForge users
  • 2017: 20M GitHub users
  • 2019: 37M Github users
Software, like data, is a “first class citizen” in the ecosystem of tools and resources for scientific research, and our community is accelerating its attention to this as it has for FAIR data


2. Ideas for changing our culture to better support and reward contributions to sustainable software:
  • Citation (ESIP guidelines) and/or software heritage IDs for credit and usage metrics and to meet publisher requirements (e.g. AGU)
  • Prizes
  • Incentives in hiring and promotion
  • Promote FAIR principles and/or Technical Readiness Levels for software
  • Increased use to make science more efficient through common software
  • Publish best practice materials in other languages, e.g. Mandarin, as software comes from a global community
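
One lightweight way to support the citation idea above is a CITATION.cff file at the repository root, which records the metadata a citer needs in machine-readable form. This is a hypothetical sketch: the title, author, version, and DOI are placeholders, while the field names follow the Citation File Format.

```yaml
# Hypothetical CITATION.cff; all values below are placeholders.
cff-version: 1.1.0
message: "If you use this software, please cite it as below."
title: "Example Geoscience Toolkit"
version: 1.0.0
doi: 10.5281/zenodo.0000000
date-released: 2020-01-08
authors:
  - family-names: Researcher
    given-names: Example
    orcid: https://orcid.org/0000-0000-0000-0000
```

Tools such as cffconvert can turn a file like this into BibTeX or other citation formats for publisher requirements.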


3. A checklist of topics to consider for your community sustained software:
  • Repository with “cookie cutter” templates and sketches for forking
  • Licensing
  • Contributors Guide
  • Code of Conduct and Governance
  • Use of “Self-Documentation” features and standards
  • Easy step for trying out software
  • Continuous Integration builds
  • Unit tests
  • Good set of “known first issues” for new users trying out the software
  • Gitter or Slack Channel for feedback and communication, beyond a simple repo issues queue


Detailed notes:
The group then divided into 2 breakout sessions (Sustainability; Discovery and Accreditation), with notes as follows.

Notes from Sustainability breakout (by Daniel S. Katz):

What we think should be done:
  • Build a cookiecutter recipe for new projects, based on Ben’s slides?  What part of ESIP would be interested in this? And would do it, and support it?
  • Define governance as part of this? How do we store governance?
  • What is required, what is optional (maybe with different answers at different tiers)
  • Define types of projects (individual developer, community code, …)
  • Define for different languages – tooling needs to match needs
  • Is this specific to ESIP? Who could it be done with? The Carpentries?  SSI?

Other discussion:
  • What do we mean by sustainability – for how long?  Up to 50 years?  How do we run the system?
  • What’s the purpose of the software (use case) – transparency to see the software, actual reuse?
  • What about research objects that contain both software and data? How do we archive them? How do we cite them?
  • We have some overlap with research object citation cluster


Notes from Discovery and Accreditation breakout (by Shelley Stall):

Use Cases - Discovery
  1. science question- looking for software to support
  2. have some data output from a software process, need to gain access to the software to better understand the data.   

Example of work happening: Data and Software Preservation - NSF Funded
  • promote linked data to other research products
  • similar project in Australia - want to gain access to the chain of events that resulted in the data and/or software - the scientific drivers that resulted in this product
  • Provenance information is part of this concept.

A deeper look at discovery, once software is found, is to better understand how the software came into being. It is important to capture the undocumented elements of the process that affected the chain of events, since these are useful to understand for a particular piece of software.
How do we discover existing packages?
Dependency management helps to discover new elements that support software.
Concern expressed that packaged solution for creating an environment, like “AWS/AMI”, are not recognized as good enough, that an editor requested a d

Speakers
avatar for Daniel S. Katz

Daniel S. Katz

Assistant Dir. for Scientific Software & Applications, NCSA; Research Assoc. Prof., CS, ECE, iSchool, University of Illinois
avatar for Lesley Wyborn

Lesley Wyborn

Adjunct Fellow, Australian National University


Wednesday January 8, 2020 11:00am - 12:30pm EST
Forest Glen
  Forest Glen, Working Session

11:00am EST

Pangeo in Action
The NSF-funded Pangeo project (http://pangeo.io/) is a community-driven architectural framework for big data geoscience. A typical Pangeo software stack leverages Python open-development libraries, including elements such as Jupyter Notebooks for interactive data analysis, Intake catalogs to provide a higher level of abstraction, Dask for scalable, parallelized data access, and Xarray for working with labeled multi-dimensional arrays of data, and can support data formats including NetCDF as well as the cloud-optimized Zarr format for chunked, compressed, N-dimensional arrays.
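
The core idea behind Dask's scalability, splitting a large array into chunks, reducing each chunk independently, and combining the partial results, can be sketched with nothing but the standard library. Real Pangeo stacks apply this pattern via dask.array over Zarr or NetCDF chunks rather than Python lists; this toy version only illustrates the pattern.

```python
# Toy, stdlib-only sketch of chunked (out-of-core) reduction,
# the pattern dask.array generalizes across many workers.
def chunked_mean(values, chunk_size):
    partials = []  # (chunk_sum, chunk_count) pairs, one per chunk
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        partials.append((sum(chunk), len(chunk)))
    # Combine the per-chunk partial results into the global mean.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

data = list(range(100))           # stand-in for one variable of a datacube
print(chunked_mean(data, 16))     # same answer as one global mean: 49.5
```

Because each `(sum, count)` pair is independent, the per-chunk step can run in parallel and no chunk ever needs to be co-resident with the others, which is what makes the approach work on larger-than-memory data.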

This session includes presentations describing implementations, results, or lessons learned from using these tools, as well as some time for open discussion. We encourage attendance by people interested in knowing more about Pangeo.

Draft schedule:
Dr. Amanda Tan, U. Washington: Pangeo overview and lessons learned
Dr. Rich Signell, USGS: The USGS EarthMap Pangeo: Success Stories and Lessons Learned
Dr. Jeff de La Beaujardière, NCAR: Climate model outputs on AWS using Pangeo framework
Dr. Karl Benedict, UNM: Pangeo as a platform for workshops
Open discussion

How to Prepare for this Session:

Presentations:
https://doi.org/10.6084/m9.figshare.11559174.v1

View Recording: https://youtu.be/VNfpGIIjL3E.

Takeaways
  • Pangeo is a community platform for Big Data geoscience; a cohesive ecosystem of open community, open-source software, and open infrastructure; three core Python packages: Jupyter, Xarray, Dask
  • Deploying Pangeo on the cloud faces challenges:
    • Cloud costs
    • Cloud skills
    • The need for cloud-optimized data
    • The best strategy for Pangeo deployment amid changing cloud service platforms
  • Pangeo can be applied to leverage Jupyter notebooks and other resources for different levels of data users (NCAR: scientists new to cloud computing; University of New Mexico: a workshop platform; etc.)

Speakers
avatar for Karl Benedict

Karl Benedict

Director of Research Data Services & Information Technology, University of New Mexico
Since 1986 I have had parallel careers in Information Technology, Data Management and Analysis, and Archaeology. Since 1993 when I arrived at UNM I have worked as a Graduate Student in Anthropology, Research Scientist, Research Faculty, Applied Research Center Director, and currently... Read More →
avatar for Rich Signell

Rich Signell

Oceanographer, USGS
Ocean Modeling, Python, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS
avatar for Amanda Tan

Amanda Tan

Data Scientist, University of Washington
Cloud computing, distributed systems
avatar for Jeff de La Beaujardière

Jeff de La Beaujardière

Director, Information Systems Division, NCAR
Big data, cloud computing, object storage, data management.


Wednesday January 8, 2020 11:00am - 12:30pm EST
Linden Oak
  Linden Oak, Breakout

11:00am EST

FAIRtool.org, Serverless workflows for cubesats, Geoweaver ML workflow management, 3D printed weather stations
Come hear what ESIP Lab PIs have built over the past year. Speakers include:

Abdullah Alowairdhi: FAIRTool Project Update
Ziheng Sun: Geoweaver Project
Amanda Tan: Serverless Workflow Project
Agbeli Ameko: 3D-Printed Weather Stations

Presentations:
https://doi.org/10.6084/m9.figshare.11626284.v1

View Recording: https://youtu.be/vrRwEQRAIZ4

Takeaways



Speakers
avatar for Amanda Tan

Amanda Tan

Data Scientist, University of Washington
Cloud computing, distributed systems
avatar for Abdullah Alowairdhi

Abdullah Alowairdhi

PhD Candidate, U of Idaho
avatar for Ziheng Sun

Ziheng Sun

Research Assistant Professor, George Mason University
My research interests are mainly on geospatial cyberinfrastructure and agricultural remote sensing.
avatar for Annie Burgess

Annie Burgess

ESIP Lab Director, ESIP


Wednesday January 8, 2020 11:00am - 12:30pm EST
Salon A-C
  Salon A-C, Breakout
  • Skill Level Skim the Surface, Jump In
  • Keywords Cloud Computing, Machine Learning
  • Collaboration Area Tags Science Software
  • Remote Participation Link: https://global.gotomeeting.com/join/195545333
  • Remote Participation Phone #: (571) 317-3129
  • Remote Participation Access Code 195-545-333

2:00pm EST

AI for Augmenting Geospatial Information Discovery
Thanks to rapid developments in hardware and computer science, we have seen many exciting breakthroughs in self-driving cars, voice recognition, street view recognition, cancer detection, check deposit, etc. Sooner or later the fire of AI will spread to the Earth sciences. Scientists need high-level automation to discover timely, accurate geospatial information from large volumes of Earth observations, but few existing algorithms can ideally solve such sophisticated problems with automation. However, the transition from manual to automatic is already underway, bit by bit. Many early-bird researchers have started to transplant AI theory and algorithms from computer science to GIScience, and a number of promising results have been achieved. In this session, we will invite speakers to talk about their experiences of using AI in geospatial information (GI) discovery. We will discuss all aspects of "AI for GI", such as the algorithms, technical frameworks, tools and libraries used, and model evaluation in various individual use case scenarios. How to Prepare for this Session: https://esip.figshare.com/articles/Geoweaver_for_Better_Deep_Learning_A_Review_of_Cyberinfrastructure/9037091
https://esip.figshare.com/articles/Some_Basics_of_Deep_Learning_in_Agriculture/7631615
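
To make the "AI for GI" idea concrete, here is a deliberately tiny, stdlib-only nearest-centroid classifier over invented two-band pixel values. The band values and class names are made up for illustration; real work of the kind discussed in this session uses labeled satellite imagery and libraries such as scikit-learn or TensorFlow.

```python
# Toy supervised classification of "pixels" by nearest class centroid.
# Features are (red, nir) reflectance pairs; all numbers are invented.
def centroid(samples):
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n
                 for i in range(len(samples[0])))

def nearest_centroid_fit(training):
    # training maps class label -> list of feature tuples
    return {label: centroid(samples) for label, samples in training.items()}

def predict(model, pixel):
    # Assign the label whose centroid is closest (squared Euclidean distance).
    return min(model, key=lambda lbl: sum((p - c) ** 2
                                          for p, c in zip(pixel, model[lbl])))

training = {
    "water":      [(0.05, 0.02), (0.07, 0.03), (0.06, 0.02)],
    "vegetation": [(0.08, 0.50), (0.10, 0.55), (0.09, 0.48)],
}
model = nearest_centroid_fit(training)
print(predict(model, (0.09, 0.52)))   # -> vegetation
```

The same fit/predict shape scales up to the deep-learning workflows discussed here; only the model and the volume of labeled training data change.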

Presentations:
https://doi.org/10.6084/m9.figshare.11626299.v1

View Recording: https://youtu.be/W0q8WiMw9Hs

Takeaways
  • There has been significant uptake of machine learning/artificial intelligence for Earth science applications in the recent decade;
  • The challenges of machine learning applications for the Earth science domain include:
    • the quality and availability of training data sets;
    • the need for a team with diverse skill backgrounds to implement an application;
    • the need for a better understanding of the underlying mechanisms of ML/AI models
  • There are many promising applications/developments streamlining the process and application of machine learning for different sectors of society (weather monitoring, emergency response, social good)



Speakers
avatar for Yuhan (Douglas) Rao

Yuhan (Douglas) Rao

Postdoctoral Research Scholar, CISESS/NCICS/NCSU
avatar for Aimee Barciauskas

Aimee Barciauskas

Data engineer, Development Seed
avatar for Annie Burgess

Annie Burgess

ESIP Lab Director, ESIP
avatar for Rahul Ramachandran

Rahul Ramachandran

Project Manager, Sr. Research Scientist, NASA
avatar for Ziheng Sun

Ziheng Sun

Research Assistant Professor, George Mason University
My research interests are mainly on geospatial cyberinfrastructure and agricultural remote sensing.


Wednesday January 8, 2020 2:00pm - 3:30pm EST
Salon A-C
  Salon A-C, Breakout
 
Thursday, January 9
 

12:00pm EST

License Up! What license works for you and your downstream repositories?
Many repositories are seeing an increase in the use and diversity of licenses and other intellectual property management (IPM) tools applied to externally-created data submissions and software developed by staff. However, adding a license to data files may have unexpected or unintended consequences in the downstream use or redistribution of those data. Who “owns” the intellectual property rights to data collected by university researchers using Federal and State (i.e., public) funding that must be deposited at a Federal repository? What license is appropriate for those data and what — exactly — does that license allow and disallow? What kind of license or other IPM instrument is appropriate for software written by a team of Federal and Cooperative Institute software engineers? Is there a significant difference between Creative Commons, GNU, and other ‘open source licenses’?

We have invited a panel of legal advisors from Federal and other organizations to discuss the implications of these questions for data stewards and the software teams that work collaboratively with those stewards. We may also discuss the latest information about Federal data licenses as it applies to the OPEN Government Data Act of 2019. How to Prepare for this Session: Consider what, if any, licenses, copyright, or other intellectual property rights management you apply or think applies to your work. Also consider Federal requirements such as the OPEN Government Data Act of 2019, Section 508 of the Rehabilitation Act of 1973.

Speakers:
Dr. Robert J. Hanisch is the Director of the Office of Data and Informatics, Material Measurement Laboratory, at the National Institute of Standards and Technology in Gaithersburg, Maryland. He is responsible for improving data management and analysis practices and helping to assure compliance with national directives on open data access. Prior to coming to NIST in 2014, Dr. Hanisch was a Senior Scientist at the Space Telescope Science Institute, Baltimore, Maryland, and was the Director of the US Virtual Astronomical Observatory. For more than twenty-five years Dr. Hanisch led efforts in the astronomy community to improve the accessibility and interoperability of data archives and catalogs.
Henry Wixon is Chief Counsel for the National Institute of Standards and Technology (NIST) of the U.S. Department of Commerce. His office provides programmatic legal guidance to NIST, as well as intellectual property counsel and representation to the Department of Commerce and other Department bureaus. In this role, it interacts with principal developers and users of research, including private and public laboratories, universities, corporations and governments. Responsibilities of Mr. Wixon’s office include review of NIST Cooperative Research and Development Agreements (CRADAs), licenses, Non-Disclosure Agreements (NDAs) and Material Transfer Agreements (MTAs), and the preparation and prosecution of the agency’s patent applications. As Chief Counsel, Mr. Wixon is active in standing Interagency Working Groups on Technology Transfer, on Bayh-Dole, and on Research Misconduct, as well as in the Federal Laboratory Consortium. He is a Certified Licensing Professional and a Past Chair of the Maryland Chapter of the Licensing Executives Society, USA and Canada (LES), and is a member of the Board of Visitors of the College of Computer, Mathematical and Natural Sciences of the University of Maryland, College Park.

Presentations
See attached

View Recording: https://youtu.be/5Ng5FDW1LXk.

Takeaways



Speakers
DC

Donald Collins

Oceanographer, NESDIS/NCEI Archive Branch
Send2NCEI, NCEI archival processes, records management


Thursday January 9, 2020 12:00pm - 1:30pm EST
Forest Glen
  Forest Glen, Panel

12:00pm EST

Hands-on labeling workshop
Intended as a follow-on to the "Do You Have a Labeling Problem?" session and to get your feet wet, this working session is for people to experiment with two of the tools presented in that session, LabelImg and Bokeh. Presenters will provide some sample data for participants to work with. Attendees can also bring some of their own data to work with in the time remaining after the planned activities.

It would be best for workshop participants to preinstall LabelImg before coming to the session. Regarding Bokeh, Anaconda is providing 25 accounts for workshop participants. (Thank you, Jim and Anaconda!) Installing Bokeh locally is also an option. Links for getting these tools are:
  • LabelImg via https://github.com/tzutalin/labelImg#installation
  • Bokeh as part of the HoloViz suite via http://holoviz.org/installation.html
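
LabelImg saves its annotations as Pascal VOC XML files. This stdlib-only sketch shows how such a file can be parsed after the labeling session; the inline XML here is a small made-up example rather than real tool output.

```python
# Parse a Pascal VOC annotation (the format LabelImg writes) and
# extract (label, bounding box) pairs using only the standard library.
import xml.etree.ElementTree as ET

VOC_XML = """
<annotation>
  <filename>scene.png</filename>
  <object>
    <name>cloud</name>
    <bndbox><xmin>12</xmin><ymin>30</ymin><xmax>84</xmax><ymax>96</ymax></bndbox>
  </object>
</annotation>
"""

def voc_boxes(xml_text):
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        b = obj.find("bndbox")
        boxes.append((obj.findtext("name"),
                      tuple(int(b.findtext(tag))
                            for tag in ("xmin", "ymin", "xmax", "ymax"))))
    return boxes

print(voc_boxes(VOC_XML))  # -> [('cloud', (12, 30, 84, 96))]
```

For files on disk, `ET.parse(path).getroot()` replaces `ET.fromstring`; the rest is unchanged.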

Presentations

View Recording: https://youtu.be/y8NqTLgT8Ao

Takeaways


Speakers
avatar for Ziheng Sun

Ziheng Sun

Research Assistant Professor, George Mason University
My research interests are mainly on geospatial cyberinfrastructure and agricultural remote sensing.
avatar for Anne Wilson

Anne Wilson

Senior Software Engineer, Laboratory for Atmospheric and Space Physics
avatar for Yuhan (Douglas) Rao

Yuhan (Douglas) Rao

Postdoctoral Research Scholar, CISESS/NCICS/NCSU


Thursday January 9, 2020 12:00pm - 1:30pm EST
Glen Echo
  Glen Echo, Workshop

12:00pm EST

Datacubes for Analysis-Ready Data: Standards & State of the Art
This workshop session will follow up on the OGC Coverage Analytics sprint, focusing specifically on advanced services for spatio-temporal datacubes. In the Earth sciences, datacubes are accepted as an enabling paradigm for offering massive spatio-temporal Earth data in analysis-ready form and, more generally, for easing access, extraction, analysis, and fusion. Datacubes also homogenize APIs across dimensions, allowing unified wrangling of 1-D sensor data, 2-D imagery, 3-D x/y/t image timeseries and x/y/z geophysics voxel data, and 4-D x/y/z/t climate and weather data.
Based on the OGC datacube reference implementation, we introduce datacube concepts, the state of standardization, and real-life 2D, 3D, and 4D examples utilizing services from three continents. Ample time will be available for discussion, and Internet-connected participants will be able to replay and modify many of the examples shown. Further, key datacube activities worldwide, within and beyond the Earth sciences, will be discussed.
Session outcomes could take a number of forms: ideas and issues for OGC, ISO, or ESIP to consider; example use cases; challenges not yet addressed sufficiently, and entirely novel use cases; work and collaboration plans for future ESIP work. Outcomes of the session will be reported at the next OGC TC meeting's Big Data and Coverage sessions. How to Prepare for this Session: Introductory and advanced material is available from http://myogc.org/go/coveragesDWG

Presentations
https://doi.org/10.6084/m9.figshare.11562552.v1

View Recording: https://youtu.be/82WG7soc5bk

Takeaways
  • The abstract coverage construct defines the base, which can be filled in by a coverage implementation schema. This is important because previously implementations were not interoperable across different servers and clients.
  • The coordinate system retrieved from sensors reporting in real time has been embedded into the XML schema, so that sensor data can be integrated into the broader system. The data can be delivered not only in GML but also in JSON and RDF, which could be used to link into semantic web technology.
  • The principle is to send an HTTP URL-encoded query to the server and get back results extracted from the datacube, e.g., sourced from many hyperspectral images.
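
The URL-encoded query pattern from the last takeaway can be sketched with the standard library. The endpoint and coverage name below are hypothetical placeholders; the parameters follow the general WCS 2.0 GetCoverage style, with one `subset` parameter trimming each datacube axis.

```python
# Build a WCS 2.0 GetCoverage-style URL that subsets a datacube along
# latitude, longitude, and time. Endpoint and coverage id are made up.
from urllib.parse import urlencode

def getcoverage_url(endpoint, coverage, subsets):
    params = [("service", "WCS"), ("version", "2.0.1"),
              ("request", "GetCoverage"), ("coverageId", coverage)]
    # Each subset trims one axis of the cube, e.g. Lat, Long, or time.
    params += [("subset", f"{axis}({lo},{hi})") for axis, lo, hi in subsets]
    return endpoint + "?" + urlencode(params)

url = getcoverage_url(
    "https://example.org/rasdaman/ows",        # placeholder server
    "AvgTemperature",                          # placeholder coverage id
    [("Lat", 30, 40),
     ("Long", -100, -90),
     ("ansi", '"2020-01-01"', '"2020-01-09"')])
print(url)
```

The server evaluates the subsets against the cube and returns only the extracted slab, which is what keeps clients from downloading whole archives.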

Speakers

Thursday January 9, 2020 12:00pm - 1:30pm EST
White Flint