Join the 2020 ESIP Winter Meeting Highlights Webinar on Feb. 5th at 3 pm ET for a fast-paced overview of what took place at the meeting. More info here.


Cloud Computing
Monday, January 6
 

4:00pm EST

Council of Data Facilities General Assembly Meeting
The Council of Data Facilities (CDF) is committed to working with relevant agencies, professional associations, initiatives, and other complementary efforts to enable transformational science, innovative education, and informed public policy through increased coordination, collaboration, and innovation in the acquisition, curation, preservation, and dissemination of geoscience data, tools, models, and services. Existing and emerging geoscience data facilities – through the Council – are committed to serving as an effective foundation for EarthCube. The General Assembly meeting is open to the official representatives from all member data facilities, additional member organization personnel as desired by the members, as well as observers.

Agenda:
400-415 Welcome/introductions/sign-in - Danie
415-430 High level Summary of OKN workshop - TBA
430-435 Updates on shared infrastructure - Kerstin, Danie
435-445 Update on COPDESS - Kerstin, Shelley
445-515 Update and next steps on P419 - Doug, Adam
515-530 Progress on EC supplements for CCHDO and MagIC related to P418/P419 (GeoCODES) - Steve
530-550 Update from tech team EarthCube Office - Kenton McHenry
550-600 Summer topics - Danie
      • Suggested Charter changes (to be voted on in July 2020)
      • Announce CDF exec elections in July 2020 - 2 co-chair and 3 at-large positions


Speakers

Jessica Hausman

Data Engineer, PO.DAAC JPL


Monday January 6, 2020 4:00pm - 6:00pm EST
Glen Echo
  Glen Echo, Business Meeting
 
Tuesday, January 7
 

11:00am EST

Interoperability of geospatial data with STAC
The SpatioTemporal Asset Catalog (STAC) is an emerging specification of a common metadata model for geospatial data, and a way to make data catalogs indexable and searchable. STAC is already being adopted for both public and commercial data. Catalogs exist for several AWS Public Datasets, Landsat Collection 2 data will be published along with STAC metadata, and communities like Pangeo are using STAC to organize data repositories in a scalable way. Commercial companies like Planet and DigitalGlobe are starting to publish STAC metadata for some of their catalogs. Session talks may cover overviews of STAC, software projects utilizing STAC, and use cases of STAC in organizations. How to Prepare for this Session: See https://stacspec.org/.
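
As a rough illustration of the metadata model described above, the following Python sketch builds minimal STAC-style Item records as plain dictionaries and filters them by bounding box and time range. The top-level fields (type, stac_version, id, bbox, geometry, properties with a datetime, links, assets) follow the STAC Item specification. The helper names make_item and search, the item IDs, the example.org URLs, and the stac_version value are assumptions made for this example and do not come from the session material.

```python
from datetime import datetime, timezone


def make_item(item_id, bbox, when, href):
    """Build a minimal STAC-style Item (a GeoJSON Feature) as a dict."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",  # assumed version string for illustration
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {"datetime": when.isoformat()},
        "links": [],
        "assets": {"data": {"href": href}},
    }


def bboxes_intersect(a, b):
    """True if two [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(items, bbox, start, end):
    """Return the IDs of items inside the bbox and time window."""
    hits = []
    for item in items:
        when = datetime.fromisoformat(item["properties"]["datetime"])
        if bboxes_intersect(item["bbox"], bbox) and start <= when <= end:
            hits.append(item["id"])
    return hits


# Hypothetical catalog with two scenes.
items = [
    make_item("scene-a", [-77.2, 38.8, -76.9, 39.1],
              datetime(2020, 1, 7, tzinfo=timezone.utc),
              "https://example.org/scene-a.tif"),
    make_item("scene-b", [10.0, 50.0, 11.0, 51.0],
              datetime(2020, 1, 7, tzinfo=timezone.utc),
              "https://example.org/scene-b.tif"),
]
january = (datetime(2020, 1, 1, tzinfo=timezone.utc),
           datetime(2020, 1, 31, tzinfo=timezone.utc))
print(search(items, [-78.0, 38.0, -76.0, 40.0], *january))
```

Because every record shares the same small set of fields, a generic index or search service can handle catalogs from different providers without provider-specific code.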

View Recording: https://youtu.be/BdZbJLQSNFE

Takeaways


Speakers

Dan Pilone

Chief Technologist, Element 84
Dan Pilone is CEO/CTO of Element 84 and oversees the architecture, design, and development of Element 84's projects including supporting NASA, the USGS, Stanford University School of Medicine, and commercial clients. He has supported NASA's Earth Observing System for nearly 13 years...

Aimee Barciauskas

Data engineer, Development Seed

Matthew Hanson

Element 84
STAC


Tuesday January 7, 2020 11:00am - 12:30pm EST
White Flint
  White Flint, Breakout

2:00pm EST

Current Data that are available on the Cloud
NASA, NOAA and USGS are in the process of moving data onto the cloud. While they have discussed what types of services are available and future plans of what data can be found, it is not completely clear what datasets users can currently access. This session will go over what datasets are currently up in the cloud and what data to expect in the near future. This way as users are transitioning to the cloud for their compute, they can also know what data are available to them on the cloud as well. There will also be presentations from AWS.

Speakers:
Katie Baynes - NASA/EOSDIS
Jon O'Neil - NOAA
Jeff de La Beaujardière - NCAR
Kristi Kliene - USGS/EROS
Joe Flasher - AWS

Presentations: See attached.

View Recording: https://youtu.be/yssgXB7iaxw

Takeaways
  • Petabyte-scale data are being moved into the cloud, concentrated in AWS, Google Cloud and Microsoft depending on the agency and dataset.
  • There is some concern about partnerships with companies (AWS was the most discussed) in terms of long-term relationships, moving data, etc., and how these might affect data access or use.
  • There is a need to make clear the authoritative source of the data, who is stewarding it, and any modifications made when copying to the cloud. Users should exercise due diligence in selecting and using data.



Speakers

Jon O'Neil

Director, NOAA Big Data Program, NOAA

Joe Flasher

Open Geospatial Data Lead, Amazon Web Services
Joe Flasher is the Open Geospatial Data Lead at Amazon Web Services helping organizations most effectively make data available for analysis in the cloud. The AWS open data program has democratized access to petabytes of data, including satellite imagery, genomic data, and data used...

Christopher Lynnes

Systems Architect, NASA/EOSDIS, NASA/GSFC
Christopher Lynnes is currently System Architect for NASA’s Earth Observing System Data and Information System, known as EOSDIS. He has been working on EOSDIS since 1992, over which time he has worked multiple generations of data archive systems, search engines and interfaces, science...

Jessica Hausman

Data Engineer, PO.DAAC JPL

Jeff de La Beaujardière

Director, Information Systems Division, NCAR
Big data, cloud computing, object storage, data management.

Dave Meyer

GES DISC manager, NASA


Tuesday January 7, 2020 2:00pm - 3:30pm EST
White Flint
  White Flint, Breakout

4:00pm EST

Experiences Migrating Mission Scale Data in the Cloud
We will describe our project to upload a 2.4 PB dataset, encapsulated into ~80K fused files from the five instruments on the Terra satellite, into NASA AWS S3.
We will share the bottlenecks and lessons learned during this process, and we hope to exchange experiences with similar projects in order to identify best practices and collect guidelines for future projects that are adopting cloud solutions for their data needs.

We'll discuss data volumes, data integrity strategies for migration, S3 bucket organization, metadata curation, transfer rates, transfer pipelines, etc. We will also discuss and share data access patterns, costs, and architectures and how we can construct guidelines for access to these datasets efficiently.
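
To give a feel for the data volumes and transfer rates mentioned above, here is a small back-of-the-envelope sketch in Python. It uses only the figures stated in the abstract (2.4 PB in roughly 80,000 files). The decimal unit convention (1 PB = 10^15 bytes) and the sustained 10 Gbit/s rate are assumptions chosen for illustration; they are not the project's measured numbers.

```python
TB = 10**12  # decimal terabyte, assumed unit convention
GB = 10**9

TOTAL_BYTES = 2400 * TB   # 2.4 PB, from the session abstract
N_FILES = 80_000          # ~80K fused files, from the session abstract


def average_file_size_gb(total_bytes, n_files):
    """Mean file size in decimal gigabytes."""
    return total_bytes / n_files / GB


def transfer_days(total_bytes, gbit_per_s):
    """Days needed to move total_bytes at a sustained network rate."""
    bytes_per_s = gbit_per_s * 1e9 / 8
    return total_bytes / bytes_per_s / 86_400


print(f"average file size: {average_file_size_gb(TOTAL_BYTES, N_FILES):.0f} GB")
# The 10 Gbit/s sustained rate is a hypothetical value for illustration.
print(f"days at 10 Gbit/s: {transfer_days(TOTAL_BYTES, 10):.1f}")
```

At about 30 GB per file, individual uploads are large enough that multipart transfers and per-file integrity checks become part of the pipeline design.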

We encourage discussion among projects that have faced similar processes or are looking to migrate their datasets into the cloud.

https://drive.google.com/file/d/1fts06XDM2dbZxxljBTpplCEMSiTqfp6t/view?usp=sharing

Presentations:
https://doi.org/10.6084/m9.figshare.11553147.v1

View Recording: https://youtu.be/1xVJghJI4Gg

Takeaways
  • The project required a combination of NSF, NASA and AWS resources. There was some interesting discussion of AWS or other cloud services as a stand-in for, or follow-on to, limited-term NSF assets.
  • There was some interesting discussion of tailoring to appropriate end users: the wide range of potential users implies a wide range of requirements for the dataset, including access guidelines and user capabilities.
  • The project aimed to make a paradigm shift from understanding/observing physical processes to a full climate observing objective.



Speakers

Ben Galewsky

Research Programmer, National Center for Supercomputing Applications


Tuesday January 7, 2020 4:00pm - 5:30pm EST
White Flint
  White Flint, Breakout
 
Wednesday, January 8
 

11:00am EST

Pangeo in Action
The NSF-funded Pangeo project (http://pangeo.io/) is a community-driven architectural framework for big data geoscience. A typical Pangeo software stack leverages open-development Python libraries, including Jupyter Notebooks for interactive data analysis, Intake catalogs to provide a higher level of abstraction, Dask for scalable, parallelized data access, and Xarray for working with labeled multi-dimensional arrays of data. It can support data formats including NetCDF as well as the cloud-optimized Zarr format for chunked, compressed, N-dimensional arrays.
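
To illustrate why a chunked layout suits cloud object storage, the sketch below maps array indices to chunk keys in plain Python. It is modeled on the default Zarr version 2 convention of naming each chunk by its dot-separated chunk indices. The function names, array shape, and chunk shape are assumptions made for this example, and the code does not use the Zarr library itself.

```python
from itertools import product


def chunk_key(index, chunk_shape):
    """Key of the chunk holding one array element, e.g. '10.2.0'."""
    return ".".join(str(i // c) for i, c in zip(index, chunk_shape))


def chunks_for_slice(start, stop, chunk_shape):
    """Keys of all chunks overlapping the half-open box [start, stop)."""
    ranges = [
        range(s // c, (e - 1) // c + 1)
        for s, e, c in zip(start, stop, chunk_shape)
    ]
    return [".".join(map(str, idx)) for idx in product(*ranges)]


# Hypothetical (time, lat, lon) array stored in chunks of 1 x 100 x 100.
CHUNKS = (1, 100, 100)
print(chunk_key((10, 250, 7), CHUNKS))
print(chunks_for_slice((0, 0, 0), (2, 150, 50), CHUNKS))
```

A reader that needs only a small region fetches just the objects whose keys overlap the request, which is what allows tools such as Dask to read many chunks in parallel.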

This session includes presentations describing implementations, results, or lessons learned from using these tools, as well as some time for open discussion. We encourage attendance by people interested in knowing more about Pangeo.

Draft schedule:
Dr. Amanda Tan, U. Washington: Pangeo overview and lessons learned
Dr. Rich Signell, USGS: The USGS EarthMap Pangeo: Success Stories and Lessons Learned
Dr. Jeff de La Beaujardière, NCAR: Climate model outputs on AWS using Pangeo framework
Dr. Karl Benedict, UNM: Pangeo as a platform for workshops
Open discussion


Presentations:
https://doi.org/10.6084/m9.figshare.11559174.v1

View Recording: https://youtu.be/VNfpGIIjL3E

Takeaways
  • Pangeo is a community platform for Big Data geoscience; a cohesive ecosystem of open community, open source software, open ecosystem; three core Python packages: Jupyter, Xarray, Dask
  • Deploying Pangeo on the cloud faces challenges:
    • Cloud costs
    • Cloud skills
    • Need for cloud-optimized data
    • Choosing the best Pangeo deployment strategy as cloud service platforms change
  • Pangeo can be applied to make Jupyter notebooks and other resources available to data users at different levels (NCAR: scientists new to cloud computing platforms; University of New Mexico: workshop platform, etc.)

Speakers

Karl Benedict

Director of Research Data Services & Information Technology, University of New Mexico
Since 1986 I have had parallel careers in Information Technology, Data Management and Analysis, and Archaeology. Since 1993 when I arrived at UNM I have worked as a Graduate Student in Anthropology, Research Scientist, Research Faculty, Applied Research Center Director, and currently...

Rich Signell

Oceanographer, USGS
Ocean Modeling, Python, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS

Amanda Tan

Data Scientist, University of Washington
Cloud computing, distributed systems

Jeff de La Beaujardière

Director, Information Systems Division, NCAR
Big data, cloud computing, object storage, data management.


Wednesday January 8, 2020 11:00am - 12:30pm EST
Linden Oak
  Linden Oak, Breakout

11:00am EST

FAIRtool.org, Serverless workflows for cubesats, Geoweaver ML workflow management, 3D printed weather stations
Come hear what ESIP Lab PIs have built over the past year. Speakers include:

Abdullah Alowairdhi: FAIRTool Project Update
Ziheng Sun: Geoweaver Project
Amanda Tan: Serverless Workflow Project
Agbeli Ameko: 3D-Printed Weather Stations

Presentations:
https://doi.org/10.6084/m9.figshare.11626284.v1

View Recording: https://youtu.be/vrRwEQRAIZ4

Takeaways



Speakers

Amanda Tan

Data Scientist, University of Washington
Cloud computing, distributed systems

Abdullah Alowairdhi

PhD Candidate, U of Idaho

Ziheng Sun

Research Assistant Professor, George Mason University
My research interests are mainly on geospatial cyberinfrastructure and agricultural remote sensing.

Annie Burgess

ESIP Lab Director, ESIP


Wednesday January 8, 2020 11:00am - 12:30pm EST
Salon A-C
  Salon A-C, Breakout
  • Skill Level: Skim the Surface, Jump In
  • Keywords: Cloud Computing, Machine Learning
  • Collaboration Area Tags: Science Software
  • Remote Participation Link: https://global.gotomeeting.com/join/195545333
  • Remote Participation Phone #: (571) 317-3129
  • Remote Participation Access Code: 195-545-333
  • Additional Phone #'s: Australia: +61 2 8355 1050 Austria: +43 7 2081 5427 Belgium: +32 28 93 7018 Canada: +1 (647) 497-9391 Denmark: +45 32 72 03 82 Finland: +358 923 17 0568 France: +33 170 950 594 Germany: +49 692 5736 7317 Ireland: +353 15 360 728 Italy: +39 0 230 57 81 42 Netherlands: +31 207 941 377 New Zealand: +64 9 280 6302 Norway: +47 21 93 37 51 Spain: +34 912 71 8491 Sweden: +46 853 527 836 Switzerland: +41 225 4599 78 United Kingdom: +44 330 221 0088

11:00am EST

Earth Observation Process and Application Discovery, Machine Learning, and Federated Cloud Analytics: Putting data to work using OGC Standards
This session provides an overview of the results from the recent OGC Research & Development initiative Testbed-15. The nine-month, 5M USD initiative addressed six topics: Earth Observation Process and Application Discovery, Machine Learning, Federated Cloud Analytics, Open Portrayal Framework, Delta Updates, and Data Centric Security. This session focuses on the results produced by the first three.

Earth Observation Process and Application Discovery developed draft specifications and models for discovery of cloud-provided processes and applications. This was achieved by extending existing standards with process- and application-specific extensions. Now, data processing software can be made available as a service, discovered using catalog interfaces, and executed on demand by customers. This allows process execution to be located physically close to the data, which reduces data transport overhead.

The Machine Learning research developed models in the areas of earth observation data processing, image classification, feature extraction and segmentation, vector attribution, discovery and cataloguing, forest inventory management & optimization, and semantic web-link building and triple generation. Both model discovery and access took place through standardized interfaces.

The Federated Cloud Analytics research analysed how to handle data and processing capacities that are provided by individual cloud environments transparently to the user. The research included how federated membership, resource, and access policy management can be provided within a security environment, while also providing portability and interoperability to all stakeholders. Additionally, the initiative conducted a study of the application of Distributed Ledger Technologies (DLTs), and more specifically Blockchains, for managing provenance information in Federated Cloud.

The other three topics will also be briefly introduced. The Open Portrayal Framework provides a fully interoperable portrayal and styling suite of standards. Here, the initiative developed new OGC APIs for styles, maps, images, and tiles. Delta Updates explored incremental updates and thus reduced communication payloads between clients and servers, whereas the Data Centric Security thread examined the use of encrypted container formats on standard metadata bindings. How to Prepare for this Session: All results will be made available as public Engineering Reports that provide full details. These are becoming available in stages at http://docs.opengeospatial.org/per/

Presentations:
https://doi.org/10.6084/m9.figshare.11551563.v1

View Recording: https://youtu.be/ojMrcIE-SgE

Takeaways
  • OGC Innovation Program: tests the fitness for purpose of geospatial community initiatives. Testbed-15 concluded last November; results will be available soon from the document repository. End-to-end cloud pipeline for data processing and analytics. The call for Testbed-16 is due Feb 9th 2020, with 1.6M in funding available. Three major threads: Earth observation clouds, data integration and analytics, and modeling and packaging.
  • A way to create synergy among the needs of user communities and of competing and collaborating projects, contributing to a more interoperable world. Provides applications, processes, and catalogues for data processing.
  • Testbeds center on an exploitation/processing platform (for data with relevant applications), like an application market with cloud services. There is some trouble finding application developers, and finding web services with relevant data can be problematic.



Speakers

Ingo Simonis

Director Innovation Programs & Science, OGC
Dr. Ingo Simonis is director of interoperability programs and science at the Open Geospatial Consortium (OGC), an international consortium of more than 525 companies, government agencies, research organizations, and universities participating in a consensus process to develop publicly...


Wednesday January 8, 2020 11:00am - 12:30pm EST
White Flint
  White Flint, Breakout

2:00pm EST

AI for Augmenting Geospatial Information Discovery
Thanks to rapid developments in hardware and computer science, we have seen many exciting breakthroughs in self-driving vehicles, voice recognition, street view recognition, cancer detection, check deposit, and more. Sooner or later the fire of AI will burn in the Earth science field. Scientists need high-level automation to discover timely, accurate geospatial information from large volumes of Earth observations, but few existing algorithms can fully solve the sophisticated problems that such automation involves. Nevertheless, the transition from manual to automatic is under way, bit by bit. Many early-bird researchers have started to transplant AI theory and algorithms from computer science to GIScience, and a number of promising results have been achieved. In this session, we will invite speakers to talk about their experiences of using AI in geospatial information (GI) discovery. We will discuss all aspects of "AI for GI", such as algorithms, technical frameworks, tools and libraries used, and model evaluation in various individual use case scenarios. How to Prepare for this Session: https://esip.figshare.com/articles/Geoweaver_for_Better_Deep_Learning_A_Review_of_Cyberinfrastructure/9037091
https://esip.figshare.com/articles/Some_Basics_of_Deep_Learning_in_Agriculture/7631615

Presentations:
https://doi.org/10.6084/m9.figshare.11626299.v1

View Recording: https://youtu.be/W0q8WiMw9Hs

Takeaways
  • There has been significant uptake of machine learning/artificial intelligence for Earth science applications over the past decade.
  • Challenges for machine learning applications in the Earth science domain include:
    • The quality and availability of training data sets
    • The need for a team with diverse skills to implement the application
    • The need for a better understanding of the underlying mechanisms of ML/AI models
  • There are many promising developments in streamlining the process and application of machine learning for different sectors of society (weather monitoring, emergency response, social good).



Speakers

Yuhan (Douglas) Rao

Postdoctoral Research Scholar, CISESS/NCICS/NCSU

Aimee Barciauskas

Data engineer, Development Seed

Annie Burgess

ESIP Lab Director, ESIP

Rahul Ramachandran

Project Manager, Sr. Research Scientist, NASA

Ziheng Sun

Research Assistant Professor, George Mason University
My research interests are mainly on geospatial cyberinfrastructure and agricultural remote sensing.


Wednesday January 8, 2020 2:00pm - 3:30pm EST
Salon A-C
  Salon A-C, Breakout

4:00pm EST

Emerging EnviroSensing Topics: Long-range, Low-power, Non-contact, Open-source Sensor Networks
Led by the ESIP EnviroSensing Cluster, this session is open to scientists, information managers, and technologists interested in the general topic of environmental sensing for science and management.

Rapid advances and decreasing costs in technology, as applied to environmental sensing systems, are promoting a shift from sparsely-distributed, single-mission observations toward employing affordable, high-fidelity, ecosystem monitoring networks driven by a need to forecast outcomes across timescales. In this session we will hear talks on new approaches to standing up long-range, low-power monitoring networks; the value(s) added by non-contact sensing (local-remote to satellite-based sensing); as well as innovative sensor developments, including open-source approaches, that promote connectivity. The session will conclude with a 20-minute topical discussion open to all in attendance.

List of speakers and presentation titles for this session:
  • Jacqueline Le Moigne: NASA
    Future Earth Science Measurements Using New Observing Strategies
  • David Coyle: USGS
    USGS NGWOS LPWAN Experiment: Leveraging LoRaWAN Sensor Platform Technologies
  • James Gallagher: OPeNDAP
    Sensors in Snowy Alpine Environments: Sensor Networks with LoRa, Progress Report
    View Slides: https://doi.org/10.6084/m9.figshare.11555784.v1 
  • Daniel Fuka: Va Tech
    Making Drones Interesting Again
    View Slides: https://doi.org/10.6084/m9.figshare.11663718.v1
  • Joseph Bell: USGS
    Deep-dive discussion after presentations. A topic of interest is documenting test efforts and the publication of peer-reviewed Test Reports

View Recording: https://youtu.be/dXTLqt-5Ai8

Takeaways
  • As monitoring expands across agencies and from point measures on the surface of the earth to monitoring using networks of satellites in space (internet of space) there is a growing need to increase communication among agencies and instrumentation alike
  • Inexpensive monitoring equipment is becoming readily available with large gains being made in the areas of function, reliability, and resolution/accuracy.
    • Market disruption
    • Edge computing (is this the current form of SDI-12-style monitoring?): local processing and storage, transmission of small/tiny data payloads
  • There appears to be a need across disciplines and agencies for peer-reviewed test reports
    • Not resource intensive to publish
    • Available to all users (FAIR)
    • Provides details on test plan and provides test data whenever applicable.


Speakers

Joseph Bell

Hydrologist, USGS


Wednesday January 8, 2020 4:00pm - 5:30pm EST
Forest Glen
  Forest Glen, Breakout
 
Thursday, January 9
 

10:15am EST

Connecting Data with Data Usage: a Graph Approach
We will investigate graph-based methods of connecting data with the uses made and the knowledge gained from those data, from science research to applications to strategic planning. We will examine the diverse capabilities enabled by connecting uses with data for a variety of stakeholders, and explore how to connect existing knowledge graphs together to scale out across the ESIP federation and related communities toward an inter-connected mega-graph.
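
As a minimal sketch of the idea of linking data to their uses, the Python example below stores many-to-many links between datasets and uses and answers questions in both directions. The class name UsageGraph, its methods, and the dataset and use identifiers are all hypothetical and chosen for illustration; they do not describe any of the knowledge graphs presented in the session.

```python
from collections import defaultdict


class UsageGraph:
    """Many-to-many links between datasets and the uses made of them."""

    def __init__(self):
        self._uses_of = defaultdict(set)      # dataset -> uses
        self._datasets_in = defaultdict(set)  # use -> datasets

    def link(self, dataset, use):
        """Record that a use (paper, application, plan) drew on a dataset."""
        self._uses_of[dataset].add(use)
        self._datasets_in[use].add(dataset)

    def uses_of(self, dataset):
        return sorted(self._uses_of.get(dataset, ()))

    def datasets_in(self, use):
        return sorted(self._datasets_in.get(use, ()))

    def related_datasets(self, dataset):
        """Datasets that share at least one use with the given dataset."""
        related = set()
        for use in self._uses_of.get(dataset, ()):
            related |= self._datasets_in[use]
        related.discard(dataset)
        return sorted(related)


# Hypothetical identifiers, for illustration only.
graph = UsageGraph()
graph.link("dataset:sea-surface-temp", "paper:coastal-warming")
graph.link("dataset:sea-surface-temp", "app:fisheries-dashboard")
graph.link("dataset:chlorophyll", "paper:coastal-warming")

print(graph.uses_of("dataset:sea-surface-temp"))
print(graph.related_datasets("dataset:sea-surface-temp"))
```

Storing each link in both directions keeps the "what used this dataset" and "what data did this use draw on" queries equally cheap, which is the property that graph-oriented stores generalize.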

0-5 min: Chris Lynnes (NASA): Documenting how data matters...
5-15 min: Doug Newman (NASA): EOSDIS Knowledge Graph
https://doi.org/10.6084/m9.figshare.11561805.v1
15-25 min: Reid Sherman (GCIS): Global Change Information System
https://doi.org/10.6084/m9.figshare.11560011.v1
25-35 min: Dave Blodgett (USGS): SELFIE
https://doi.org/10.6084/m9.figshare.11559093.v1
35-45 min: Joe Conran (NOAA): Interagency Coordination of Satellite Needs
https://doi.org/10.6084/m9.figshare.11561946.v1
45-55 min: Wil Doane (IDA): Assessing the Impact of Land Imaging
https://doi.org/10.6084/m9.figshare.11561913.v1
55-90 min: The Way Forward:
1 - Got Use Case?
2 - ESIP Cluster? https://www.esipfed.org/get-involved/collaborate
3 - Who's In?

Session Notes

View Recording:
https://youtu.be/yi05crW6Ya0

Takeaways
  • How to connect data with the uses of that data = Documenting how data matter.
    Federating knowledge bases is a daunting task but possible.
  • Connect research and data to place (but there is a gap around using place identifiers in linked data).
    Discussion of potentially making a new cluster or using an existing one. Decision to recharter/repurpose/rename the data discovery cluster.
  • The sin of computer science is giving people the impression that things are mostly one-to-one relationships, when more accurately life and the universe are full of many-to-many relationships, i.e., graph databases > RDBMS




Speakers

Christopher Lynnes

Systems Architect, NASA/EOSDIS, NASA/GSFC
Christopher Lynnes is currently System Architect for NASA’s Earth Observing System Data and Information System, known as EOSDIS. He has been working on EOSDIS since 1992, over which time he has worked multiple generations of data archive systems, search engines and interfaces, science...

Doug Newman

EED Data Use Architect


Thursday January 9, 2020 10:15am - 11:45am EST
White Flint
  White Flint, Panel

12:00pm EST

Datacubes for Analysis-Ready Data: Standards & State of the Art
This workshop session will follow up on the OGC Coverage Analytics sprint, focusing specifically on advanced services for spatio-temporal datacubes. In the Earth sciences, datacubes are accepted as an enabling paradigm for offering massive spatio-temporal Earth data in analysis-ready form and, more generally, for easing access, extraction, analysis, and fusion. Datacubes also homogenize APIs across dimensions, allowing unified wrangling of 1-D sensor data, 2-D imagery, 3-D x/y/t image timeseries and x/y/z geophysics voxel data, and 4-D x/y/z/t climate and weather data.
Based on the OGC datacube reference implementation, we introduce datacube concepts, the state of standardization, and real-life 2D, 3D, and 4D examples utilizing services from three continents. Ample time will be available for discussion, and Internet-connected participants will be able to replay and modify many of the examples shown. Further, we will relate this work to key datacube activities worldwide, within and beyond the Earth sciences.
Session outcomes could take a number of forms: ideas and issues for OGC, ISO, or ESIP to consider; example use cases; challenges not yet addressed sufficiently, and entirely novel use cases; work and collaboration plans for future ESIP work. Outcomes of the session will be reported at the next OGC TC meeting's Big Data and Coverage sessions. How to Prepare for this Session: Introductory and advanced material is available from http://myogc.org/go/coveragesDWG

Presentations:
https://doi.org/10.6084/m9.figshare.11562552.v1

View Recording: https://youtu.be/82WG7soc5bk

Takeaways
  • The abstract coverage construct defines the base, which can be filled in with a coverage implementation schema. This is important because previously implementations weren't interoperable across different servers and clients.
  • The coordinate system retrieved from sensors reporting in real time has been embedded into their XML schema so that the sensor data can be integrated into the broader system. Data can be delivered not only as GML but also as JSON and RDF, which could be used to link into semantic web technology.
  • The principle is to send an HTTP URL-encoded query to the server and get back results extracted from the datacube, e.g., sourced from many hyperspectral images.
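
The last takeaway can be sketched as follows in Python: a datacube query is written as text, URL-encoded, and attached to an HTTP GET request. The request parameters follow the key-value form of the OGC WCS 2.0 Processing Extension (request=ProcessCoverages with a query parameter). The endpoint, the coverage name, the axis name, and the query text itself are hypothetical and would need to match a real server's offerings; the sketch only builds the URL and does not contact any service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_datacube_url(endpoint, wcps_query):
    """Return a GET URL carrying a URL-encoded datacube query."""
    params = {
        "service": "WCS",
        "version": "2.0.1",
        "request": "ProcessCoverages",
        "query": wcps_query,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint, coverage name, and axis name.
ENDPOINT = "https://datacube.example.org/ows"
QUERY = (
    'for $c in (ExampleTemperatureCube) '
    'return encode($c[time("2019-07-01")], "text/csv")'
)

url = build_datacube_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL recovers the original query text unchanged.
assert parse_qs(urlsplit(url).query)["query"] == [QUERY]
```

Because the whole request is an ordinary URL, it can be replayed from a browser, a notebook, or a command-line HTTP client, which matches the session's plan for participants to replay and modify examples.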

Speakers

Thursday January 9, 2020 12:00pm - 1:30pm EST
White Flint