From 84ff718ed2f04b79a2ba0420978888e8395598df Mon Sep 17 00:00:00 2001 From: Roelof Rietbroek Date: Tue, 15 Jun 2021 11:55:18 +0200 Subject: [PATCH] Added Challenge I and .gitignore --- .gitignore | 1 + Challenge1_HookingupWithDias.ipynb | 106 +++++++++++++++++++++++++++++ 2 files changed, 107 insertions(+) create mode 100644 .gitignore create mode 100644 Challenge1_HookingupWithDias.ipynb diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..763513e --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +.ipynb_checkpoints diff --git a/Challenge1_HookingupWithDias.ipynb b/Challenge1_HookingupWithDias.ipynb new file mode 100644 index 0000000..7d62632 --- /dev/null +++ b/Challenge1_HookingupWithDias.ipynb @@ -0,0 +1,106 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "3b5fe448-682e-4168-9f0f-ddd4e147b59d", + "metadata": {}, + "source": [ + "# Challenge I: Search and process data from DIAS (Copernicus Data and Information Access Services)\n", + "*WRS jupyter hackathon, 30th of June 2021*, R. Rietbroek\n", + "\n", + "The overall goal of this challenge is threefold:\n", + "1. Learn how to establish a connection with a DIAS server and execute a search request\n", + "2. Download a dataset\n", + "3. Apply a post-processing operation on such a dataset\n", + "\n", + "DIAS currently consists of 5 different data providers, which hosts a variety of data and provide access to it:\n", + "\n", + "\n", + " \n", + "\n", + "\n", + "\n", + " \n", + "\n", + "\n", + "\n", + " \n", + "\n", + "\n", + "\n", + " \n", + "\n", + "\n", + "\n", + " \n", + "\n", + "\n", + "The different providers generally provide a graphical website to search for and access data. However, they also provide scriptable ways which allow automated access (so-called [REST-API's](https://en.wikipedia.org/wiki/Representational_state_transfer) in webserver speak). 
In essence, these APIs allow additional arguments to be passed to the server (appended to the URL as `?parameter=Value1&parameter2=Value2`), which the server interprets and processes before sending back a response. For example, [https://catalogue.onda-dias.eu/opensearch/OpenSearch?instrumentShortName=MSI](https://catalogue.onda-dias.eu/opensearch/OpenSearch?instrumentShortName=MSI) returns a machine-readable XML document containing the search hits for the term `MSI`. \n", + "\n", + "In this challenge, you will use this automated way to search for and access the data, so that these steps can be scripted. \n", + "\n", + "Unfortunately, access to the servers is not standardized, and search and download requests may take several forms depending on the provider. Luckily, there is [progress on a Python module called eodag](https://pypi.org/project/eodag/), which provides a more uniform way of accessing the providers. You're encouraged to make use of that package." + ] + }, + { + "cell_type": "markdown", + "id": "ecc297bd-ce71-4099-8674-be4262138eda", + "metadata": {}, + "source": [ + "## Challenge statement\n", + "Supplement this notebook with functionality which (1) allows search queries on the datasets, (2) downloads an appropriate subset of the data, and (3) creates a simple visualization of the downloaded data\n", + "\n", + "## Tips and tricks\n", + "* Try to find interesting datasets on the graphical web interfaces of the servers, and see if those can also be found in a programmatic way.\n", + "* Start with 'light' datasets in order to avoid repeated downloading of large files. Check whether *subsetting* is possible (downloading only parts of the dataset)\n", + "* There are many Python code snippets in the crib's folder `public/resources/Python-Data-Science-Handbook` which may be of use\n", + "* From a security standpoint: Don't hardcode usernames and passwords in your Jupyter notebook. 
For example, you can prompt the user for input (see below) or read the credentials from separate files that hold the confidential information" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "34d416dd-a9bc-48e1-8cec-8d443a3373a0", + "metadata": {}, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Please enter username roelof\n", + "Please enter password ·····\n" + ] + } + ], + "source": [ + "# Example: prompt the user for sensitive information\n", + "from getpass import getpass\n", + "credentials={}\n", + "credentials[\"user\"]=input(\"Please enter username\")\n", + "credentials[\"pass\"]=getpass(\"Please enter password\")\n", + "# Note that this information is 'volatile': when the notebook shuts down, the values in credentials are lost; they are never stored in the notebook" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}
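The query-string mechanism described in the notebook's introduction (`?parameter=Value1&parameter2=Value2`) could be sketched in Python roughly as follows. This is a minimal sketch using only the standard library, not part of the patch itself. The endpoint and the `instrumentShortName=MSI` parameter are taken from the example link in the notebook; the helper name `build_query_url` is hypothetical. Actually sending the request is left out so that the sketch runs offline.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the example link in the notebook text.
ONDA_OPENSEARCH = "https://catalogue.onda-dias.eu/opensearch/OpenSearch"


def build_query_url(endpoint, **params):
    """Hypothetical helper: append params to endpoint as a URL-encoded query string."""
    return f"{endpoint}?{urlencode(params)}"


# Reproduce the example search for the term MSI.
url = build_query_url(ONDA_OPENSEARCH, instrumentShortName="MSI")
print(url)

# Round trip: split the query string back into parameters,
# similar to what a server does before processing the request.
parsed = parse_qs(urlsplit(url).query)
print(parsed)
```

Building the URL with `urlencode` rather than by string concatenation takes care of escaping spaces and special characters in parameter values. Which parameters a given provider accepts is provider-specific and is not covered by this sketch.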