Open Data API


Welcome to the KNMI Data Platform (KDP) Open Data API. This guide walks you through the functionality of the API, how to use it effectively, and practical examples to get you started. The Open Data API is file-based: each dataset consists of files, and the API lets you list and query the files in a dataset and create a download link for any of them. The available datasets can be found on the KNMI Data Platform. All endpoints are protected and require a valid API key, passed in the Authorization header of the HTTP request.
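
For example, a minimal authenticated request with Python's requests library could look like this (a sketch only; the dataset name and version are taken from the examples further down this page):

import requests

api_key = "<API_KEY>"
base_url = "https://api.dataplatform.knmi.nl/open-data/v1"

# Every call carries the API key in the Authorization header
response = requests.get(
    f"{base_url}/datasets/Actuele10mindataKNMIstations/versions/2/files",
    headers={"Authorization": api_key},
)
print(response.status_code)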


Obtaining an API token

Requesting an API token

To make API requests, you need an API key for the Open Data API, which you can request in the API Catalog. You will first need to register for an account by clicking the “Register” button in the top right corner of the page. Once registered, you can request an API key by clicking the appropriate “Request API Key” button on the API Catalog page. You will receive an email with your API key.

The table below lists the rate limits and quotas for the API keys.

Registered

| Access | Rate Limit | Quota |
| --- | --- | --- |
| Open Data API | 200 requests per second | 1000 requests per hour |


Anonymous key

Anonymous keys provide unregistered access to Open Data. To ensure fair usage and to control the operational costs of KDP, we limit the number of API calls per period. The table below lists the rate limits and quotas for the anonymous keys. Note that you share these limits with all other active users of this anonymous API key.

Each anonymous key has a limited lifespan, as indicated by its expiry date. We update this page when a key is about to expire.

Anonymous

| API Key | Access | Rate Limit | Quota |
| --- | --- | --- | --- |
| eyJvcmciOiI1ZTU1NGUxOTI3NGE5NjAwMDEyYTNlYjEiLCJpZCI6ImE1OGI5NGZmMDY5NDRhZDNhZjFkMDBmNDBmNTQyNjBkIiwiaCI6Im11cm11cjEyOCJ9 (available till 1 July 2024) | Open Data | 50 requests per minute | 3000 requests per hour |
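
When a key exceeds its rate limit or quota, requests are rejected until the period rolls over. A common client-side pattern is to back off and retry. The sketch below assumes rate-limited calls are answered with HTTP status 429, the conventional status code for rate limiting; this page does not spell out the exact behaviour:

import time

import requests


def get_with_backoff(url: str, api_key: str, max_retries: int = 5) -> requests.Response:
    """GET a URL, backing off exponentially while the rate limit is hit."""
    for attempt in range(max_retries):
        response = requests.get(url, headers={"Authorization": api_key})
        # Assumption: a rejected, rate-limited call returns HTTP 429
        if response.status_code != 429:
            return response
        time.sleep(2 ** attempt)  # wait 1, 2, 4, 8, ... seconds
    raise RuntimeError(f"Still rate limited after {max_retries} attempts: {url}")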


Requesting a bulk-key

Downloading a complete dataset can take quite some time due to the rate limits and quotas associated with regular API keys. We therefore recommend contacting us so that we can request a dedicated bulk-download API key for your account.

Specify the following information in your request:

Subject: KDP Complete Dataset Download.
Dataset name: <dataset name>
<dataset name> is the name of the dataset that you want to download. For example: EV24/2, where EV24 is the dataset name, and 2 is the dataset version.

A bulk-download API key comes with a rate limit and quota tailored to the dataset you want to download. This makes it faster and easier to download a complete dataset.

You must use the same email address as the one used to register for our Developer Portal. We use the sender’s email address to request a new bulk-download API key for the Tyk Developer that is registered with that email.

If we approve your dataset download request, you will receive a bulk-download API key for the requested dataset. In addition, we send an email on behalf of opendata@knmi.nl confirming that your key request was approved. We send the key to the email address you used to register your Tyk Developer account. If you can’t find the key in your inbox, make sure it has not been marked as spam.

Because this new API key belongs to your registered account, you must be a registered Tyk Developer before we can request an API key for you.

Below is an example Python 3 script that shows how to download a complete dataset efficiently.

How to use the Open Data API

Choose a Dataset

The first step is to choose a dataset that suits your needs. You can find all available datasets on the KNMI Data Platform. Each dataset has a unique name and version number, which are listed in the metadata table under Dataset name and Dataset version. The dataset name and version are used in API calls to identify the dataset you want to query. A simple way to get the correct query URL is to look in the Access tab of the dataset.

Make API Calls

To authenticate your API calls, add your API key to the Authorization header of the HTTP request.

  • To list files within a dataset, construct an API call using the endpoint:
    https://api.dataplatform.knmi.nl/open-data/v1/datasets/{datasetName}/versions/{versionId}/files
  • To obtain a download URL for a specific file, use this endpoint:
    https://api.dataplatform.knmi.nl/open-data/v1/datasets/{datasetName}/versions/{versionId}/files/{filename}/url
The full documentation of these API endpoints can be found on the Technical Documentation (Swagger) page.
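
Together, these two endpoints form the basic download flow: list (or construct) a filename, then exchange it for a temporary download URL. A minimal sketch, using the temporaryDownloadUrl response field that the full examples below also rely on:

import requests

api_key = "<API_KEY>"
base_url = "https://api.dataplatform.knmi.nl/open-data/v1"
dataset_name = "Actuele10mindataKNMIstations"
dataset_version = "2"
headers = {"Authorization": api_key}

# 1. List the files in the dataset and pick the first one
files_endpoint = f"{base_url}/datasets/{dataset_name}/versions/{dataset_version}/files"
filename = requests.get(files_endpoint, headers=headers).json()["files"][0]["filename"]

# 2. Exchange the filename for a temporary download URL
url_endpoint = f"{files_endpoint}/{filename}/url"
download_url = requests.get(url_endpoint, headers=headers).json()["temporaryDownloadUrl"]
print(download_url)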

Deprecation

Similar to any software, features undergo a natural evolution over time. The X-KNMI-Deprecation key in the header of the response announces the end of a feature, an old API version, or a deprecated dataset. This key is only present when a deprecation applies. Our deprecation policy explains this in further detail. An example script that shows how to use this deprecation header can be found here.
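
Checking for the header takes only a few lines; a minimal sketch of the same pattern used in the deprecation example script further down this page:

import logging

import requests

logger = logging.getLogger(__name__)


def warn_if_deprecated(response: requests.Response) -> None:
    """Log a warning when a response carries the X-KNMI-Deprecation header."""
    if "X-KNMI-Deprecation" in response.headers:
        logger.warning(f"Deprecation message: {response.headers['X-KNMI-Deprecation']}")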

Curl examples

List the first 20 files in the Actuele10mindataKNMIstations dataset:

curl --location --request GET \
    "https://api.dataplatform.knmi.nl/open-data/v1/datasets/Actuele10mindataKNMIstations/versions/2/files" \
    --header "Authorization: <API_KEY>"

List the first 15 files in the Actuele10mindataKNMIstations dataset:

curl --location --request GET -G \
    "https://api.dataplatform.knmi.nl/open-data/v1/datasets/Actuele10mindataKNMIstations/versions/2/files" \
    -d maxKeys=15 \
    -d sorting=asc \
    --header "Authorization: <API_KEY>"

The default sort direction is ascending.

List the last 10 files in the Actuele10mindataKNMIstations dataset:

curl --location --request GET -G \
    "https://api.dataplatform.knmi.nl/open-data/v1/datasets/Actuele10mindataKNMIstations/versions/2/files" \
    -d sorting=desc \
    --header "Authorization: <API_KEY>"

List the last 10 files, ordered by last modified date, in the Actuele10mindataKNMIstations dataset:

curl --location --request GET -G \
    "https://api.dataplatform.knmi.nl/open-data/v1/datasets/Actuele10mindataKNMIstations/versions/2/files" \
    -d sorting=desc \
    -d orderBy=lastModified \
    --header "Authorization: <API_KEY>"

By default the list is ordered by filename. The other allowed values for orderBy are lastModified and created.

List the first 15 files ordered alphabetically after a specific filename in the Actuele10mindataKNMIstations dataset:

curl --location --request GET -G \
    "https://api.dataplatform.knmi.nl/open-data/v1/datasets/Actuele10mindataKNMIstations/versions/2/files" \
    -d maxKeys=15 \
    -d orderBy=filename \
    -d begin=KMDS__OPER_P___10M_OBS_L2_202007162330.nc \
    --header "Authorization: <API_KEY>"

Python example: Listing the last file of a dataset and retrieving it

import logging
import os
import sys

import requests

logging.basicConfig()
logger = logging.getLogger(__name__)
logger.setLevel(os.environ.get("LOG_LEVEL", logging.INFO))


class OpenDataAPI:
    def __init__(self, api_token: str):
        self.base_url = "https://api.dataplatform.knmi.nl/open-data/v1"
        self.headers = {"Authorization": api_token}

    def __get_data(self, url, params=None):
        return requests.get(url, headers=self.headers, params=params).json()

    def list_files(self, dataset_name: str, dataset_version: str, params: dict):
        return self.__get_data(
            f"{self.base_url}/datasets/{dataset_name}/versions/{dataset_version}/files",
            params=params,
        )

    def get_file_url(self, dataset_name: str, dataset_version: str, file_name: str):
        return self.__get_data(
            f"{self.base_url}/datasets/{dataset_name}/versions/{dataset_version}/files/{file_name}/url"
        )


def download_file_from_temporary_download_url(download_url, filename):
    try:
        with requests.get(download_url, stream=True) as r:
            r.raise_for_status()
            with open(filename, "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
    except Exception:
        logger.exception("Unable to download file using download URL")
        sys.exit(1)

    logger.info(f"Successfully downloaded dataset file to {filename}")


def main():
    api_key = "<API_KEY>"
    dataset_name = "Actuele10mindataKNMIstations"
    dataset_version = "2"
    logger.info(f"Fetching latest file of {dataset_name} version {dataset_version}")

    api = OpenDataAPI(api_token=api_key)

    # sort the files in descending order and only retrieve the first file
    params = {"maxKeys": 1, "orderBy": "created", "sorting": "desc"}
    response = api.list_files(dataset_name, dataset_version, params)
    if "error" in response:
        logger.error(f"Unable to retrieve list of files: {response['error']}")
        sys.exit(1)

    latest_file = response["files"][0].get("filename")
    logger.info(f"Latest file is: {latest_file}")

    # fetch the download url and download the file
    response = api.get_file_url(dataset_name, dataset_version, latest_file)
    download_file_from_temporary_download_url(response["temporaryDownloadUrl"], latest_file)


if __name__ == "__main__":
    main()

Python example: Listing the first 10 files of today and retrieving the first one

import logging
import sys
from datetime import datetime
from datetime import timezone

import requests

logging.basicConfig()
logger = logging.getLogger(__name__)
logger.setLevel("INFO")


class OpenDataAPI:
    def __init__(self, api_token: str):
        self.base_url = "https://api.dataplatform.knmi.nl/open-data/v1"
        self.headers = {"Authorization": api_token}

    def __get_data(self, url, params=None):
        return requests.get(url, headers=self.headers, params=params).json()

    def list_files(self, dataset_name: str, dataset_version: str, params: dict):
        return self.__get_data(
            f"{self.base_url}/datasets/{dataset_name}/versions/{dataset_version}/files",
            params=params,
        )

    def get_file_url(self, dataset_name: str, dataset_version: str, file_name: str):
        return self.__get_data(
            f"{self.base_url}/datasets/{dataset_name}/versions/{dataset_version}/files/{file_name}/url"
        )


def download_file_from_temporary_download_url(download_url, filename):
    try:
        with requests.get(download_url, stream=True) as r:
            r.raise_for_status()
            with open(filename, "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
    except Exception:
        logger.exception("Unable to download file using download URL")
        sys.exit(1)

    logger.info(f"Successfully downloaded dataset file to {filename}")


def main():
    api_key = "<API_KEY>"
    dataset_name = "Actuele10mindataKNMIstations"
    dataset_version = "2"

    api = OpenDataAPI(api_token=api_key)

    # Midnight (00:00 UTC) today in ISO 8601 format; date.strftime renders the time fields as zeros
    timestamp = datetime.now(timezone.utc).date().strftime("%Y-%m-%dT%H:%M:%S+00:00")
    logger.info(f"Fetching first file of {dataset_name} version {dataset_version} on {timestamp}")

    # order the files by creation date and begin listing after the specified timestamp
    params = {"orderBy": "created", "begin": timestamp}
    response = api.list_files(dataset_name, dataset_version, params)
    if "error" in response:
        logger.error(f"Unable to retrieve list of files: {response['error']}")
        sys.exit(1)

    file_name = response["files"][0].get("filename")
    logger.info(f"First file of {timestamp} is: {file_name}")

    # fetch the download url and download the file
    response = api.get_file_url(dataset_name, dataset_version, file_name)
    download_file_from_temporary_download_url(response["temporaryDownloadUrl"], file_name)


if __name__ == "__main__":
    main()

Python example: Retrieving the file from one hour ago and logging deprecation

import logging
import os
import sys
from datetime import datetime
from datetime import timedelta
from datetime import timezone

import requests

logging.basicConfig()
logger = logging.getLogger(__name__)
logger.setLevel(os.environ.get("LOG_LEVEL", logging.INFO))

api_url = "https://api.dataplatform.knmi.nl/open-data"
api_version = "v1"


def main():
    # Parameters
    api_key = "<API_KEY>"
    dataset_name = "Actuele10mindataKNMIstations"
    dataset_version = "2"

    # Use get file to retrieve a file from one hour ago.
    # Filename format for this dataset equals KMDS__OPER_P___10M_OBS_L2_YYYYMMDDHHMM.nc,
    # where the minutes are increased in steps of 10.
    timestamp_now = datetime.now(timezone.utc)
    timestamp_one_hour_ago = timestamp_now - timedelta(hours=1) - timedelta(minutes=timestamp_now.minute % 10)
    filename = f"KMDS__OPER_P___10M_OBS_L2_{timestamp_one_hour_ago.strftime('%Y%m%d%H%M')}.nc"

    logger.debug(f"Current time: {timestamp_now}")
    logger.debug(f"One hour ago: {timestamp_one_hour_ago}")
    logger.debug(f"Dataset file to download: {filename}")

    endpoint = f"{api_url}/{api_version}/datasets/{dataset_name}/versions/{dataset_version}/files/{filename}/url"
    get_file_response = requests.get(endpoint, headers={"Authorization": api_key})

    if get_file_response.status_code != 200:
        logger.error("Unable to retrieve download url for file")
        logger.error(get_file_response.text)
        sys.exit(1)

    logger.info(f"Successfully retrieved temporary download URL for dataset file {filename}")

    download_url = get_file_response.json().get("temporaryDownloadUrl")
    # Check logging for deprecation
    if "X-KNMI-Deprecation" in get_file_response.headers:
        deprecation_message = get_file_response.headers.get("X-KNMI-Deprecation")
        logger.warning(f"Deprecation message: {deprecation_message}")

    download_file_from_temporary_download_url(download_url, filename)


def download_file_from_temporary_download_url(download_url, filename):
    try:
        with requests.get(download_url, stream=True) as r:
            r.raise_for_status()
            with open(filename, "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
    except Exception:
        logger.exception("Unable to download file using download URL")
        sys.exit(1)

    logger.info(f"Successfully downloaded dataset file to {filename}")


if __name__ == "__main__":
    main()

Python example: Download a full dataset

To download a whole dataset you need to request a bulk API key, as described in the Obtaining an API token tab. Once you have obtained a dedicated API key to download a complete dataset, you are ready to download the corresponding dataset files. To retrieve these files efficiently, we provide the example script below. It downloads the complete EV24/2 dataset, but the structure of the script is the same regardless of the dataset you want to download. Make sure to change download_directory to an existing empty directory.

import asyncio
import logging
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any
from typing import Dict
from typing import List
from typing import Tuple

import requests
from requests import Session

logging.basicConfig()
logger = logging.getLogger(__name__)
logger.setLevel(os.environ.get("LOG_LEVEL", logging.INFO))


def download_dataset_file(
    session: Session,
    base_url: str,
    dataset_name: str,
    dataset_version: str,
    filename: str,
    directory: str,
    overwrite: bool,
) -> Tuple[bool, str]:
    # if a file from this dataset already exists, skip downloading it.
    file_path = Path(directory, filename).resolve()
    if not overwrite and file_path.exists():
        logger.info(f"Dataset file '{filename}' was already downloaded.")
        return True, filename

    endpoint = f"{base_url}/datasets/{dataset_name}/versions/{dataset_version}/files/{filename}/url"
    get_file_response = session.get(endpoint)

    # retrieve download URL for dataset file
    if get_file_response.status_code != 200:
        logger.warning(f"Unable to get file: {filename}")
        logger.warning(get_file_response.content)
        return False, filename

    # use the download URL to GET the dataset file. We don't need to set the 'Authorization'
    # header; the presigned download URL already has permission to GET the file contents
    download_url = get_file_response.json().get("temporaryDownloadUrl")
    return download_file_from_temporary_download_url(download_url, directory, filename)


def download_file_from_temporary_download_url(download_url, directory, filename):
    try:
        with requests.get(download_url, stream=True) as r:
            r.raise_for_status()
            with open(f"{directory}/{filename}", "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
    except Exception:
        logger.exception("Unable to download file using download URL")
        return False, filename

    logger.info(f"Downloaded dataset file '{filename}'")
    return True, filename


def list_dataset_files(
    session: Session,
    base_url: str,
    dataset_name: str,
    dataset_version: str,
    params: Dict[str, str],
) -> Tuple[List[str], Dict[str, Any]]:
    logger.info(f"Retrieve dataset files with query params: {params}")

    list_files_endpoint = f"{base_url}/datasets/{dataset_name}/versions/{dataset_version}/files"
    list_files_response = session.get(list_files_endpoint, params=params)

    if list_files_response.status_code != 200:
        raise Exception("Unable to list initial dataset files")

    try:
        list_files_response_json = list_files_response.json()
        dataset_files = list_files_response_json.get("files")
        dataset_filenames = list(map(lambda x: x.get("filename"), dataset_files))
        return dataset_filenames, list_files_response_json
    except Exception:
        logger.exception("Unable to parse list of dataset files")
        raise


def get_max_worker_count(filesizes):
    size_for_threading = 10_000_000  # 10 MB
    average = sum(filesizes) / len(filesizes)
    # to prevent downloading multiple half files in case of a network failure with big files
    if average > size_for_threading:
        threads = 1
    else:
        threads = 10
    return threads


async def main():
    api_key = "<API_KEY>"
    dataset_name = "EV24"
    dataset_version = "2"
    base_url = "https://api.dataplatform.knmi.nl/open-data/v1"
    # When set to True, existing files with the same name are re-downloaded and overwritten.
    # To prevent unnecessary bandwidth usage, leave it set to False.
    overwrite = False

    download_directory = "./dataset-download"

    # Make sure to send the API key with every HTTP request
    session = requests.Session()
    session.headers.update({"Authorization": api_key})

    # Verify that the download directory exists (is_dir() is False for non-existing paths)
    if not Path(download_directory).is_dir():
        raise Exception(f"Invalid or non-existing directory: {download_directory}")

    filenames = []
    max_keys = 500
    next_page_token = None
    file_sizes = []
    # Use the API to get a list of all dataset filenames
    while True:
        # Retrieve dataset files after given filename
        dataset_filenames, response_json = list_dataset_files(
            session,
            base_url,
            dataset_name,
            dataset_version,
            {"maxKeys": f"{max_keys}", "nextPageToken": next_page_token},
        )
        file_sizes.extend(file["size"] for file in response_json.get("files"))
        # Store filenames
        filenames += dataset_filenames

        # If the result is not truncated, we retrieved all filenames
        next_page_token = response_json.get("nextPageToken")
        if not next_page_token:
            logger.info("Retrieved names of all dataset files")
            break

    logger.info(f"Number of files to download: {len(filenames)}")

    worker_count = get_max_worker_count(file_sizes)
    loop = asyncio.get_running_loop()

    # Allow up to worker_count separate threads to download dataset files concurrently
    executor = ThreadPoolExecutor(max_workers=worker_count)
    futures = []

    # Create tasks that download the dataset files
    for dataset_filename in filenames:
        # Create future for dataset file
        future = loop.run_in_executor(
            executor,
            download_dataset_file,
            session,
            base_url,
            dataset_name,
            dataset_version,
            dataset_filename,
            download_directory,
            overwrite,
        )
        futures.append(future)

    # Wait for all tasks to complete and gather the results
    future_results = await asyncio.gather(*futures)
    logger.info(f"Finished '{dataset_name}' dataset download")

    failed_downloads = list(filter(lambda x: not x[0], future_results))

    if len(failed_downloads) > 0:
        logger.warning("Failed to download the following dataset files:")
        logger.warning(list(map(lambda x: x[1], failed_downloads)))


if __name__ == "__main__":
    asyncio.run(main())
