Open energy system databases
Open energy system database projects employ open data methods to collect, clean, and republish energy-related datasets for open use. The resulting information is then available, given a suitable open license, for statistical analysis and for building numerical energy system models, including open energy system models. Permissive licenses such as Creative Commons CC0 are preferred, but some projects will house data made public under market transparency regulations and carrying unqualified copyright.
The databases themselves may furnish information on national power plant fleets, renewable generation assets, transmission networks, time series for electricity loads, dispatch, spot prices, and cross-border trades, weather information, and similar. They may also offer other energy statistics including fossil fuel imports and exports, gas, oil, and coal prices, emissions certificate prices, and information on energy efficiency costs and benefits.
Much of the data is sourced from official or semi-official agencies, including national statistics offices, transmission system operators, and electricity market operators. Data is also crowdsourced using public wikis and public upload facilities. Projects usually also maintain a strict record of the provenance and version histories of the datasets they hold. Some projects, as part of their mandate, also try to persuade primary data providers to release their data under more liberal licensing conditions.
Two drivers favor the establishment of such databases. The first is a wish to reduce the duplication of effort that accompanies each new analytical project as it assembles and processes the data that it needs from primary sources. The second is an increasing desire to make public policy energy models more transparent in order to improve their acceptance by policymakers and the public. Better transparency dictates the use of open information, able to be accessed and scrutinized by third parties, in addition to releasing the source code for the models in question.
General considerations
Background
In the mid-1990s, energy models used structured text files for data interchange, but efforts were being made to migrate to relational database management systems for data processing. These early efforts, however, remained local to a project and did not involve online publishing or open data principles.
The first energy information portal to go live was OpenEI in late 2009, followed by reegle in 2011.
A 2012 paper marks the first scientific publication to advocate the crowdsourcing of energy data. The 2012 PhD thesis by Chris Davis also discusses the crowdsourcing of energy data in some depth. A 2016 thesis surveys the spatial information requirements for energy planning and finds that most types of data, with the exception of energy expenditure data, are available but nonetheless remain scattered and poorly coordinated.
In terms of open data, a 2017 paper concludes that energy research has lagged behind other fields, most notably physics, biotechnology, and medicine. The paper also lists the benefits of open data and open models and discusses the reasons that many projects nonetheless remain closed. A one-page opinion piece from 2017 advances the case for using open energy data and modeling to build public trust in policy analysis. The article also argues that scientific journals have a responsibility to require that data and code be submitted alongside text for peer review.
Database design
Data models are central to the design and organization of databases. Open energy database projects generally try to develop and adhere to well-resolved data models, using de facto and published standards where applicable. Some projects attempt to coordinate their data models in order to harmonize their data and improve its utility. Defining and maintaining suitable metadata is also a key issue. The life-cycle management of data includes, but is not limited to, the use of version control to track the provenance of incoming and cleansed data. Some sites allow users to comment on and rate individual datasets.
Dataset copyright and database rights
Issues surrounding copyright remain at the forefront with regard to open energy data. As noted, most energy datasets are collated and published by official or semi-official sources. But many of the publicly available energy datasets carry no license, limiting their reuse in numerical and statistical models, open or otherwise. Copyright-protected material cannot lawfully be circulated, nor can it be modified and republished.
Measures to enforce market transparency have not helped much because the associated information is again not licensed to enable modification and republication. Transparency measures include the 2013 European energy market transparency regulation 543/2013. Indeed, 543/2013 "is only an obligation to publish, not an obligation to license". Notwithstanding, 543/2013 does enable downloaded data to be computer processed with legal certainty.
Energy databases with hardware located within the European Union are protected under a general database law, irrespective of the legal status of the information they hold.
Database rights not waived by public sector providers significantly restrict the amount of data a user can lawfully access.
A December 2017 submission by energy researchers in Germany and elsewhere highlighted a number of concerns over the re-use of public sector information within the European Union.
The submission drew heavily on a recent legal opinion covering electricity data.
Energy statistics
National and international energy statistics are published regularly by governments and international agencies, such as the IEA. In 2016 the United Nations issued guidelines for energy statistics. While the definitions and sectoral breakdowns are useful when defining models, the information provided is rarely in sufficient detail to enable its use in high-resolution energy system models.
Published standards
There are few published standards covering the collection and structuring of high-resolution energy system data. The IEC Common Information Model defines data exchange protocols for low and high voltage electricity networks.
Open energy system database projects
Energy system models are data intensive and normally require detailed information from a number of sources. Dedicated projects to collect, collate, document, and republish energy system datasets have arisen to service this need. Most database projects prefer open data, issued under free licenses, but some will accept datasets with proprietary licenses in the absence of other options.
The OpenStreetMap project, which uses the Open Database License, contains geographic information about energy system components, including transmission lines. Wikipedia itself has a growing set of information related to national energy systems, including descriptions of individual power stations.
The following table summarizes projects that specifically publish open energy system data. Some are general repositories while others are designed to interact with open energy system models in real time.
Three of the projects listed work with linked open data (LOD), a method of publishing structured data on the web so that it can be networked and subject to semantic queries. The overarching concept is termed the semantic web. Technically, such projects support RESTful APIs, RDF, and the SPARQL query language. A 2012 paper reviews the use of LOD in the renewable energy domain.
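A semantic query of this kind is typically sent to a SPARQL endpoint over HTTP, with the query text carried in a `query` parameter. The following is a minimal sketch only: the endpoint URL, the `ex:` vocabulary, and the property names are hypothetical and do not describe any of the projects listed here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and vocabulary, used purely for illustration.
ENDPOINT = "https://example.org/sparql"

QUERY = """
PREFIX ex: <https://example.org/energy#>
SELECT ?plant ?capacity
WHERE {
  ?plant a ex:PowerPlant ;
         ex:capacityMW ?capacity .
}
ORDER BY DESC(?capacity)
LIMIT 10
"""


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_sparql_url(ENDPOINT, QUERY)

# Decode the URL again to confirm the query survives the round trip.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

A client would then fetch this URL, usually asking for JSON results through an `Accept: application/sparql-results+json` header.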
Energy Research Data Portal for South Africa
Project | Energy Research Data Portal for South Africa |
Host | University of Cape Town |
Status | active |
Scope/type | countries in Africa |
Data license | Creative Commons license preferred |
Website |
The Energy Research Data Portal for South Africa is being developed by a research centre at the University of Cape Town, Cape Town, South Africa. Coverage includes South Africa and certain other African countries where the centre undertakes projects. The website uses the CKAN open source data portal software. A number of data formats are supported, including CSV and XLSX. The site also offers an API for automated downloads. At the time of writing, the portal contained 65 datasets.
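Because the portal runs on CKAN, automated downloads would normally go through the CKAN Action API. The sketch below builds a `package_search` request and extracts CSV download links from a response. The base URL, dataset name, and sample response are invented for illustration and are not taken from the portal itself.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://portal.example.org"  # hypothetical CKAN site


def search_url(base_url: str, term: str, rows: int = 10) -> str:
    """Build a CKAN Action API URL that searches datasets by keyword."""
    return f"{base_url}/api/3/action/package_search?" + urlencode({"q": term, "rows": rows})


def csv_resources(response_text: str) -> list:
    """Return (dataset name, resource URL) pairs for every CSV resource."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported a failed request")
    found = []
    for dataset in payload["result"]["results"]:
        for resource in dataset.get("resources", []):
            if resource.get("format", "").upper() == "CSV":
                found.append((dataset["name"], resource["url"]))
    return found


# Invented response in the shape CKAN returns for package_search.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{
            "name": "demo-load-data",
            "resources": [
                {"format": "CSV", "url": "https://portal.example.org/files/load.csv"},
                {"format": "XLSX", "url": "https://portal.example.org/files/load.xlsx"},
            ],
        }],
    },
})

query_url = search_url(BASE_URL, "electricity", rows=5)
hits = csv_resources(SAMPLE_RESPONSE)
print(query_url)
print(hits)
```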
energydata.info
Project | energydata.info |
Host | World Bank Group |
Status | active |
Scope/type | includes visualization and analytics |
Code license | app-specific |
Data license | CC BY 4.0 preferred |
Website | |
Repository |
The energydata.info project from the World Bank Group, Washington, DC, USA is an energy database portal designed to support national development by improving public access to energy information. As well as sharing data, the platform also offers tools to visualize and analyze energy data. Although the World Bank Group has made available a number of datasets and apps, external users and organizations are encouraged to contribute. The concepts of open data and open source development are central to the project. energydata.info uses its own fork of the CKAN open source data portal as its web-based platform. The Creative Commons CC BY 4.0 license is preferred for data but other open licenses can be deployed. Users are also bound by the terms of use for the site.
As of 2017, the database held 131 datasets, the great majority related to developing countries. The datasets are tagged and can be easily filtered. A number of download formats, including GIS files, are supported: CSV, XLS, XLSX, ArcGIS, Esri, GeoJSON, KML, and SHP. Some datasets are also offered as HTML. Also as of 2017, four apps are available. Some are web-based and run from a browser.
Enipedia
Project | Enipedia |
Host | Delft University of Technology |
Status | active |
Scope/type | global materials and energy |
Data license | ODbL |
Wiki |
The semantic wiki site and database Enipedia lists energy systems data worldwide. Enipedia is maintained within the Faculty of Technology, Policy and Management, Delft University of Technology, Delft, the Netherlands. A key tenet of Enipedia is that data displayed on the wiki is not trapped within the wiki, but can be extracted via SPARQL queries and used to populate new tools. Any programming environment that can download content from a URL can be used to obtain data. Enipedia went live in March 2011, judging by traffic figures quoted by Davis.
A 2010 study describes how community driven data collection, processing, curation, and sharing is revolutionizing the data needs of industrial ecology and energy system analysis. A 2012 chapter introduces a system of systems engineering perspective and outlines how agent-based models and crowdsourced data can contribute to the solving of global issues.
The site has since gone offline pending a move to a new domain.
OpenEnergy Platform
Project | OpenEnergy Platform |
Host | |
Status | active |
Scope/type | model-oriented |
Data license | dataset-specific |
Website |
The OpenEnergy Platform is a collaborative versioned dataset repository for storing open energy system model datasets. A dataset is presumed to be in the form of a database table, together with metadata. Registered users can upload and download datasets manually using a web interface or programmatically via an API using HTTP POST calls. Uploaded datasets are screened for integrity using deterministic rules and then subject to confirmation by a moderator. The use of versioning means that any prior state of the database can be accessed. The repository is thus specifically designed to interoperate with energy system models. The backend is a PostgreSQL object-relational database under subversion version control. Open source licenses are specific to each dataset. Unlike other database projects, users can download the current version of the entire PostgreSQL database or any previous version. Initial development is being led by the Reiner Lemoine Institute, Berlin, Germany.
Open Data Energy Networks
Project | Open Data Energy Networks |
Host | Réseau de Transport d'Électricité and others |
Status | active |
Scope/type | French energy system |
Data license | Licence Ouverte |
Metadata | French and English |
Website | |
Language | French with English translations |
The Open Data Energy Networks portal is run by eight partners, led by the French national transmission system operator Réseau de Transport d'Électricité. The portal was previously known as Open Data RTE. The site offers electricity system datasets under a Creative Commons compatible license, with metadata, an RSS feed for notifying updates, and an interface for submitting questions. Users of information obtained from the site can also register third-party URLs against specific datasets.
The portal uses the French Government Licence Ouverte license, which is explicitly compatible with the United Kingdom Open Government Licence, the Creative Commons license, and the Open Data Commons license.
The site hosts electricity, gas, and weather information related to France.
Open Power System Data
The Open Power System Data project seeks to characterize the German and western European power plant fleets, their associated transmission network, and related information and to make that data available to energy modelers and analysts. The platform was originally implemented by the University of Flensburg, DIW Berlin, the Technical University of Berlin, and the energy economics consultancy Neon Neue Energieökonomik, all from Germany. The first phase of the project, from August 2015 to July 2017, was funded by the Federal Ministry for Economic Affairs and Energy. The project later received funding for a second phase, from January 2018 to December 2020, with ETH Zurich replacing Flensburg University as a partner.
Developers collate and harmonize data from a range of government, regulatory, and industry sources throughout Europe. The website and the metadata utilize English, whereas the original material can be in any one of 24 languages. Datasets follow the emerging frictionless data package standard being developed by Open Knowledge International. The website was launched on 28 October 2016. The project offers the following primary packages, for Germany and other European countries:
- details, including geolocation, of conventional power plants and renewable energy power plants
- aggregated generation capacity by technology and country
- hourly time series covering electrical load, day-ahead electricity spot prices, and wind and solar resources
- a script to filter and download NASA MERRA-2 satellite weather data
- electricity demand and self-generation time series for representative south German households
- simulated PV and wind generation capacity factor time series for Europe, generated by the Renewables.ninja project
In a 2019 publication, OPSD developers describe their design choices, implementation, and provisioning. Information integrity remains key, with each data package having traceable provenance, curation, and packing. From October 2018, each new or revised data package is assigned a unique DOI to ensure that external references to current and prior versions remain stable.
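In the frictionless data package convention mentioned above, each package ships a `datapackage.json` descriptor that names its resources and their column schemas. The following minimal sketch reads such a descriptor with the standard library. The descriptor content is invented for illustration and is not an actual OPSD file.

```python
import json

# Invented descriptor following the general Data Package layout.
DESCRIPTOR = """
{
  "name": "example-time-series",
  "version": "2019-01-01",
  "resources": [
    {
      "name": "load_hourly",
      "path": "load_hourly.csv",
      "format": "csv",
      "schema": {
        "fields": [
          {"name": "utc_timestamp", "type": "datetime"},
          {"name": "load_mw", "type": "number"}
        ]
      }
    }
  ]
}
"""


def summarize(descriptor_text: str) -> dict:
    """Map each resource name to its list of (column name, column type)."""
    package = json.loads(descriptor_text)
    summary = {}
    for resource in package.get("resources", []):
        fields = resource.get("schema", {}).get("fields", [])
        summary[resource["name"]] = [(f["name"], f["type"]) for f in fields]
    return summary


columns = summarize(DESCRIPTOR)
print(columns)
```

Keeping the schema alongside the data lets a model check column names and types before loading a CSV file.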
A number of published electricity market modeling analyses are based on OPSD data.
In 2017, the Open Power System Data project won the Schleswig-Holstein Open Science Award and the Germany Land of Ideas award.
OpenEI
Project | OpenEI |
Host | National Renewable Energy Laboratory |
Status | active |
Scope/type | US focus |
Data license | |
Website |
Open Energy Information is a collaborative website, run by the US government, providing open energy data to software developers, analysts, users, consumers, and policymakers. The platform is sponsored by the United States Department of Energy and is being developed by the National Renewable Energy Laboratory. OpenEI launched on 9 December 2009. While much of its data is from US government sources, the platform is intended to be open and global in scope.
OpenEI provides two mechanisms for contributing structured information: a semantic wiki for collaboratively-managed resources and a dataset upload facility for contributor-controlled resources. US government data is distributed under a CC0 public domain dedication, whereas other contributors are free to select an open data license of their choice. Users can rate data using a five-star system, based on accessibility, adaptability, usefulness, and general quality. Individual datasets can be manually downloaded in an appropriate format, often as CSV files. Scripts for processing data can also be shared through the site. In order to build a community around the platform, a number of forums are offered covering energy system data and related topics.
Most of the data on OpenEI is exposed as linked open data (LOD). OpenEI also uses LOD methods to populate its definitions throughout the wiki with real-time connections to DBpedia, reegle, and Wikipedia.
OpenEI has been used to classify geothermal resources in the United States and to publicize municipal utility rates, again within the US.
OpenGridMap
Project | OpenGridMap |
Host | Technical University of Munich |
Status | active |
Scope/type | electricity grid data worldwide |
Code license | proprietary copyright |
Data license | Creative Commons license preferred |
Website | |
Web application | URL TBA |
Repository |
OpenGridMap employs crowdsourcing techniques to gather detailed data on electricity network components and then infer a realistic network structure using methods from statistics and graph theory. The scope of the project is worldwide and both distribution and transmission networks can be reverse engineered. The project is managed by the Chair of Business Information Systems, Technical University of Munich, Munich, Germany. The project maintains a website and a Facebook page and provides an Android mobile app to help the public document electrical devices, such as transformers and substations. The bulk of the data is being made available under a Creative Commons license. The processing software is written primarily in Python and MATLAB and is hosted on GitHub.
OpenGridMap provides a tailored GIS web application, layered on OpenStreetMap, which contributors can use to upload and edit information directly. The same database automatically stores field recordings submitted by the mobile app. Subsequent classification by experts allows normal citizens to document and photograph electrical components and have them correctly identified. The project is experimenting with the use of hobby drones to obtain better information on associated facilities, such as photovoltaic installations. Transmission line data is also sourced from and shared with OpenStreetMap. Each component record is verified by a moderator.
Once sufficient data is available, the transnet software is run to produce a likely network, using statistical correlation, Voronoi partitioning, and minimum spanning tree algorithms. The resulting network can be exported in CSV, XML, and CIM formats. CIM models are well suited for translation into software-specific data formats for further analysis, including power grid simulation. Transnet also displays descriptive statistics about the resulting network for visual confirmation.
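To illustrate the minimum spanning tree step named above, the following sketch applies Kruskal's algorithm to a handful of substations. It is not the transnet implementation: the coordinates are invented, straight-line distance stands in for line length, and the statistical correlation and Voronoi partitioning steps are omitted.

```python
from itertools import combinations
from math import dist


def minimum_spanning_tree(points: dict) -> list:
    """Kruskal's algorithm on the complete graph over the named points."""
    parent = {name: name for name in points}

    def find(node):
        # Follow parent links to the set representative (with path halving).
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # All candidate links, shortest first.
    edges = sorted(
        (dist(points[a], points[b]), a, b)
        for a, b in combinations(sorted(points), 2)
    )
    tree = []
    for length, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the link joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, length))
    return tree


# Invented substation coordinates in arbitrary planar units.
SUBSTATIONS = {"A": (0, 0), "B": (2, 0), "C": (2, 3), "D": (7, 3)}

tree = minimum_spanning_tree(SUBSTATIONS)
total_length = sum(length for _, _, length in tree)
for a, b, length in tree:
    print(f"{a} - {b}: {length:.1f}")
```

The result connects every substation with the least total line length, which gives a plausible first guess at topology before further evidence is applied.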
The project is motivated by the need to provide datasets for high-resolution energy system models, so that energy system transitions can be better managed, both technically and policy-wise. The rapid expansion of renewable generation and the anticipated uptake of electric vehicles mean that electricity system models must increasingly represent distribution and transmission networks in some detail.
OpenGridMap techniques have been used to estimate the low voltage network in the German city of Garching and to estimate the high voltage grids in several other countries.
reegle
Project | reegle |
Host | |
Status | active |
Scope/type | clean energy |
Data license | |
Website |
reegle is a clean energy information portal covering renewable energy, energy efficiency, and climate compatible development topics. reegle was launched in 2006 by REEEP and REN21 with funding from the Dutch, German, and UK environment ministries. Originally released as a specialized internet search engine, reegle was relaunched in 2011 as an information portal.
reegle offers and utilizes linked open data (LOD). Sources of data include UN and World Bank databases, as well as dedicated partners around the world. reegle maintains a comprehensive structured glossary of energy and climate compatible development terms to assist with the tagging of datasets. The glossary also facilitates intelligent web searches.
reegle offers country profiles which collate and display energy data on a per-country basis for most of the world. These profiles are kept current automatically using LOD techniques.
Renewables.ninja
Project | Renewables.ninja |
Host | |
Status | active |
Scope/type | worldwide hourly PV and wind |
Code license | BSD-new |
Data license | CC BY-NC 4.0 |
Website | |
Repository |
Renewables.ninja is a website that can calculate the hourly power output from solar photovoltaic installations and wind farms located anywhere in the world. The website is a joint project between ETH Zurich, Zürich, Switzerland and Imperial College London, London, United Kingdom. The website went live during September 2016. The resulting time series are provided under a Creative Commons license and the underlying power plant models are published using a BSD-new license. So far, only the solar model, written in Python, has been released.
The project relies on weather data derived from meteorological reanalysis models and weather satellite images. More specifically, it uses the 2016 MERRA-2 reanalysis dataset from NASA and satellite images from CM-SAF SARAH. For locations in Europe, this weather data is further "corrected" by country so that it better fits with the output from known PV installations and wind farms. Two 2016 papers describe the methods used in detail in relation to Europe. The first covers the calculation of PV power and the second covers the calculation of wind power.
The website displays an interactive world map to aid the selection of a site. Users can then choose a plant type and enter some technical characteristics. Only year 2014 data can be served, due to technical restrictions. The results are automatically plotted and are available for download in hourly CSV format with or without the associated weather information. The site offers an API for programmatic dataset recovery using token-based authorization. Examples deploying cURL and Python are provided.
A number of studies have been undertaken using the power production datasets underpinning the website, with the bulk focusing on energy options for Great Britain.
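Token-based authorization usually means sending a personal token in an HTTP header with every request. The sketch below prepares such a request without sending it. The endpoint path, the parameter names, and the `Token` header scheme are assumptions made for illustration and should be checked against the site's own API documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values: verify against the site's API documentation before use.
API_URL = "https://www.renewables.ninja/api/data/pv"  # assumed endpoint path
API_TOKEN = "YOUR_API_TOKEN"  # placeholder for a personal token


def build_pv_request(lat: float, lon: float, token: str = API_TOKEN) -> Request:
    """Prepare (but do not send) a GET request for one year of PV output."""
    params = {  # hypothetical parameter names
        "lat": lat,
        "lon": lon,
        "date_from": "2014-01-01",
        "date_to": "2014-12-31",
        "capacity": 1.0,
        "format": "csv",
    }
    url = API_URL + "?" + urlencode(params)
    # The token travels in the Authorization header, not in the URL.
    return Request(url, headers={"Authorization": f"Token {token}"})


request = build_pv_request(48.25, 11.65)
print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would send it. Keeping the token out of the URL avoids leaking it into logs and browser histories.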
SMARD
Project | SMARD |
Host | German Federal Network Agency |
Status | active |
Scope/type | German, Austrian, and Luxembourg electricity systems |
Data license | |
Website | |
Language | English and German |
The SMARD site serves electricity market data from Germany, Austria, and Luxembourg and also provides visual information. The electricity market plots and their underlying time series are released under a permissive CC BY 4.0 license. The site itself was launched on 3 July 2017 in German and an English translation followed shortly afterwards. The data portal is mandated under §111d of the German Energy Industry Act, introduced as an amendment on 13 October 2016. Four table formats are offered: CSV, XLS, XML, and PDF. The maximum sampling resolution is. Market data visuals or plots can be downloaded in PDF, SVG, PNG, and JPG formats. Representative output is shown in the thumbnail, in this case mid-winter dispatch over two days for the whole of Germany. The horizontal ordering by generation type is first split into renewable and conventional generation and then based on merit.
Further information
- maintained by the Open Energy Modelling Initiative