Introduction

In a current project I had to retrieve specific data for certain areas (counties and cities), for example green areas such as parks and forests. We run some calculations on that data, but that is not the focus here. The more interesting part is how we get the data. We went through several iterations to find the approach that worked best for us, and I would like to show it here, using Golang and some Python as an example. We split the project into two parts.

  • Part 1: Fetching the metadata (Polygon) – Python
  • Part 2: Querying the OpenStreetMap (OSM) data – Golang

We will retrieve the metadata using a small Python script. The exciting part will be written in Golang. The repository, including the final code, is here: https://github.com/AICDEV/osm_poly_harvester

Metadata

In the following we will work with the city of Essen (Germany, Ruhr area). It doesn’t always have to be Berlin! The first step is to get the polygon of Essen. This is where Nominatim comes into play.

Nominatim

“Nominatim (from the Latin, ‘by name’) is a tool to search OSM data by name and address (geocoding) and to generate synthetic addresses of OSM points (reverse geocoding). It can be found at nominatim.openstreetmap.org.”

https://wiki.openstreetmap.org/wiki/Nominatim

The example for Essen can be found here: https://nominatim.openstreetmap.org/ui/details.html?osmtype=R&osmid=62713&class=boundary

We will not work with the web interface; instead we will use the Nominatim API. You can find the complete API documentation here: https://nominatim.org/release-docs/develop/api/Search/

The record for Essen

Here is a short Postman example that shows how the API response is structured and what the polygon looks like:

You can give it a try by sending a GET request to “https://nominatim.openstreetmap.org/search?country=germany&county=essen&format=geojson&polygon_geojson=1”.
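If you prefer to build that URL in code rather than typing it by hand, here is a minimal sketch using only the Python standard library. The function name build_search_url is my own choice for illustration; the parameters are the ones from the URL above.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(county, country="germany"):
    """Return the Nominatim search URL for a county, asking for GeoJSON output."""
    params = {
        "country": country,
        "county": county,
        "format": "geojson",
        # ask Nominatim to include the boundary polygon in the response
        "polygon_geojson": 1,
    }
    return NOMINATIM_SEARCH + "?" + urlencode(params)


url = build_search_url("essen")
print(url)
```

Using urlencode instead of string concatenation also takes care of escaping names that contain spaces or umlauts.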

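The first thing to notice in the representation of the polygons is that we are dealing with a GeoJSON structure: the response is a FeatureCollection whose features each carry properties, a bbox and a geometry. Note that GeoJSON stores every position as [longitude, latitude], in that order. Now that we know what the API response looks like, we can define our protocol buffer.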
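To get a feeling for the nesting of a GeoJSON geometry, here is a small sketch that normalizes Polygon and MultiPolygon geometries into the same shape. The sample feature is hand-made and heavily shortened (only the osm_id is taken from the real record), and the function name polygons_from_geometry is hypothetical.

```python
def polygons_from_geometry(geometry):
    """Return a list of polygons; each polygon is a list of rings of (lat, lon) tuples."""
    if geometry["type"] == "Polygon":
        # a Polygon is a list of rings, so wrap it to look like a MultiPolygon
        raw_polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        # a MultiPolygon is a list of Polygons
        raw_polygons = geometry["coordinates"]
    else:
        return []
    # GeoJSON positions are [lon, lat]; swap them to (lat, lon)
    return [
        [[(pos[1], pos[0]) for pos in ring] for ring in polygon]
        for polygon in raw_polygons
    ]


# hand-made, shortened sample feature (not a real API response)
sample_feature = {
    "type": "Feature",
    "properties": {"osm_id": 62713, "osm_type": "relation"},
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [[6.89, 51.47], [6.90, 51.47], [6.90, 51.48], [6.89, 51.47]],
        ],
    },
}

for polygon in polygons_from_geometry(sample_feature["geometry"]):
    outer_ring = polygon[0]
    print(len(outer_ring), "points, first:", outer_ring[0])
```

The first ring of a polygon is its outer boundary; any further rings describe holes.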

The Protocol Buffer

Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.

https://developers.google.com/protocol-buffers

If you haven’t worked with Protocol Buffers before, it’s best to check out Google’s demo repository. Otherwise you can just work with JSON in your code. Here is the language guide: https://developers.google.com/protocol-buffers/docs/overview.

Before we start, let’s have a look at our project structure:

Inside the proto folder we define our protocol buffer. The first draft to test looks like this:

syntax = "proto3";

option go_package = "github.com/aicdev/osm_poly_harvester/osm";

package osm;

// Nominatim mirrors one GeoJSON feature from the Nominatim response
message Nominatim {
    string type = 1;
    NominatimProperties properties = 2;
    NominatimBbox bbox = 3;
    // one entry per polygon; a MultiPolygon yields several entries
    repeated NominatimGeometry geometry = 4;
}

message NominatimProperties {
    int64 placeId = 1;
    // OSM ids can exceed the int32 range, so int64 is the safer choice
    int64 osmId = 2;
    string displayName = 3;
    int32 placeRank = 4;
    string category = 5;
    string type = 6;
    string osmType = 7;
}

// bounding box in GeoJSON order: min lon, min lat, max lon, max lat
message NominatimBbox {
    repeated float entry = 1;
}

message NominatimGeometry {
    string type = 1;
    repeated NominatimCoordinates coordinates = 2;
}

// float is a 32-bit type; switch to double if you need full coordinate precision
message NominatimCoordinates {
    float lat = 1;
    float lon = 2;
}

To make things a little bit more clear, here is an image that shows the mapping:
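The mapping can also be written down in a few lines of Python: the message uses camelCase field names, while the API response uses snake_case keys. The helper names snake_to_camel and map_properties are made up for this illustration and are not part of the project.

```python
# keys of the GeoJSON "properties" object that the message defines
PROPERTY_KEYS = [
    "place_id", "osm_id", "display_name", "place_rank",
    "category", "type", "osm_type",
]


def snake_to_camel(name):
    """Convert an API key such as 'place_id' to a field name such as 'placeId'."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


def map_properties(properties):
    """Keep only the keys the message knows and rename them to the field names."""
    return {
        snake_to_camel(key): properties[key]
        for key in PROPERTY_KEYS
        if key in properties
    }


print(map_properties({"place_id": 258262283, "osm_id": 62713, "importance": 0.7}))
```

Keys that the message does not define (importance in this example) are simply dropped.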

Now that we have an initial definition, we still need to compile the Protocol Buffer to Python. For this we prepare a small shell script and install the necessary Python dependencies, as described in the next section.

Python and Protocol Buffer

First, let’s look at what dependencies we need. Basically, you should have the Protocol Buffer Compiler from Google installed. A tutorial for this can be found here: https://grpc.io/docs/protoc-installation/

I have also created a virtual environment for Python, just to keep my dependencies separate. You can find a tutorial here: https://docs.python.org/3/library/venv.html

We need the following dependencies to compile the Protocol Buffer to Python (requests is for later use).

You can install these dependencies by running:

# only when you work with venv
source ./env/bin/activate
./env/bin/pip install -r requirements.txt

# without venv
pip install -r requirements.txt

The next step is to define the build script (build_python_proto_buffer.sh):

python -m grpc_tools.protoc -I ../proto --python_out=. --grpc_python_out=. osm.proto

Now it’s time to run the script and watch the output. I recommend looking at the generated files yourself; the image here is only a short preview:

Inside the nominatim folder we should now have the following files and structure:

In the next section we’ll fetch data from Nominatim, convert it into the Protocol Buffer and store it on disk.

Python and Nominatim

Now that we have finished our protocol buffer definition and compiled it to Python, we can query and convert data from Nominatim.

For this we create a “nominatim.py” file. My project folder inside the nominatim folder now looks like this:

Our challenge is to create the “nominatim.pbf” (Protocol Buffer) file. Here is my python script to do that.

import requests
import osm_pb2

def fetch_osm(city):
    # to test a multipolygon you can fetch Hamburg as an example. Just use this request instead
    # res = requests.get('https://nominatim.openstreetmap.org/search', params = {
    #     'state': 'Hamburg',
    #     'country': 'germany',
    #     'format': 'geojson',
    #     'polygon_geojson': 1
    # })

    res = requests.get('https://nominatim.openstreetmap.org/search', params = {
        'county': city,
        'country': 'germany',
        'format': 'geojson',
        'polygon_geojson': 1
    }, headers = {
        # Nominatim's usage policy asks every client to identify itself
        'User-Agent': 'osm_poly_harvester'
    }, timeout = 30)

    # stop on HTTP errors instead of trying to parse an error page as JSON
    res.raise_for_status()
    return res.json()


def res_geometry_to_pbf(geometry):
    # returns a list with one NominatimGeometry per polygon;
    # a city like Hamburg is a MultiPolygon and yields several entries
    geom_arr = []

    if geometry['type'] == 'Polygon':

        n_geometry = osm_pb2.NominatimGeometry()
        n_geometry.type = geometry['type']

        # a Polygon is a list of rings, each ring is a list of [lon, lat] pairs
        for geom in geometry['coordinates']:
            for sub_geom in geom:
                coordinate = osm_pb2.NominatimCoordinates()
                coordinate.lat = sub_geom[1]
                coordinate.lon = sub_geom[0]
                n_geometry.coordinates.extend([coordinate])

        geom_arr.append(n_geometry)

        return geom_arr

    elif geometry['type'] == 'MultiPolygon':

        for geom in geometry['coordinates']:
            # create a fresh message for every polygon; reusing one message
            # would append the same, ever growing geometry several times
            n_geometry = osm_pb2.NominatimGeometry()
            n_geometry.type = geometry['type']

            for sub_geom in geom:
                for entry_geom in sub_geom:
                    coordinate = osm_pb2.NominatimCoordinates()
                    coordinate.lat = entry_geom[1]
                    coordinate.lon = entry_geom[0]
                    n_geometry.coordinates.extend([coordinate])
            geom_arr.append(n_geometry)

        return geom_arr
    else:
        # unsupported geometry type: return an empty list so that the
        # caller can still pass the result to extend()
        return geom_arr

def res_json_to_pbf(res_json):
    n_entry = osm_pb2.Nominatim()

    # nominatim returns an empty feature list if nothing was found
    if not res_json.get('features'):
        raise ValueError('nominatim returned no features')

    # get the first entry from nominatim response
    feature = res_json['features'][0]

    # map nominatim properties
    n_entry.type = feature['type']
    n_entry.properties.placeId = feature['properties']['place_id']
    n_entry.properties.osmId = feature['properties']['osm_id']
    n_entry.properties.displayName = feature['properties']['display_name']
    n_entry.properties.placeRank = feature['properties']['place_rank']
    n_entry.properties.category = feature['properties']['category']
    n_entry.properties.type = feature['properties']['type']
    n_entry.properties.osmType = feature['properties']['osm_type']

    # map nominatim bounding box
    n_entry.bbox.entry.extend(feature['bbox'])

    # map nominatim geometry
    n_entry.geometry.extend(res_geometry_to_pbf(feature['geometry']))
    return n_entry

def pbf_to_disk(pbf):
    # pbf is the serialized message (bytes)
    with open('./nominatim.pbf', 'wb') as pbf_out:
        pbf_out.write(pbf)

if __name__ == '__main__':
    res_json = fetch_osm('essen')
    pbf = res_json_to_pbf(res_json)
    pbf_to_disk(pbf.SerializeToString(deterministic=True))


Google has great documentation about Python and protocol buffers here: https://developers.google.com/protocol-buffers/docs/pythontutorial. Printed as text, the generated pbf looks like this:

type: "Feature"
properties {
  placeId: 258262283
  osmId: 62713
  displayName: "Essen, Nordrhein-Westfalen, Deutschland"
  placeRank: 12
  category: "boundary"
  type: "administrative"
  osmType: "relation"
}
bbox {
  entry: 6.894344329833984
  entry: 51.347572326660156
  entry: 7.137650012969971
  entry: 51.534202575683594
}
geometry {
  type: "Polygon"
  coordinates {
    lat: 51.476253509521484
    lon: 6.894344329833984
  }
  coordinates {
    lat: 51.47611999511719
    lon: 6.8943562507629395
  }
  ....
}
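The long tails of digits in this output come from the float type in our definition, which is a 32-bit number. The following sketch reproduces the effect with the standard library only. The input value 6.8943443 is a hypothetical coordinate with seven decimals, not necessarily what Nominatim delivered.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


lon = 6.8943443  # hypothetical input with seven decimals
stored = to_float32(lon)

print("stored value:  ", stored)
print("rounding error:", abs(stored - lon), "degrees")
```

For many use cases this rounding is harmless, but if you need the coordinates exactly as delivered, declaring the fields as double in the .proto file avoids it.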

So far so good: we have managed the first part. Let’s briefly summarize:

  • We have a Protocol Buffer Definition for Nominatim API responses
  • We have a way to compile Protocol Buffer to Python
  • We can use a Python script to query data from the Nominatim API, convert it to a Protocol Buffer, serialize it and store it on disk

Next, we want to read the Protocol Buffer in our Golang application and query artifacts (forests, meadows, etc.) from OpenStreetMap for that area (polygon).

OpenStreetMap

From now on we will work with the Overpass API, which lets us query OSM data. You can find a good overview here: https://wiki.openstreetmap.org/wiki/Overpass_API. Please note that this is an open-source project. If you want to fetch data from OSM regularly, you can deploy your own instance to avoid causing unnecessary traffic on the public servers. You can find a tutorial here: https://wiki.openstreetmap.org/wiki/Overpass_API/Installation

Golang and Protocol Buffer

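Inside our osm folder we run:

go mod init github.com/aicdev/osm_poly_harvester/osm

to initialize our new Go module. The module path matches the import paths used in the Go code further down. Then, as we did for Python before, we need to install some dependencies to compile the protocol buffer. Please make sure that you have installed the following dependencies (if you run into trouble with Golang and protocol buffers, check: https://grpc.io/docs/languages/go/quickstart/). With current Go versions the generator plugins are installed with go install rather than go get:

go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
go get google.golang.org/protobuf google.golang.org/grpc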

The next step is to create our build script. We create a folder “proto” and the file “build_golang_proto.sh” inside the osm folder and paste the following content into the script:

protoc --go_out=./proto --go_opt=paths=source_relative --go-grpc_out=./proto --go-grpc_opt=paths=source_relative osm.proto -I ../proto -I .

Then run the script and check the output inside the ./proto folder. I recommend looking at the generated code yourself; the image here is only a short preview:

Now that we also have our protocol buffer in Go, we can focus on retrieving the data from OSM. Let’s do that.

Golang and OpenStreetMap

Again, a brief summary of what exactly our goal is:

  • Read the Protocol Buffer from disk
  • Convert the polygon to an OSM query
  • Query and display the artifacts for an area

The first part is very easy and takes only a few lines of code:

package main

import (
	"fmt"
	"log"
	"os"

	"github.com/aicdev/osm_poly_harvester/osm/app"
	"github.com/aicdev/osm_poly_harvester/osm/overpass"
	pb "github.com/aicdev/osm_poly_harvester/osm/proto"
	"google.golang.org/protobuf/proto"
)

func main() {
	app.StartApplication()

	/******************************************************************
	* read the nominatim.pbf from ../nominatim/nominatim.pbf
	* and parse the content to our pb struct
	*******************************************************************/

	// os.ReadFile replaces the deprecated ioutil.ReadFile (Go 1.16+)
	in, err := os.ReadFile("../nominatim/nominatim.pbf")
	if err != nil {
		log.Fatalln("Error reading file:", err)
	}

	nominatim := &pb.Nominatim{}
	if err := proto.Unmarshal(in, nominatim); err != nil {
		log.Fatalln("Failed to parse nominatim:", err)
	}

	log.Printf("successfully deserialized nominatim pbf for: %s", nominatim.GetProperties().GetDisplayName())

	// build the queries for the polygon and send them to the Overpass API
	overpassService := overpass.NewOverpassService(nominatim.GetGeometry())
	overpassService.Init()
	overpassService.FetchOSMData()

	for _, ovr := range overpassService.GetOverpassResponse() {
		fmt.Println(ovr)
	}
}

As you can see, we simply read the generated nominatim.pbf from our nominatim folder (the file we created with the Python script) and import pb “github.com/aicdev/osm_poly_harvester/osm/proto”, which is the compiled Nominatim protocol buffer for Golang. Earlier I mentioned an “OSM query”. But what exactly is this? Quite simply, it is the query language that the Overpass interpreter understands. I must admit that this query language takes some getting used to, but with some searching and experimenting it always works out in the end. There is an online playground: https://overpass-turbo.eu/. Now we want to query the following artifacts from our area:

			"way[\"leisure\"=\"park\"](poly: \"%s\");",
			"way[\"leisure\"=\"forest\"](poly: \"%s\");",
			"way[\"landuse\"=\"meadow\"](poly: \"%s\");",
			"rel[\"leisure\"=\"park\"](poly: \"%s\");",
			"rel[\"leisure\"=\"nature_reserve\"](poly: \"%s\");",
			"rel[\"landuse\"=\"forest\"](poly: \"%s\");",

The poly filter at the end of each line takes the polygon we requested from Nominatim earlier, written as a space-separated list of latitude and longitude values. Then we need some lines of Go to create a dynamic query template and store the response. The code looks like this (no worries, the whole project is on GitHub; the link is in the introduction):

package overpass

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
	"strings"
	"time"

	pb "github.com/aicdev/osm_poly_harvester/osm/proto"
)

type overpassService struct {
	QueryFragements     []string
	AreaPolyString      string
	Templates           []string
	OverpassResponseArr []map[string]interface{}
}

type OverpassServiceInterface interface {
	Init()
	FetchOSMData()
	GetOverpassResponse() []map[string]interface{}
}

func NewOverpassService(area []*pb.NominatimGeometry) OverpassServiceInterface {
	return &overpassService{
		QueryFragements: []string{
			"way[\"leisure\"=\"park\"](poly: \"%s\");",
			"way[\"leisure\"=\"forest\"](poly: \"%s\");",
			"way[\"landuse\"=\"meadow\"](poly: \"%s\");",
			"rel[\"leisure\"=\"park\"](poly: \"%s\");",
			"rel[\"leisure\"=\"nature_reserve\"](poly: \"%s\");",
			"rel[\"landuse\"=\"forest\"](poly: \"%s\");",
		},
		AreaPolyString: parseAreaCoordinatesToOSMPoly(area),
	}
}

// Init builds one complete Overpass query per query fragment
func (s *overpassService) Init() {
	s.Templates = make([]string, 0)
	for _, v := range s.QueryFragements {
		rawTpl := `
		[out:json];
		(
			` + fmt.Sprintf(v, s.AreaPolyString) + `
		);
		out body geom;
		>;
		out skel geom;
		`
		s.Templates = append(s.Templates, rawTpl)
	}
}

/*******************************************************************
* fetch data from api/interpreter
*******************************************************************/
func (s *overpassService) FetchOSMData() {
	// one client is enough for all requests
	client := &http.Client{
		Timeout: 180 * time.Second,
	}

	for _, v := range s.Templates {

		data := url.Values{}
		data.Set("data", v)

		// PostForm encodes the values and sets the Content-Type header
		res, err := client.PostForm("https://overpass-api.de/api/interpreter", data)
		if err != nil {
			// res is nil here, so skip this query instead of touching res.Body
			log.Printf("error from osm: %s", err.Error())
			continue
		}

		body, err := io.ReadAll(res.Body)
		// close right away; a defer inside the loop would keep every
		// response open until the function returns
		res.Body.Close()

		if err != nil {
			log.Printf("error reading osm response: %s", err.Error())
			continue
		}

		if res.StatusCode != http.StatusOK {
			log.Printf("unexpected status from osm: %s", res.Status)
			continue
		}

		var ovr map[string]interface{}
		if err := json.Unmarshal(body, &ovr); err != nil {
			log.Printf("error decoding osm response: %s", err.Error())
			continue
		}
		s.OverpassResponseArr = append(s.OverpassResponseArr, ovr)
	}
}

func (s *overpassService) GetOverpassResponse() []map[string]interface{} {
	return s.OverpassResponseArr
}

// parseAreaCoordinatesToOSMPoly converts the coordinates to the
// "lat lon lat lon ..." string that the poly filter expects
func parseAreaCoordinatesToOSMPoly(area []*pb.NominatimGeometry) string {
	parsedCoordinates := make([]string, 0)
	for _, c := range area {
		for _, v := range c.GetCoordinates() {
			parsedCoordinates = append(parsedCoordinates, fmt.Sprintf("%f %f", v.GetLat(), v.GetLon()))
		}
	}

	// Join puts exactly one space between all pairs, also across geometries
	return strings.Join(parsedCoordinates, " ")
}
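If you want to experiment with the query template without compiling the Go project, the same construction can be sketched in a few lines of Python. This is only an illustration: the function names and the tiny triangle of coordinates are made up, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def poly_string(coordinates):
    """Format (lat, lon) pairs as the space-separated list the poly filter expects."""
    return " ".join(f"{lat:f} {lon:f}" for lat, lon in coordinates)


def build_query(element, key, value, coordinates):
    """Build one Overpass query for an element type (way or rel) tagged key=value."""
    return (
        "[out:json];\n"
        "(\n"
        f'  {element}["{key}"="{value}"](poly:"{poly_string(coordinates)}");\n'
        ");\n"
        "out body geom;\n"
        ">;\n"
        "out skel geom;\n"
    )


# made-up triangle; a real city boundary has thousands of points
triangle = [(51.47, 6.89), (51.47, 6.90), (51.48, 6.90)]

query = build_query("way", "leisure", "park", triangle)
print(query)

# the form body that would be POSTed to the interpreter endpoint
form_body = urlencode({"data": query})
print(form_body[:60], "...")
```

The printed query can be pasted into Overpass Turbo to try it out before running it from code.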

And voilà, here is our output:

It looks a bit confusing. If we run one of our query templates in Overpass Turbo, we can better see what the answer looks like:
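To make the raw response easier to digest in code, a short summary helps. The following Python sketch assumes the usual layout of an Overpass JSON response, a top-level elements list whose entries carry a type, an id and optional tags. The sample data is invented and heavily shortened, and the function name summarize is hypothetical.

```python
from collections import Counter


def summarize(response):
    """Count the elements by type and collect the names of named elements."""
    elements = response.get("elements", [])
    counts = Counter(element["type"] for element in elements)
    names = [
        element["tags"]["name"]
        for element in elements
        if "name" in element.get("tags", {})
    ]
    return counts, names


# invented sample, shaped like a response to one of the park queries
sample_response = {
    "elements": [
        {
            "type": "way",
            "id": 1,
            "tags": {"leisure": "park", "name": "Example Park"},
            "geometry": [{"lat": 51.47, "lon": 6.89}, {"lat": 51.47, "lon": 6.90}],
        },
        {"type": "way", "id": 2, "tags": {"leisure": "park"}},
        {"type": "node", "id": 3, "lat": 51.47, "lon": 6.89},
    ]
}

counts, names = summarize(sample_response)
print(dict(counts))
print(names)
```

Nodes without tags appear because the query also asks for the members of every way that was found.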

Summary

I hope this post has shown how to query precisely defined artifacts for very specific areas. I also hope that readers who have not yet worked with Protocol Buffers now feel like digging deeper into them. What happens with the queried OSM data in the end always depends on the use case. For example, in our project we also stored the response in a protocol buffer and did further calculations. Thanks for reading and have fun working with Nominatim or OpenStreetMap.
