[SOLVED] Methods to export or query all hospitals from OpenStreetMap

Apr 30, 2020
How can I export or query all hospitals from OpenStreetMap and maintain the data in a PostgreSQL database for further analysis and steps...

I'd like to extract all the hospital locations from OpenStreetMap as a reference for my little geocoding project. I know how to get all the OSM data for a small area with the QGIS plugin, but I'm not sure how to query a larger area, e.g. the whole planet file.

Some ideas:
1) The read-only Overpass API. I don't know whether it will work for the whole planet file in one pass, but maybe it will if we extend the timeout enough?
For a smaller area, and with the benefit of a (minimal) UI, we can access Overpass via the XAPI Query Builder. There, we can put amenity=hospital in the tag search, select our area, and go.
2) Geofabrik downloads, filtered with Osmosis, as described in "How to extract partial data for large regions?" on OpenStreetMap Help.
3) What about using healthsites.io? The most important link for it is the API documentation: https://healthsites.io/api/docs
Can I use healthsites.io for my project?
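A minimal sketch of idea 1, assuming the public Overpass endpoint at overpass-api.de and a per-country query instead of one pass over the whole planet. The function names and the choice of JSON output are illustrative assumptions, not something prescribed by the Overpass project.

```python
import urllib.parse
import urllib.request

# Assumption: the public Overpass instance; any other instance URL can be substituted.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_hospital_query(iso_country_code: str, timeout_s: int = 600) -> str:
    """Return an Overpass QL query for all amenity=hospital objects in one country."""
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        f'area["ISO3166-1"="{iso_country_code}"]->.a;\n'
        # nwr = nodes, ways and relations in a single statement
        "nwr(area.a)[amenity=hospital];\n"
        # "center" adds one coordinate per way/relation, "tags" keeps the tag list
        "out center tags;\n"
    )


def fetch(query: str) -> bytes:
    """POST the query to the Overpass API and return the raw response body."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=900) as resp:
        return resp.read()
```

Looping over country codes with `fetch(build_hospital_query(code))` keeps each request small, which is probably kinder to the server than a single planet-wide query with a very long timeout.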

What I am aiming for: I want to maintain the dataset in a PostgreSQL database.
Update: since I am especially interested in the healthsites.io dataset, I can download it from this page: https://healthsites.io/map

What do you think about this approach?
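As a rough sketch of the "maintain it in PostgreSQL" part: the table layout below (`hospitals` with `osm_type`, `osm_id`, `name`, `lat`, `lon`) is a hypothetical schema chosen for illustration, and the function assumes a DB-API connection such as one returned by `psycopg2.connect(...)`. The upsert means re-running an import updates existing rows rather than failing on duplicates.

```python
from typing import Iterable, Tuple

# Hypothetical schema: one row per OSM object, keyed by (osm_type, osm_id).
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS hospitals (
    osm_type TEXT NOT NULL,
    osm_id   BIGINT NOT NULL,
    name     TEXT,
    lat      DOUBLE PRECISION,
    lon      DOUBLE PRECISION,
    PRIMARY KEY (osm_type, osm_id)
)
"""

Row = Tuple[str, int, str, float, float]


def load_hospitals(conn, rows: Iterable[Row], placeholder: str = "%s") -> int:
    """Insert or update hospital rows through a DB-API connection.

    placeholder is "%s" for psycopg2; other drivers may use a different marker.
    """
    marks = ", ".join([placeholder] * 5)
    sql = (
        "INSERT INTO hospitals (osm_type, osm_id, name, lat, lon) "
        f"VALUES ({marks}) "
        "ON CONFLICT (osm_type, osm_id) DO UPDATE SET "
        "name = EXCLUDED.name, lat = EXCLUDED.lat, lon = EXCLUDED.lon"
    )
    batch = list(rows)
    cur = conn.cursor()
    cur.execute(CREATE_TABLE_SQL)
    cur.executemany(sql, batch)
    conn.commit()
    return len(batch)
```

If PostGIS is available, a geometry column would likely be a better fit than separate lat/lon columns; plain columns are used here only to keep the sketch small.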
 
Ralston18

Titan
Moderator
You will need to experiment a bit.

How the hospital data (records, fields, types) is structured and presented (download) is important. For the most part you may need to do some additional filtering and/or parsing to obtain the required data.

I.e., you first download the data from the source, then select, sort, and parse the data into a new table that fits your requirements (versus trying to do that via the download options, if any). However, if the source site does offer download options, they may simplify what you need to do with the data you download.

Try some controlled data downloads for a smaller geographic area. There may be a filter that you can use for the initial download. For example, the site download option may provide a menu where you can select location "X", where the location is a country, state, city...

Especially a Location X where you already know what hospitals are there and where they are located.

Work on extracting the hospital locations from that data until you can get a match between your results and the previously identified "known" hospitals and Locations X.

Then use that algorithm to test a larger geographic area.
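The check against a known Location X could be sketched as follows. This is only an illustration: it compares hospitals by normalized name, which is an assumption (real data may need fuzzier matching), and `compare_to_known` is a hypothetical helper name.

```python
from typing import Dict, Iterable, List


def normalize(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def compare_to_known(extracted: Iterable[str], known: Iterable[str]) -> Dict[str, List[str]]:
    """Compare extracted hospital names with a hand-verified list for one area."""
    e = {normalize(n) for n in extracted}
    k = {normalize(n) for n in known}
    return {
        "matched": sorted(e & k),     # found and expected
        "missing": sorted(k - e),     # expected but not extracted
        "unexpected": sorted(e - k),  # extracted but not on the known list
    }
```

An empty "missing" list for the small area is a reasonable signal that the same rules are worth trying on a larger one.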

Most likely you will need to make modifications to your rules, filters, etc. to extract the hospital data.

I doubt that one rule will extract them all unless there is some common data value that identifies a "hospital". It could be an icon or some keyword used on the maps.

The key is to be methodical and always have the ability to verify the results.

If you pull hospital data from two or more sources then compare results. My expectation would be that Dataset 1 and Dataset 2 will have overlapping data (Venn diagram) but each dataset will still have hospitals not found in the other.

The end requirement is to be able to merge the datasets to get the complete set of hospitals for the targeted geographical area.
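One way such a merge might look, as a sketch only: two records are treated as the same hospital when their normalized names match and they lie within a distance threshold. The record layout (`name`, `lat`, `lon`) and the 0.5 km default are assumptions made for the example.

```python
import math
from typing import Dict, List

Record = Dict[str, object]  # assumed keys: "name", "lat", "lon"


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def same_site(a: Record, b: Record, max_km: float) -> bool:
    """True if both records plausibly describe the same hospital."""
    same_name = " ".join(str(a["name"]).lower().split()) == " ".join(str(b["name"]).lower().split())
    close = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
    return same_name and close


def merge_datasets(first: List[Record], second: List[Record], max_km: float = 0.5) -> List[Record]:
    """Union of both datasets; records from `second` already present in `first` are skipped."""
    merged = list(first)
    for rec in second:
        if not any(same_site(rec, existing, max_km) for existing in merged):
            merged.append(rec)
    return merged
```

The pairwise comparison is quadratic, which should be fine for a few thousand hospitals; a spatial index would be the next step for much larger sets.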

As for tools: Python is quite powerful. It may give you more overall control with regard to your requirements and what needs to be done to capture the hospital data.
 
Apr 30, 2020
Hello again, great to hear from you.

I fully agree with your ideas. They sound convincing.

By the way, I am only interested in the POIs, so I only need the hospital records.
Perhaps I can filter with osmfilter.

This will reduce the dataset.
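A sketch of how the osmfilter step might be driven from Python, assuming the `osmconvert` and `osmfilter` command-line tools are installed and on the PATH. osmfilter does not read .pbf directly, so the extract is first converted to .o5m. All file names are hypothetical.

```python
import subprocess
from typing import List


def osmconvert_cmd(pbf_path: str, o5m_path: str) -> List[str]:
    """Command that converts a .osm.pbf extract to .o5m for osmfilter."""
    return ["osmconvert", pbf_path, f"-o={o5m_path}"]


def osmfilter_cmd(o5m_path: str, out_path: str, tag: str = "amenity=hospital") -> List[str]:
    """Command that keeps only objects carrying the given tag."""
    return ["osmfilter", o5m_path, f"--keep={tag}", f"-o={out_path}"]


def extract_hospitals(pbf_path: str, o5m_path: str, out_path: str) -> None:
    """Run both steps; raises CalledProcessError if either tool fails."""
    subprocess.run(osmconvert_cmd(pbf_path, o5m_path), check=True)
    subprocess.run(osmfilter_cmd(o5m_path, out_path), check=True)
```

For example, `extract_hospitals("austria-latest.osm.pbf", "austria.o5m", "hospitals.osm")` would leave only the hospital objects in `hospitals.osm`.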

See more details of the healthsites.io approach:

https://wiki.openstreetmap.org/wiki/Global_Healthsites_Mapping_Project
Healthsites.io <> OpenStreetMap

The data will flow both ways from Healthsites.io to OpenStreetMap, and from OpenStreetMap to Healthsites.io. On the sign up to the Healthsites.io page, users will register using OAuth against the OSM authentication provider. Each data change (create/update/delete) to a health facility on the Healthsites.io platform will be written directly to the OSM database using the OSM API, with the OSM credentials associated with the logged in user on Healthsites.io.

Changes to health facility data made outside of the Healthsites.io platform, and directly to OSM, are replicated back in near-real time to a Healthsites.io mirror of all health facility data found in OSM. This is achieved using docker-osm, developed by Kartoza, which takes the high frequency diffs produced on OSM and applies them to a PostgreSQL / PostGIS database hosted on Healthsites.io. Changes made locally on healthsites.io are first pushed to OSM via the OSM API and then replicated back to our docker-osm instance using the same mechanism described above.

Our aim with this architectural approach is to make OSM the main storage location of all the health facility data available on Healthsites.io and at the same time facilitate large queries, extracts and general innovation around the body of OSM health facility data, whilst having minimal impact and load on the services offered by OSM. The diagram below illustrates the high level architecture as pertains to interactions between OSM and Healthsites.io.


Data Model

The OSM data model put together for Healthsites was initially based off of several data models used in various HOT projects across South America, Africa and Southeast Asia. The initial list of tags was then compared to the data model that Healthsites.io was already using to ensure that the same attributes were still being captured, but also offered some suggestions for additional key information that should be captured for health facilities. This was based on the numerous collaborations with HOT project partners over the years and their data needs.

The overall idea is to use existing OSM tags where possible and only propose new tags where necessary. A lot of research went into the development of the Healthsites OSM data model, including the review of existing HOT data models and OSM data models available on the OSM wiki. Verification of the tags usage through the OSM wiki, Tag Info and Tag History, with further analysis on tags using OSMFilter for certain areas of interest, helped identify values generally applied in a local area.

What about taking healthsites.io - to a first source that I use?
 

Ralston18

Two part project:

1) "The data will flow .... from Healthsites.io to OpenStreetMap"

2) Then "OpenStreetMap to Healthsites.io "

Not sure about "to a first source that I use" - do you mean the first step in processing the data to fit your requirements?

Provided that I am following things correctly my suggestion is to just try it.

Determine what works and what does not work.

I.e., download the healthsites.io data to a file. What format is the data: csv, txt, xls, or fixed field?

Hopefully you can work with that imported data directly and not need to change the format. The fewer the steps the better. Simplicity.

Once you know the data fields and formats then you can use/apply your database tools to further refine the data to meet your requirements.
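A quick way to answer the "what format is it" question for a delimited text download might look like this. It is a sketch under the assumption that the file has a header row; the delimiter is guessed simply by counting candidates in that header, and the field names in the usage example are hypothetical.

```python
import csv
import io
from typing import Dict, Sequence


def describe_delimited(text: str, candidates: Sequence[str] = (",", ";", "\t", "|")) -> Dict[str, object]:
    """Guess the delimiter from the header line and report fields and row count."""
    lines = text.splitlines()
    if not lines:
        return {"delimiter": None, "fields": [], "row_count": 0}
    header = lines[0]
    # Pick the candidate that occurs most often in the header line.
    delimiter = max(candidates, key=header.count)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)
    return {"delimiter": delimiter, "fields": list(reader.fieldnames or []), "row_count": len(rows)}
```

For instance, `describe_delimited("name;lat;lon\nA;48.2;16.4\n")` reports `;` as the delimiter, three fields and one data row, which is enough to decide how to import the file.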

Getting/importing data from multiple sources is always problematic. It does help if all of the sources adhere to some strict, standardized format: your data model.

However, there could be some change on a data source's end that undoes your efforts. You fix it on your end, but then the other data imports may not work.

Overall, sketch the data flows all out step by step. Keep the steps simple. Develop a plan and then work through each step to manipulate the data per the requirements.

Project Part 1 first, then Project Part 2 second.

Importing data from a source does not risk the original data. However, once you start exporting data back, the risks to the original data are much higher: you need much more protection to avoid data corruption, invalid data, human errors, and so forth.

Security and backups.

Focus on achieving a working manual process first then automate when viable. Make it work, then make it "pretty".
 
Solution
Apr 30, 2020
Hello and good day, many thanks for the reply.

By the way: when I get the data from overpass-turbo, I get a fairly full set of POI data.

Code:
[out:csv(::id,::type,"name","addr:postcode","addr:city","addr:street","addr:housenumber","website","contact:email")][timeout:600];
// Austria, selected by its ISO 3166-1 country code
area["ISO3166-1"="AT"]->.a;
// hospitals mapped as nodes, ways, or relations inside that area
( node(area.a)[amenity=hospital];
  way(area.a)[amenity=hospital];
  rel(area.a)[amenity=hospital];);
out;

see the results:

Code:
@id    @type    name    addr:postcode    addr:city    addr:street    addr:housenumber    website     contact:email=*

2656877    relation    Unfallkrankenhaus Lorenz Böhler    1200    Wien    Donaueschingenstraße    13    https://www.auva.at/ukhboehler/   
2685003    relation    Landeskrankenhaus Hörgas-Enzenbach                    http://www.lkh-hoergas.at   
2685004    relation    Landeskrankenhaus Judenburg-Knittelfeld                    http://www.lkh-judenburg.at   
2685005    relation    Landeskrankenhaus Hochsteiermark                    http://www.lkh-hochsteiermark.at   
2685006    relation    Landeskrankenhaus Mürzzuschlag-Mariazell                    http://www.lkh-muerzzuschlag.at   
2685007    relation    Landeskrankenhaus Rottenmann-Bad Aussee                    http://www.lkh-rottenmann.at   
2755244    relation    Landesklinikum Scheibbs    3270    Scheibbs    Feldgasse    26       
2764083    relation    Allgemeines Krankenhaus Wien    1090    Wien    Währinger Gürtel    18-20    https://www.akhwien.at   
2882269    relation    Unfallkrankenhaus Salzburg                    http://www.ukh-salzburg.at   
3333391    relation    Krankenhaus St. Josef    5280    Braunau am Inn    Ringstraße    60    https://www.khbr.at/   
4460484    relation    Landeskrankenhaus Graz II                    http://www.lkh-graz-sw.at   
6704364    relation    Landesklinikum Neunkirchen            Peischingerstraße    19    https://www.neunkirchen.lknoe.at/   
9522773    relation    Krankenhaus der Barmherzigen Brüder    7000    Eisenstadt    Esterhazystraße    26       
11077211    relation    "Landeskrankenhaus Murtal, Standort Stolzalpe Haus 1"    8852    Stolzalpe    Stolzalpe    38    https://www.lkh-murtal.at/
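To judge completeness in numbers rather than by eye, output like the above could be parsed in Python. This sketch assumes the raw Overpass CSV response is tab-separated with a header row (the Overpass default), and uses two of the rows shown above as sample input; the helper names are made up for the example.

```python
import csv
import io
from typing import Dict, List, Sequence


def parse_overpass_csv(text: str) -> List[Dict[str, str]]:
    """Parse a tab-separated Overpass CSV response into one dict per object."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def ids_missing_fields(
    rows: List[Dict[str, str]],
    required: Sequence[str] = ("addr:postcode", "addr:city", "addr:street"),
) -> List[str]:
    """Return the @id of every object that lacks at least one required field."""
    return [row["@id"] for row in rows if any(not row.get(field) for field in required)]


# Two rows taken from the result listing, rebuilt with tab separators.
SAMPLE = "\n".join([
    "\t".join(["@id", "@type", "name", "addr:postcode", "addr:city",
               "addr:street", "addr:housenumber", "website"]),
    "\t".join(["2755244", "relation", "Landesklinikum Scheibbs", "3270",
               "Scheibbs", "Feldgasse", "26", ""]),
    "\t".join(["2882269", "relation", "Unfallkrankenhaus Salzburg", "", "",
               "", "", "http://www.ukh-salzburg.at"]),
]) + "\n"

rows = parse_overpass_csv(SAMPLE)
incomplete = ids_missing_fields(rows)
```

Running the same check on data from a second source gives a like-for-like measure of which one carries more complete addresses.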

If you look here, the record is much, much more incomplete:

https://healthsites.io/map#!/locality/node/4777740642

No complete address, no website, etc.

Conclusion: should I stick with a request via overpass-turbo.eu? Is this more appropriate?