Relationship of 'ConnectionType' and 'NumberOfPoints' missing from CSV export

Hello there,
I rely on your database for a data analytics project. I have a more general question regarding the style of contributions so I can decide how to process data in an automated fashion.

For example, take OCM-108484, Brüsseler Platz, Essen, Germany.
On your website and in the API .csv export, ‘NumberOfPoints’ is listed as 27.
However, the .csv export does not appear to contain information on how many points of each connection type are present. The website, on the other hand, does provide this information.

I was wondering how I can get this information into the exported data so I can work with it.
This is a crucial problem for me right now, and I would be very grateful for any response.

Warm regards

For your information, I process the database in R.
This is the API call:
These are all variables I have in my csv export:


You will find that the JSON export is much better than the basic CSV export. We don’t currently plan to change the CSV export, as we would much rather people use the full JSON data, and CSV changes can affect brittle systems that don’t expect the output to change.

A quick way to get the latest data set is to use git to clone (and subsequently pull) our data snapshot from GitHub:

You can then gunzip poi.json.gz and work with the entire dataset. A small sample of that data would look like:

Thank you very much for your quick answer.
Unfortunately, I’m slightly at a loss with handling the JSON data.

First, I’m retrieving the file you mentioned from GitHub:
Then I unzip the file:
gunzip("OCM_DE.json.gz", remove=FALSE)
Then I import the unzipped file into R with:
OCM_DE <- fromJSON(file = "OCM_DE.json")

Unfortunately, the resulting data in R looks awfully empty (see image below). I also tried downloading and extracting the file manually and doing only the import within R, but with no luck.
If you could give me another hint on how best to handle this, that would be wonderful.

Warm regards

Never mind, using jsonlite::stream_in instead of fromJSON worked wonders.
It works perfectly and provides the comprehensive information pointed out by @Christopher.
For anyone trying to import the OCM database as JSON within R, please do the following:

  1. Download the JSON file:
  2. Unzip the archive:
    gunzip("OCM_DE.json.gz", remove=FALSE)
  3. Import data into R:
    OCM_DE <- jsonlite::stream_in(file("OCM_DE.json"))
    Note: You can use ndjson::stream_in("OCM_DE.json") as pointed out here, but the result is not ideal in my opinion.
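
In case it helps, steps 2 and 3 can probably be combined, since R can read gzip-compressed files through a gzfile() connection. This is only a minimal sketch: it writes two made-up records to a temporary file that stands in for the real archive, and it assumes the archive is line-delimited JSON (one record per line), which is the layout stream_in expects.

```r
library(jsonlite)

# Write two made-up records to a gzip-compressed, line-delimited JSON file.
# This temporary file only stands in for the downloaded archive.
tmp <- tempfile(fileext = ".json.gz")
con <- gzfile(tmp, "w")
writeLines(c('{"ID":1,"NumberOfPoints":3}',
             '{"ID":2,"NumberOfPoints":1}'), con)
close(con)

# gzfile() decompresses while reading, so no separate gunzip step is needed.
sample_pois <- stream_in(gzfile(tmp), verbose = FALSE)
str(sample_pois)
```

With the real file, the last call would presumably become stream_in(gzfile("OCM_DE.json.gz")).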

If we investigate the example from above (OCM-108484), we notice that the connection types are stored as a nested data frame.
View(OCM_DE[OCM_DE$ID == 108484,])
To access the nested data frame, simply do the following: View(OCM_DE$Connections[OCM_DE$ID == 108484])
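
To get back to my original question, the number of points per connection type can then be summed from that nested data frame. The following is only a rough sketch on made-up records: the column names ConnectionTypeID and Quantity are my assumption about what the nested data frame contains, so please check names(...) on your own import first.

```r
library(jsonlite)

# Two made-up records in line-delimited JSON. ConnectionTypeID and Quantity
# are assumed column names, not confirmed against the real export.
ndjson <- c(
  '{"ID":1,"NumberOfPoints":3,"Connections":[{"ConnectionTypeID":25,"Quantity":2},{"ConnectionTypeID":33,"Quantity":1}]}',
  '{"ID":2,"NumberOfPoints":1,"Connections":[{"ConnectionTypeID":25,"Quantity":1}]}'
)
pois <- stream_in(textConnection(ndjson), verbose = FALSE)

# Connections is a list column holding one small data frame per site;
# [[1]] extracts the data frame for the selected site.
site <- pois$Connections[pois$ID == 1][[1]]

# Sum the quantities per connection type for that site.
points_per_type <- aggregate(Quantity ~ ConnectionTypeID, data = site, FUN = sum)
print(points_per_type)
```

Note that aggregate() drops rows with a missing Quantity by default, so sites where contributors did not enter a quantity may add up to less than NumberOfPoints.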

Thank you very much for the help provided and especially the database. It’s helping me so much for my project!

Warm regards
