Third-party read-only API mirrors

Gandalf · July 16, 2021, 4:14pm

So I’m slowly (but surely) going forward with my plan to add Hive to the mix.
It will take time because I have a lot of other tasks on my lists, but nonetheless, OCM is what I have in my workshop.
When I need to play more extensively with data, I’m using the openchargemap/ocm-data repository on GitHub (snapshots of current Open Charge Map data) as a good source to process, but I guess I’ll need to talk to the API for some reference comparisons.

So obviously, I’m going to run my own, read-only API mirror.

Trying to quickly spin it up on an Ubuntu 20.04 instance ended in a series of SIGSEGVs. Fortunately, the dockerized OCM Mirror worked like a charm, so I didn’t even try to debug the initial issues.

Do you want to make use of it? Most of the time it would be idle, serving no one.
I don’t know your infrastructure, but something like haproxy could check third-party mirrors contributed by OCM users for their status (up / down) and health (reasonably up to date?) and, if both look fine, include them in your load balancer.
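
To make the status/health idea a bit more concrete, here is a minimal Python sketch of the eligibility decision only. Everything in it is an assumption for illustration: the `MirrorStatus` fields, the 24-hour freshness threshold, and the idea that a mirror can report its last sync time at all. It is not how OCM or haproxy actually does it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class MirrorStatus:
    """Result of probing one mirror (hypothetical shape)."""
    url: str
    reachable: bool                  # status: did the probe succeed (up / down)?
    last_synced: Optional[datetime]  # health: last sync time, None if unknown


def is_eligible(status: MirrorStatus, now: datetime,
                max_lag: timedelta = timedelta(hours=24)) -> bool:
    """A mirror qualifies only if it is up AND reasonably up to date."""
    if not status.reachable or status.last_synced is None:
        return False
    return now - status.last_synced <= max_lag


def eligible_mirrors(statuses: List[MirrorStatus], now: datetime,
                     max_lag: timedelta = timedelta(hours=24)) -> List[str]:
    """Return the URLs that could be handed to the load balancer."""
    return [s.url for s in statuses if is_eligible(s, now, max_lag)]


if __name__ == "__main__":
    now = datetime(2021, 7, 17, 12, 0, tzinfo=timezone.utc)
    probes = [
        MirrorStatus("https://mirror-a.example", True, now - timedelta(hours=2)),
        MirrorStatus("https://mirror-b.example", True, now - timedelta(days=3)),
        MirrorStatus("https://mirror-c.example", False, None),
    ]
    print(eligible_mirrors(probes, now))
```

The point of separating "up" from "up to date" is that a stale mirror answers requests just fine, so a plain TCP or HTTP check would not catch it.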

Christopher · July 17, 2021, 3:21am

Hmm, I wonder if the build docs are just out of date. You’d need the latest .NET 5.x SDK installed, at least 2GB of RAM, and swap configured, but otherwise it should just work.

We need to do a little more to coordinate mirror health and for mirrors to re-sync before we can add external mirrors to the pool, but yes it would be a great thing to enable.

Our setup uses a Cloudflare Worker which reads the list of current mirrors from a key/value store and then proxies each read request to a randomly chosen mirror. If the request fails we exclude that mirror for a short while.
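
As a rough illustration of that logic (random choice plus a temporary exclusion after a failure), here is a small Python sketch. The real thing runs as a Cloudflare Worker, so this is not the actual code; the `MirrorPool` class, its method names, and the 60-second cooldown are all made up for the example.

```python
import random
import time
from typing import Callable, Dict, List, Optional


class MirrorPool:
    """Pick mirrors at random, skipping any that failed recently (sketch)."""

    def __init__(self, mirrors: List[str], cooldown_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 rng: Optional[random.Random] = None) -> None:
        self._mirrors = list(mirrors)        # would come from the key/value store
        self._cooldown = cooldown_seconds    # assumed value for "a short while"
        self._clock = clock
        self._rng = rng or random.Random()
        self._excluded_until: Dict[str, float] = {}

    def available(self) -> List[str]:
        """Mirrors that are not currently excluded."""
        now = self._clock()
        return [m for m in self._mirrors
                if m not in self._excluded_until or self._excluded_until[m] <= now]

    def pick(self) -> Optional[str]:
        """Return a random available mirror, or None if all are excluded."""
        candidates = self.available()
        return self._rng.choice(candidates) if candidates else None

    def report_failure(self, mirror: str) -> None:
        """Exclude a mirror until the cooldown has passed."""
        self._excluded_until[mirror] = self._clock() + self._cooldown


if __name__ == "__main__":
    pool = MirrorPool(["https://mirror-a.example", "https://mirror-b.example"])
    target = pool.pick()
    print("proxying to", target)
    pool.report_failure(target)   # pretend the proxied request failed
    print("still available:", pool.available())
```

The clock is injectable only so the cooldown can be exercised without actually waiting.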

Currently the mirrors have a few flaws:

Our non-docker implementation (a couple of hosted Linux VMs) falls over periodically, so auto recovery would be better. AWS, Google Cloud, etc. all support this; I just haven’t gotten around to doing it.
Our actual data sync is flawed and needs fairly regular rebuilds (clearing the MongoDB on the mirror and re-running a full sync). There is probably a simple fix/change required; I just haven’t looked at it properly.

I was inspired by your previous post about a JSON diff, so I started to (semi-manually) export POIs as JSON in the openchargemap/ocm-export repo on GitHub (an experimental export of OCM POI data into one file per POI), just to get a feel for the benefits. I can see a real benefit in simply using git pull to refresh the latest changes for a mirror, with the mirror then re-reading all the affected POIs, rather than using our current API pull-based sync.
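
A very rough sketch of what that git-based refresh could look like in Python is below. It only uses standard git commands (`rev-parse`, `pull --ff-only`, `diff --name-status`); the function names and the example file paths are hypothetical, and the actual layout of the export repo is not assumed beyond "one `.json` file per POI".

```python
import subprocess
from typing import List, Tuple


def parse_name_status(diff_output: str) -> Tuple[List[str], List[str]]:
    """Split `git diff --name-status` output into (to_reload, to_delete) JSON paths."""
    to_reload: List[str] = []
    to_delete: List[str] = []
    for line in diff_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        status, paths = parts[0], parts[1:]
        if status.startswith("D"):
            to_delete.append(paths[0])            # POI file removed
        elif status.startswith("R") and len(paths) == 2:
            to_delete.append(paths[0])            # rename: old path goes away
            to_reload.append(paths[1])            # new path must be read
        else:
            to_reload.append(paths[-1])           # added, modified or copied
    is_poi = lambda p: p.endswith(".json")        # ignore README etc.
    return ([p for p in to_reload if is_poi(p)],
            [p for p in to_delete if is_poi(p)])


def _git(repo_dir: str, *args: str) -> str:
    """Run a git command inside repo_dir and return its stdout."""
    result = subprocess.run(["git", "-C", repo_dir, *args],
                            check=True, capture_output=True, text=True)
    return result.stdout.strip()


def pull_and_list_changes(repo_dir: str) -> Tuple[List[str], List[str]]:
    """Pull the export repo and report which POI files the mirror must re-read."""
    old_head = _git(repo_dir, "rev-parse", "HEAD")
    _git(repo_dir, "pull", "--ff-only")
    new_head = _git(repo_dir, "rev-parse", "HEAD")
    if old_head == new_head:
        return [], []                             # nothing new upstream
    return parse_name_status(_git(repo_dir, "diff", "--name-status",
                                  old_head, new_head))
```

The mirror would then upsert the POIs in the first list and drop the ones in the second, instead of clearing the database and running a full sync.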

With regard to Hive, don’t go to a huge amount of effort. Without knowing more about what you’re planning, I have no idea if what you’re working on would become part of OCM or not, so I don’t want you to work on something that ends up wasting your time and effort just because I’m not comfortable making that leap or I just don’t see/agree on the benefit.

Gandalf · July 17, 2021, 10:21am

I doubt that the docs are out of date. Most likely it’s some minor differences in the way I set things up on a new Ubuntu, and slight incompatibilities there, because the docker way does pretty much the same thing, except that it sticks strictly to what should be done. Too much improv on my side I guess.

Yes, using GitHub as a generic way to distribute and fetch OCM data is a great way to reduce the load on API nodes, especially during the initial sync (since currently it has to be repeated each time nodes are refreshed).

With Hive stuff, don’t worry, my primary objective is to make Hive better by using OCM as a real-life proof of concept / use case for the so-called Hive Application Framework. If that would make OCM better, it would be great, but from my point of view that’s a side effect (a cool and desired one, but still).

At the very least it could work as an alternate implementation of API Mirrors, but even if not, the value for me is what my team will learn and improve in the process.

Hive is great and people can build awesome decentralized applications on top of it but there’s a huge entry barrier. One needs to be a wizard to tame the Hive with its quirks and weirdness.

The Hive Application Framework that is being developed is going to lower that entry barrier and greatly improve the way dApps are built, so developers don’t have to handle things like the last irreversible block or micro-forks themselves and can focus on their actual app features.

OpenChargeMap is compatible with our ideas: open data, community, crowdsourcing.

It’s an ideal example app that we can use to improve the Hive Application Framework: we will learn what the real-world needs for such a framework are and what kind of issues developers of “classic” apps face while dealing with HAF. And no, we won’t take your time, except maybe for a few questions here and there.

With a bit of luck and some hard work we hope to end up not only with improved HAF but also with a tutorial / example on how to use HAF to build dApps.

Whether all or some of the above could be used to improve the actual OCM ecosystem, and to what extent, we will see as we go. No pressure there.