#6391
ecloud

A fee, are you kidding? I see this as a community project. The more you centralize or throttle or keep proprietary any aspect of it, the less useful it becomes to the world.

Yes, I know storage costs money. But some of us would be willing to host our own data. I’d like to use IPFS somehow, but I’m not sure yet whether there is a database on IPFS that is a good fit for this kind of time-series data. Maybe I’ll evaluate OrbitDB, but it’s written in JS so I’d have to run it on Node.js… ick.

I have experience with InfluxDB. I have doubts about its ability to scale, based on what I’ve heard at work (we use it for another purpose: automated software benchmarks, test results and such). But on my system at home ATM it’s only taking up 157MB of memory, and I’ve been recording the data from one uRadMonitor at a time more or less continuously since February 2017, plus a few other data points from other devices here and there. We’ll see how much it increases later. (I was just thinking that I should use Influx to log its own memory usage, so that I can graph it and look for a trend.) My crappy little script to insert data (which I run from a cron job) is at https://github.com/ec1oud/uradmonitor-influxdb-inserter. But I suspect that if you try to store all the data worldwide it will be much more demanding.
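For anyone who wants to try the same thing, here is a rough sketch of how one reading could be turned into InfluxDB line protocol (`measurement,tags fields timestamp`). It is not taken from the script linked above; the measurement name `radiation`, the `device` tag and the `cpm`/`temperature` fields are made-up names for illustration.

```python
def _escape(text, chars):
    # Line protocol uses a backslash to escape separator characters.
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, timestamp_s):
    """Format one point as an InfluxDB line protocol string.

    All names passed in are hypothetical; fields are written as floats
    and the timestamp is converted from seconds to nanoseconds.
    """
    name = _escape(measurement, ", ")
    tag_part = "".join(
        "," + _escape(key, ",= ") + "=" + _escape(str(value), ",= ")
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        _escape(key, ",= ") + "=" + str(float(value))
        for key, value in sorted(fields.items())
    )
    return f"{name}{tag_part} {field_part} {int(timestamp_s) * 1_000_000_000}"


line = to_line_protocol(
    "radiation",                         # hypothetical measurement name
    {"device": "unit-1"},                # hypothetical device tag
    {"cpm": 21, "temperature": 23.5},    # hypothetical fields
    1500000000,
)
print(line)
```

The same function would work for the memory-usage idea: one more measurement with a single field, written from the same cron job.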

I used to use RRDtool (https://en.wikipedia.org/wiki/RRDtool, a round-robin database) for weather data years ago. It didn’t have performance problems (having been designed a couple of decades ago for the computers we had back then), but you have to decide in advance how the data will be resampled or otherwise reduced as it ages. The idea is to have high-resolution data for the most recent time period and summarized data for longer periods. I’m not fond of the idea of losing any of this data too quickly; who knows how many years is really enough. Whereas with Influx you can just start storing data very easily, and change the “retention policy” later on when it gets too unwieldy.
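To make the "high resolution for recent data, summaries for old data" idea concrete, here is a small sketch. It is not how RRDtool is implemented (RRDtool uses fixed-size archives with consolidation functions such as AVERAGE, MIN and MAX); it only shows the kind of reduction you commit to in advance. The function name and parameters are my own.

```python
from statistics import mean


def consolidate(samples, now, keep_raw_s, bucket_s):
    """Keep recent samples as-is and average older ones per time bucket.

    samples    -- list of (timestamp_s, value) tuples
    now        -- current time in seconds
    keep_raw_s -- how far back full-resolution data is kept
    bucket_s   -- width of each averaging bucket for older data
    """
    cutoff = now - keep_raw_s
    recent = sorted((t, v) for t, v in samples if t >= cutoff)

    # Group everything older than the cutoff by the start of its bucket.
    buckets = {}
    for t, v in samples:
        if t < cutoff:
            buckets.setdefault(t - t % bucket_s, []).append(v)

    summarized = [(start, mean(values)) for start, values in sorted(buckets.items())]
    return summarized + recent


samples = [(0, 10), (60, 20), (3600, 30), (7200, 40)]
reduced = consolidate(samples, now=7200, keep_raw_s=3600, bucket_s=3600)
print(reduced)
```

Once the two old samples are averaged into one, the originals are gone, which is exactly the trade-off I’m uneasy about.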

Anyway, if you can’t afford the storage, it would be nice if you could provide a worldwide aggregated stream of some sort so that the rest of us can experiment with alternate means of storage. Perhaps a multicast channel would work, sending out a big JSON packet once per minute with all the readings from all the devices? It would not require much bandwidth.
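Something like this is what I have in mind for the sending side, as a sketch only. The group address, port, and the JSON layout (`time`, `readings`, `id`, `cpm`) are all invented here; nothing like this exists on the server as far as I know.

```python
import json
import socket

MCAST_GROUP = "239.255.0.1"  # hypothetical address in the administratively scoped range
MCAST_PORT = 5007            # hypothetical port


def build_packet(readings, timestamp_s):
    """Serialize one minute of readings as compact UTF-8 JSON (made-up layout)."""
    message = {"time": int(timestamp_s), "readings": readings}
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def send_packet(payload, group=MCAST_GROUP, port=MCAST_PORT, ttl=1):
    """Send one UDP datagram to the multicast group."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        # The TTL limits how many router hops the datagram may cross.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(payload, (group, port))


payload = build_packet([{"id": "unit-1", "cpm": 21}], 1500000000)
print(len(payload), "bytes")
# send_packet(payload)  # uncomment on a network where multicast is available
```

One caveat: a single UDP datagram over IPv4 tops out at roughly 64 KB of payload, so a truly worldwide packet might have to be split into several.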