Reduce memory usage of sense.py for long-running data acquisition sessions
Currently, sense writes acquired data to disk in two forms: a log file written line by line as samples arrive, and a compressed binary data file written only at the end of the session, when the script is killed with Ctrl-C. The latter means that all sampled data is held in main memory until the script is terminated.
A recent (accidental) logging session ran for 40 days on a Raspberry Pi with 2 GB of RAM. It consumed all of the available memory, used most of the swap space, and left the machine slow to respond, although it remained usable.
Arguably, the binary data file is not required, since it can (with a small loss of precision) be derived from the log file using log2dat. However, its direct creation is convenient and retains full precision.
The solution appears to be to stream the binary data to disk as it is acquired, provided this can be done without thrashing the SD card, which may be relatively fragile under excessive writes. An SD card rated for video-surveillance workloads, or an external USB SSD, may be appropriate here. A plausible approach is to batch the rows of acquired data and write them out periodically, say every 30 minutes, with a final flush at termination, as sketched below. We could also stream the data directly to a ROOT file.
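The following is a minimal sketch of that batching idea, independent of the on-disk format. Here acquire_sample(), write_batch(), the column names and the flush interval are placeholders for illustration, not existing sense.py code:

    import random
    import time

    FLUSH_INTERVAL_S = 30 * 60                    # flush roughly every 30 minutes

    def acquire_sample():
        # Stand-in for the real sampling code in sense.py.
        time.sleep(1.0)
        return {"t": time.time(), "value": random.random()}

    def write_batch(rows):
        # Stand-in for appending 'rows' to the binary data file on disk
        # (see the HDF5 examples below for one way to do this).
        print(f"writing {len(rows)} buffered rows")

    batch = []
    last_flush = time.monotonic()
    try:
        while True:
            batch.append(acquire_sample())
            if time.monotonic() - last_flush >= FLUSH_INTERVAL_S:
                write_batch(batch)                # periodic flush keeps memory bounded
                batch.clear()
                last_flush = time.monotonic()
    except KeyboardInterrupt:
        write_batch(batch)                        # final flush when stopped with Ctrl-C

With this structure, memory usage is bounded by one batch (roughly 30 minutes of samples) rather than growing for the whole session.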
See the pandas IO tools documentation, and look at using pandas.HDFStore.append in one of the following forms, where the local df contains the row to be written:
df.to_hdf('store.h5', 'table', append=True)
pd.HDFStore('store.h5').append('table_name', df)
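As a rough sketch of how these two forms could serve as the flush step above (assuming the PyTables package is installed; the file name 'store.h5', the keys and the column names are illustrative only):

    import pandas as pd                           # HDF5 support also requires PyTables ('tables')

    # One batch of rows collected since the last flush; columns are illustrative.
    df = pd.DataFrame({"t": [0.0, 1.0], "value": [21.3, 21.4]})

    # First form: append via DataFrame.to_hdf; append=True uses the
    # appendable 'table' format on disk.
    df.to_hdf("store.h5", key="table", format="table", append=True)

    # Second form: open an HDFStore explicitly and append batches to it.
    with pd.HDFStore("store.h5") as store:
        store.append("table_name", df)

    # Quick check: read back everything appended so far.
    with pd.HDFStore("store.h5") as store:
        print(store.select("table_name"))

Either form creates the table on its first call and appends on subsequent calls, so it fits naturally into a periodic flush.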
Note that the problem described in this issue was observed with an older version of the sense.py script, but it should be fixed in the current version here.