I’m looking for the best way to handle a large data request. By large, I mean I am given a file containing about 200,000 to 300,000 distinct meters spread over as many as 150 nonconsecutive days. I might have X meters for day 1, Y meters for day 2, …, N meters for day 150. Each meter in the request may be repeated across different dates, up to 30 days per meter.
Our PI system consists of 3 million meters scattered across 6 archive servers. Each meter lives on exactly one of the six archives.
I wrote a C# program using the PI SDK to handle this request. Currently, I read the import file and build a data structure keyed by date. I then spawn a process for each date, passing it the list of meters and the specific date as input; i.e., 150 processes. (I insert a five-minute delay after every fifth process I spawn.) Each process connects to the 6 archive servers, retrieves the requested data, and writes the meters and their data for its date to a file. Thus, I end up with one file for each date in the data request.
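For concreteness, here is a minimal C# sketch of the orchestration just described: grouping the import file by date, then spawning one worker process per date with a five-minute pause after every fifth spawn. The CSV layout (meter,date), the worker executable name ExtractWorker.exe, and its command-line flags are assumptions, not my actual code; the PI SDK calls live inside the worker and are omitted here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class RequestOrchestrator
{
    // Build the per-date structure: each import line is assumed to be a
    // "meter,date" pair, and we collect every meter requested for a date.
    public static Dictionary<string, List<string>> GroupByDate(IEnumerable<string> lines)
    {
        var byDate = new Dictionary<string, List<string>>();
        foreach (var line in lines)
        {
            var parts = line.Split(',');           // assumed CSV: meter,date
            string meter = parts[0].Trim();
            string date  = parts[1].Trim();
            if (!byDate.TryGetValue(date, out var meters))
                byDate[date] = meters = new List<string>();
            meters.Add(meter);
        }
        return byDate;
    }

    // Spawn one worker per date; throttle by sleeping five minutes after
    // every fifth process. ExtractWorker.exe is a hypothetical placeholder
    // for the per-date PI SDK extraction program.
    public static void SpawnWorkers(Dictionary<string, List<string>> byDate)
    {
        int spawned = 0;
        foreach (var entry in byDate)
        {
            string meterList = string.Join(",", entry.Value);
            Process.Start("ExtractWorker.exe",
                          $"--date {entry.Key} --meters {meterList}");
            if (++spawned % 5 == 0)
                Thread.Sleep(TimeSpan.FromMinutes(5));
        }
    }
}
```

One consequence of keying the structure by date is that every one of the 150 workers must connect to all six archives, even if its meters happen to live on only a few of them.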
This program works, but I’m wondering: is there a better method for handling such a large data request?