Finding the most efficient way to handle a large data request

I’m looking for the best way to handle a large data request. By large, I mean that I am given a file containing about 200,000 to 300,000 different meters for up to 150 nonconsecutive days. I might have X meters for day 1, Y meters for day 2, ..., and N meters for day 150. Each meter in the request can be repeated for different dates, up to 30 days per meter.

Our PI system consists of 3 million meters spread across 6 archive servers. Each meter is on only one of the six archives.

I wrote a C# program using the PI SDK to handle this request. Currently, I read the import file and build a data structure keyed by date. I then spawn a process for each date, passing it the list of meters and the specific date as input, i.e. up to 150 processes. (I wait five minutes after every fifth process I spawn.) Each process connects to the 6 archive servers, retrieves the requested data, and writes the meters and their data for that date to a file. Thus, I end up with one file for each date in the data request.
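To make the "data structure keyed by date" concrete, here is a minimal sketch of the grouping step. It assumes the import file has already been parsed into (meter, date) pairs; that row shape and the names `RequestGrouping` and `GroupByDate` are hypothetical, since the actual file format is not described here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RequestGrouping
{
    // Assumption: each row of the import file yields one (meter, date) pair.
    // Returns, for every requested date, the distinct meters requested for it.
    public static Dictionary<DateTime, List<string>> GroupByDate(
        IEnumerable<(string Meter, DateTime Date)> rows)
    {
        return rows
            .GroupBy(r => r.Date.Date) // ignore any time-of-day component
            .ToDictionary(
                g => g.Key,
                g => g.Select(r => r.Meter).Distinct().ToList());
    }
}
```

Each dictionary entry then corresponds to one unit of work (one date and its meter list), which is what gets handed to a worker.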

This program works, but is there a better way to handle such a large data request?

 

  • Even with the 5-minute delay after every X processes, I ended up spawning a lot of threads that didn't complete before the next batch started, which led to this error:

    OSIsoft.AF.PI.PIException: [-11148] Maximum number of concurrent bulk queries limit exceeded
      at OSIsoft.AF.PI.PIPageProcessor`2.ProcessPagesInternal()
      at OSIsoft.AF.PI.PIPageProcessor`2.ProcessPagesInternal(Object objectParameters)
      at System.Threading.Tasks.Task.InnerInvoke()
      at System.Threading.Tasks.Task.Execute()

    So I simply set up a monitor for the number of running processes. If the count reaches a configurable limit, the main program pauses until the count drops. This is now working great with no errors.
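The same idea (cap the number of workers in flight rather than relying on a fixed delay) can be sketched with a `SemaphoreSlim`. Note that this swaps in-process tasks for the separate processes described above, so it is an illustration of the throttling pattern, not the actual program. `ThrottledRunner`, `RunAsync`, and the `processDate` delegate are hypothetical names; `processDate` stands in for whatever retrieves and writes the data for one date.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRunner
{
    // Runs one work item per date, with at most maxConcurrent running at once.
    // processDate is a hypothetical stand-in for the per-date retrieval and file write.
    public static async Task RunAsync(
        IEnumerable<DateTime> dates,
        Func<DateTime, Task> processDate,
        int maxConcurrent)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = dates.Select(async date =>
            {
                await gate.WaitAsync();          // wait for a free slot
                try { await processDate(date); }
                finally { gate.Release(); }      // free the slot even on failure
            }).ToList();                         // ToList starts every task now

            await Task.WhenAll(tasks);
        }
    }
}
```

With this shape, a new date starts as soon as a slot frees up, so the limit adapts to how long each date actually takes. The right value for `maxConcurrent` would still have to be found by experiment against the servers' bulk query limit.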
