Weird behavior processing IEnumerable<AFEventFrame>

I have some code where I search for some event frames, then try to process the results.

The issue I have is that processing the IEnumerable results is very, very slow. It consistently takes my application time from 5 seconds to 45 seconds.

 

Here is the code:

        public void LoadClosedTasks(CurrentShift currentShift)
        {
            //CodeTimer LoadClosedTasksTimer = new CodeTimer();
            if (currentShift == null) return;
            List<AFSearchTokenBase> tokens = AppliedSearchTokens.ToList();                  //Copy the list
            tokens.RemoveAll(x => x.Filter == AFSearchFilter.InProgress);                   //Remove tokens for "In Progress"
            tokens.Add(new AFSearchValueToken("|COMPLETED_SHIFT_ID", currentShift.ShiftID.ToString()));
            IEnumerable<AFEventFrame> RetrievedFrames = null;
            using (AFEventFrameSearch search = new AFEventFrameSearch(LogDatabase, "baseSearch", tokens))
            {
                //DateTime searchStart = DateTime.Now;
                RetrievedFrames = search.FindObjects(0, true, 0);
            }
            ConcurrentBag<OperatorTask> createdFrames = new ConcurrentBag<OperatorTask>();  //Add to concurrent bag to avoid multi-thread issues.

            return;

            Parallel.ForEach(RetrievedFrames, frame =>
                {
                    //OperatorTask frameInstance = new OperatorTask();
                    //frameInstance.PopulateFromFrame(MainForm, frame, this);
                    //createdFrames.Add(frameInstance);

                });
            }

For info, this routine is called using an await Task.Run.

You can see where I place the "return" statement in the code, to bypass the Parallel.ForEach() block. With return there, the application loads (doing lots of other stuff) in 4.5 seconds. If I remove the return statement, the application takes 45 seconds. And this is with the code body in the Parallel.ForEach block commented out, as it shows above! It is not the code in the block that is slow, because there is no code running and I still get the problem.

I also tried a synchronous ForEach across the collection. Same issue.

This suggests to me that the simple act of enumerating the returned search is taking 40 seconds. (Timing it confirms that).

 

So... how many elements is it iterating?

 

Glad you asked. The search results are returning exactly 1 result!

 

Any idea on what is happening here? Search.FindObjects does have LoadFull set to true, but switching it to false doesn't help. Setting the page size doesn't help. I really am at a loss on how to fix this.

Parents
  • You have obfuscated so much of your code that its difficult to focus on the problem. For example, you have an AppliedSearchTokens property that offers NO clues as to its contents. Does it filter on an event frame template? Does it filter on an element?

     

    Does attribute |COMPLETED_SHIFT_ID use a data reference? Some times, never, always?

     

    Performance is not determined by how many event frames are in your result set. It's about how many event frames are sent to the local host performing any client-side filtering.

     

    The RetrievedFrames property should be defined as a List or IList<AFEventFrame>, and the search loop should have

     

     RetrievedFrames = search.FindObjects(0, true, 0).ToList();

     

     

  • Hi Rick,

    AppliedSearchTokens is just a general list of tokens common to searches for a specific asset. There are three:

    1.) AFSearchFilterToken on AfSearchFilter.TemplateName

    2.) AFSearchValueToken on "|SITE" attribute (name data type)

    3.) AFSearchValueToken on "|Completed_Shift_ID", which is the GUID of an EventFrame (EventFrame.ID)

     

    Why do you recommend converting RetrievedFrames to a List or IList? The search.FindObjects method returns IEnumerable<AFEventFrame>; all I want to do is enumerate over it. List is derived from IEnumerable, and I don't need any of the additional properties that List gives me.

  • You did not answer my question about "|Completed_Shift_ID", Does it use a data reference? You say it is a Guid of an event frame. What relation, if any, does the "|Completed_Shift_ID" have to the found event frame's ID property?

     

    Does the "|SITE" attribute have a data reference?

     

    An AFSearch is a disposable object. It applies server-side filters against ALL event frames in the PIFD, and may send event frames to the client for additional client-side filtering. It also is a forward-only iteration. All of that means that if you want to persist found event frames beyond the scope of the disposable search, then you need to persist to the some appropriate object, and IEnumerable<AFEventFrame> is NOT such an object. That way, when you want to process over the found event frames, you do not have to take a lengthy search again, but rather have them readily available in a nice memory structure such as a List<AFEventFrame>. As for the List properties you do not need, may I be so bold as to suggest that you can ignore them?

     

    And, no, List is not derived from IEnumerable. Rather, List implements the IEnumerable interface by providing concrete code for the interface method signatures.

     

  • |Completed_Shift_ID is a string value type, with no data reference. |Site is a String Builder data reference, specifically to %..\..\Element%. All other attributes are String Builders. I understood that the FullLoad parameters means that these are evaluated for the EventFrames returned.

    I understand about disposable objects. Disposing of the Seach object does not dispose the results returned by it. And even if it did, IList derives from IEnumerable - they both store references to the underlying objects, not copies of those objects.

    As I said in my first reply, I tried processing the returned event frames both inside and outside the using. No difference in result, nor performance.

     

  • Good day again, Andrew. I am not trying to be rude, but you do seem to have a fundamental misunderstanding regarding .NET as well as how AFSearch operates. What you call "weird behavior" can be explained. Whether or not you accept the explanation depends upon your willingness to learn and your desire to improve the performance.

     

    IEnumerable<T> is an interface. It declares properties and method signatures for any class that implements the interface. In .NET Framework, an interface lacks any code. Thus, List<T> is NOT derived from IEnumerable<T>. Rather, List<T> implements its own specific code to adhere to the interface signatures required by the IEnumerable<T> interface.

     

    AFSearch classes produce a forward-only collection. The FindObjects method, or specifically the AFEventFrameSearch.FindObjects method does NOT actually return AFEventFrame items. Rather it returns a type of IEnumerable<AFEventFrame>, which potentially can return an enumerable list of the objects found. I say potentially because that returned type must be enumerated in order to bring the individual AFEventFrame items into memory. One way to achieve this is by using a foreach. Another way is to use the LINQ ToList() extension method.

     

    By all means you can keep RetrievedFrames as IEnumerable. But you still need to change your code to:

     

    IEnumerable<AFEventFrame> RetrievedFrames = null;
    using (AFEventFrameSearch search = new AFEventFrameSearch(LogDatabase, "baseSearch", tokens))
    {
        RetrievedFrames = search.FindObjects(0, true, 0).ToList();
    } // search object is disposed and any server-side cache is disposed.

    As of now, you are not using server-side caching. None the less, the variable called search is disposed. The found items have been enumerated into memory and saved. This is the critical step you have been missing.

     

    Now let's look at PERFORMANCE.

     

    Your search query will perform server-side filtering on (1) the template name and (2) |Completed_Shift_ID because |Completed_Shift_ID  is a static attribute, i.e. does not use a data reference.

     

    Quick definitions: a closed event frame is one where the EndTime is assigned, and an open or in-progress event frame is one in which the EndTime is not assigned. Technically, the EndTime always has a value; an open, blank, or seemingly unassigned EndTime is actually equal to AFTime.MaxValue.

     

    For event frames that are NOT in-progress, server-side filtering may be used with "|SITE", but only IF you have captured values. Note value capturing for closed event frames is the default.

     

    If you have not captured values OR the event frame is in-progress, then "|SITE" will be filtered client-side. In which case, a superset of event frames is sent to the client PC, where the AFSearch applies the client-side filters to reduce this superset to the final set, which you can enumerate over INSIDE of your AFSearch block.

     

    You have not shared any particulars about how many total event frames are in your PI System, or how many are used by the requested template. It would be nice to know how many you are truly expecting to be output, as well as a rough guess as to how many are to be sent for client-side filtering. That is, it would be helpful to know how big the superset will be versus the final set.

     

    This help is taken from the REMARKS section from the online help:

     

    <quote>

    To optimize getting items from several pages of search results, a search can be cached by setting the CacheTimeout to a non-zero value. If you will only be getting items from the first page, then it is best to leave the cache disabled. The cache is disabled by default. 

    </quote>

     

    There is a lack of clarity regarding "first page". The number of RPC calls, aka trips to the server, is not determined by the first page of final results, that is event frames that have passed the client-side filter. Rather, it is the first page of the superset of results that have passed the server-side filter. If you think it the search requires an additional trip to fetch more items, then you should employ server-side caching for a nice performance boost. If you think the superset will be less than a full page (default is 1000 items), then you do not need server-side caching.

     

    If you do need server-side caching, this code can be used.

    IEnumerable<AFEventFrame> RetrievedFrames = null;
    using (AFEventFrameSearch search = new AFEventFrameSearch(LogDatabase, "baseSearch", tokens))
    {
        search.CacheTimeout = TimeSpan.FromMinutes(10);
        RetrievedFrames = search.FindObjects(0, true, 0).ToList();
    } // search object is disposed and any server-side cache is disposed.

    I give a very generous cache timeout, but that is okay. Whether your search takes 2 seconds or 2 minutes, the cache will be immediately disposed of thanks to the closing brace on line 6.

     

    So I would suggest you use ToList() and you may want to experiment with using a cache timeout.

     

    Beyond that, there are potential performance benefits to using a "skinny search", that is to say using a FindObjectFields overload. But this would only be worth pursuing if (A) none of the event frames passing server-side filters are in-progress and (B) you have captured values. If neither of those conditions are met, then the skinny search will resort to being a fat one.

     

Reply
  • Good day again, Andrew. I am not trying to be rude, but you do seem to have a fundamental misunderstanding regarding .NET as well as how AFSearch operates. What you call "weird behavior" can be explained. Whether or not you accept the explanation depends upon your willingness to learn and your desire to improve the performance.

     

    IEnumerable<T> is an interface. It declares properties and method signatures for any class that implements the interface. In .NET Framework, an interface lacks any code. Thus, List<T> is NOT derived from IEnumerable<T>. Rather, List<T> implements its own specific code to adhere to the interface signatures required by the IEnumerable<T> interface.

     

    AFSearch classes produce a forward-only collection. The FindObjects method, or specifically the AFEventFrameSearch.FindObjects method does NOT actually return AFEventFrame items. Rather it returns a type of IEnumerable<AFEventFrame>, which potentially can return an enumerable list of the objects found. I say potentially because that returned type must be enumerated in order to bring the individual AFEventFrame items into memory. One way to achieve this is by using a foreach. Another way is to use the LINQ ToList() extension method.

     

    By all means you can keep RetrievedFrames as IEnumerable. But you still need to change your code to:

     

    IEnumerable<AFEventFrame> RetrievedFrames = null;
    using (AFEventFrameSearch search = new AFEventFrameSearch(LogDatabase, "baseSearch", tokens))
    {
        RetrievedFrames = search.FindObjects(0, true, 0).ToList();
    } // search object is disposed and any server-side cache is disposed.

    As of now, you are not using server-side caching. None the less, the variable called search is disposed. The found items have been enumerated into memory and saved. This is the critical step you have been missing.

     

    Now let's look at PERFORMANCE.

     

    Your search query will perform server-side filtering on (1) the template name and (2) |Completed_Shift_ID because |Completed_Shift_ID  is a static attribute, i.e. does not use a data reference.

     

    Quick definitions: a closed event frame is one where the EndTime is assigned, and an open or in-progress event frame is one in which the EndTime is not assigned. Technically, the EndTime always has a value; an open, blank, or seemingly unassigned EndTime is actually equal to AFTime.MaxValue.

     

    For event frames that are NOT in-progress, server-side filtering may be used with "|SITE", but only IF you have captured values. Note value capturing for closed event frames is the default.

     

    If you have not captured values OR the event frame is in-progress, then "|SITE" will be filtered client-side. In which case, a superset of event frames is sent to the client PC, where the AFSearch applies the client-side filters to reduce this superset to the final set, which you can enumerate over INSIDE of your AFSearch block.

     

    You have not shared any particulars about how many total event frames are in your PI System, or how many are used by the requested template. It would be nice to know how many you are truly expecting to be output, as well as a rough guess as to how many are to be sent for client-side filtering. That is, it would be helpful to know how big the superset will be versus the final set.

     

    This help is taken from the REMARKS section from the online help:

     

    <quote>

    To optimize getting items from several pages of search results, a search can be cached by setting the CacheTimeout to a non-zero value. If you will only be getting items from the first page, then it is best to leave the cache disabled. The cache is disabled by default. 

    </quote>

     

    There is a lack of clarity regarding "first page". The number of RPC calls, aka trips to the server, is not determined by the first page of final results, that is event frames that have passed the client-side filter. Rather, it is the first page of the superset of results that have passed the server-side filter. If you think it the search requires an additional trip to fetch more items, then you should employ server-side caching for a nice performance boost. If you think the superset will be less than a full page (default is 1000 items), then you do not need server-side caching.

     

    If you do need server-side caching, this code can be used.

    IEnumerable<AFEventFrame> RetrievedFrames = null;
    using (AFEventFrameSearch search = new AFEventFrameSearch(LogDatabase, "baseSearch", tokens))
    {
        search.CacheTimeout = TimeSpan.FromMinutes(10);
        RetrievedFrames = search.FindObjects(0, true, 0).ToList();
    } // search object is disposed and any server-side cache is disposed.

    I give a very generous cache timeout, but that is okay. Whether your search takes 2 seconds or 2 minutes, the cache will be immediately disposed of thanks to the closing brace on line 6.

     

    So I would suggest you use ToList() and you may want to experiment with using a cache timeout.

     

    Beyond that, there are potential performance benefits to using a "skinny search", that is to say using a FindObjectFields overload. But this would only be worth pursuing if (A) none of the event frames passing server-side filters are in-progress and (B) you have captured values. If neither of those conditions are met, then the skinny search will resort to being a fat one.

     

Children
No Data