eZ Publish Fetch Functions Optimization

by Serhey Dolgushev | May 14, 2012 10:51 am

Whenever we need a list of articles, users or comments in eZ Publish, we use fetch functions. Fetch functions can be used both in templates (http://doc.ez.no/eZ-Publish/Technical-manual/4.x/Reference/Modules/content/Fetch-functions/list[1], http://doc.ez.no/eZ-Publish/Technical-manual/4.x/Reference/Modules/content/Fetch-functions/tree[2]) and in PHP code (https://github.com/ezsystems/ezpublish/blob/master/kernel/classes/ezcontentobjecttreenode.php#L1837[3]). Because they are used so widely, it is easy to run into problems when you need to fetch a large number of nodes.


Below is a simple example of using eZ Publish fetch functions: a command-line script that fetches all published nodes. Here is the source code:


<?php
// Let the script run as long as it needs and give it a generous memory limit.
set_time_limit( 0 );
ini_set( 'memory_limit', '2048M' );

require 'autoload.php';

$cli = eZCLI::instance();
$cli->setUseStyles( true );

$scriptSettings = array();
$scriptSettings['description']    = 'Fetches all nodes and iterates them';
$scriptSettings['use-session']    = true;
$scriptSettings['use-modules']    = true;
$scriptSettings['use-extensions'] = true;
$scriptSettings['site-access']    = 'siteadmin';

$script = eZScript::instance( $scriptSettings );
$script->startup();
$script->initialize();

$cli->output( str_repeat( '-', 64 ) );
$cli->output( 'Starting script...' );
$cli->output( str_repeat( '-', 64 ) );

$startTime = microtime( true );

// Fetch the whole subtree below the root node (node ID 1),
// together with the data map of every object.
$nodes = eZContentObjectTreeNode::subTreeByNodeID(
    array(
        'Depth'       => false,
        'Limitation'  => array(),
        'LoadDataMap' => true
    ),
    1
);

$count = count( $nodes );
foreach( $nodes as $key => $node ) {
    $object = $node->attribute( 'object' );
    if( $object instanceof eZContentObject === false ) {
        continue;
    }

    $dataMap = $object->attribute( 'data_map' );

    // Report progress and memory usage every 100 nodes.
    if( $key % 100 === 0 ) {
        $memoryUsage = number_format( memory_get_usage( true ) / ( 1024 * 1024 ), 2 );
        $output = number_format( $key / $count * 100, 2 ) . '% (' . ( $key + 1 ) . '/' . $count . ')';
        $output .= ', Memory usage: ' . $memoryUsage . ' Mb';
        $cli->output( $output );
    }
}

$executionTime = round( microtime( true ) - $startTime, 2 );

$cli->output( str_repeat( '-', 64 ) );
$cli->output( 'Script took ' . $executionTime . ' secs.' );
$cli->output( str_repeat( '-', 64 ) );

$script->shutdown( 0 );

?>

After running the script we got the following result:

$ php extension/nxc_test/bin/php/fetch_many_nodes.php
----------------------------------------------------------------
Starting script...
----------------------------------------------------------------
0.00% (1/1384), Memory usage: 63.50 Mb
7.23% (101/1384), Memory usage: 63.50 Mb
14.45% (201/1384), Memory usage: 63.50 Mb
21.68% (301/1384), Memory usage: 63.50 Mb
28.90% (401/1384), Memory usage: 63.50 Mb
36.13% (501/1384), Memory usage: 63.50 Mb
43.35% (601/1384), Memory usage: 63.50 Mb
50.58% (701/1384), Memory usage: 63.50 Mb
57.80% (801/1384), Memory usage: 63.50 Mb
65.03% (901/1384), Memory usage: 63.50 Mb
72.25% (1001/1384), Memory usage: 63.50 Mb
79.48% (1101/1384), Memory usage: 63.50 Mb
86.71% (1201/1384), Memory usage: 63.50 Mb
93.93% (1301/1384), Memory usage: 63.50 Mb
----------------------------------------------------------------
Script took 2.95 secs.
----------------------------------------------------------------

If your eZ Publish installation contains tens of thousands of published nodes, this script will most likely fail, because all nodes are fetched together with their data maps. The resulting SQL query becomes too large and complicated, and your MySQL server will most likely be overloaded while it executes. So think carefully before using the LoadDataMap option in eZ Publish fetch functions without limiting the number of fetched nodes.
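
One possible way to limit a fetch is to page through the nodes in fixed-size batches instead of requesting everything at once. The following is only a minimal sketch of that idea, not code from the original scripts: processInBatches, $fetchBatch and $handleItem are hypothetical names, and a plain array stands in for the content tree so that the example runs without eZ Publish.

```php
<?php
/**
 * Calls $fetchBatch( $offset, $limit ) until it returns fewer than $batchSize
 * items and hands every item to $handleItem. Returns the number of items handled.
 * Both callbacks are hypothetical placeholders for this sketch.
 */
function processInBatches( $fetchBatch, $handleItem, $batchSize = 100 )
{
    $offset    = 0;
    $processed = 0;

    while( true ) {
        $batch = $fetchBatch( $offset, $batchSize );
        if( is_array( $batch ) === false ) {
            break; // treat a failed fetch as the end of the data
        }

        foreach( $batch as $item ) {
            $handleItem( $item );
            $processed++;
        }

        if( count( $batch ) < $batchSize ) {
            break; // a short (or empty) batch means there is nothing left
        }
        $offset += $batchSize;
    }

    return $processed;
}

// Stand-in data source so the sketch runs without eZ Publish.
$allItems   = range( 1, 250 );
$fetchBatch = function( $offset, $limit ) use ( $allItems ) {
    return array_slice( $allItems, $offset, $limit );
};

// Stand-in per-item work: add up the values.
$sum        = 0;
$handleItem = function( $item ) use ( &$sum ) {
    $sum += $item;
};

$processed = processInBatches( $fetchBatch, $handleItem, 100 );
echo $processed . ' items processed' . PHP_EOL;
```

In a real script, $fetchBatch would presumably call eZContentObjectTreeNode::subTreeByNodeID with 'Offset' => $offset and 'Limit' => $limit added to the parameters shown above, so that each query and each result set stays small. Treat that wiring as an assumption to verify against your eZ Publish version.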

Let’s try running the script with the LoadDataMap option disabled:

$nodes = eZContentObjectTreeNode::subTreeByNodeID(
    array(
        'Depth'       => false,
        'Limitation'  => array(),
        'LoadDataMap' => false
    ),
    1
);

 

Results:

$ php extension/nxc_test/bin/php/fetch_many_nodes_as_object.php
----------------------------------------------------------------
Starting script...
----------------------------------------------------------------
0.00% (1/1384), Memory usage: 22.50 Mb
7.23% (101/1384), Memory usage: 22.50 Mb
14.45% (201/1384), Memory usage: 23.75 Mb
21.68% (301/1384), Memory usage: 26.75 Mb
28.90% (401/1384), Memory usage: 29.00 Mb
36.13% (501/1384), Memory usage: 31.50 Mb
43.35% (601/1384), Memory usage: 33.75 Mb
50.58% (701/1384), Memory usage: 36.00 Mb
57.80% (801/1384), Memory usage: 38.25 Mb
65.03% (901/1384), Memory usage: 40.75 Mb
72.25% (1001/1384), Memory usage: 43.25 Mb
79.48% (1101/1384), Memory usage: 45.75 Mb
86.71% (1201/1384), Memory usage: 48.25 Mb
93.93% (1301/1384), Memory usage: 51.25 Mb
----------------------------------------------------------------
Script took 2.16 secs.
----------------------------------------------------------------

Here it is easy to see that memory usage grows with each iteration. Again, if your eZ Publish installation contains a lot of published nodes, this script will probably fail because it runs out of memory. To avoid this kind of “memory leak”, you need to clear the cache and reset the object’s data map at the end of each iteration. This can be done with the following code:

eZContentObject::clearCache( $object->attribute( 'id' ) );
$object->resetDataMap();

Each time an eZ Publish object is fetched, it is stored in the eZContentObjectContentObjectCache variable (https://github.com/ezsystems/ezpublish/blob/master/kernel/classes/ezcontentobject.php#L833[15]). eZContentObject::clearCache removes the given object from that variable.
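
To see why such a cache makes memory grow in a long loop, here is a toy model of the pattern. It is not eZ Publish code: ToyObject and its methods are hypothetical stand-ins that only mimic the idea of a per-ID object cache with a matching clear call.

```php
<?php
// Toy model only: ToyObject is a hypothetical stand-in, not eZ Publish code.
class ToyObject
{
    // Every fetched object is kept here, keyed by ID, until it is cleared.
    private static $cache = array();

    public $id;

    private function __construct( $id )
    {
        $this->id = $id;
    }

    // Returns the cached object for $id, creating and caching it on first use.
    public static function fetch( $id )
    {
        if( isset( self::$cache[$id] ) === false ) {
            self::$cache[$id] = new ToyObject( $id );
        }
        return self::$cache[$id];
    }

    // Drops the cache entry for $id so the object can be garbage collected.
    public static function clearCache( $id )
    {
        unset( self::$cache[$id] );
    }

    public static function cacheSize()
    {
        return count( self::$cache );
    }
}

// Loop 1: nothing is cleared, so the cache keeps a reference to every object.
for( $id = 1; $id <= 1000; $id++ ) {
    $object = ToyObject::fetch( $id );
}
$sizeWithoutClearing = ToyObject::cacheSize();

// Empty the cache before the second run.
for( $id = 1; $id <= 1000; $id++ ) {
    ToyObject::clearCache( $id );
}

// Loop 2: each object is removed from the cache once we are done with it.
for( $id = 1; $id <= 1000; $id++ ) {
    $object = ToyObject::fetch( $id );
    ToyObject::clearCache( $object->id );
}
$sizeWithClearing = ToyObject::cacheSize();

echo $sizeWithoutClearing . ' vs ' . $sizeWithClearing . PHP_EOL;
```

In the first loop the cache ends up holding all 1000 objects. In the second it is empty again after every iteration, which is the effect the clearCache call is meant to have in the real script.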

Let’s clear the cache and reset the object’s data map at the end of each iteration:

foreach( $nodes as $key => $node ) {
    $object = $node->attribute( 'object' );
    if( $object instanceof eZContentObject === false ) {
        continue;
    }

    $dataMap = $object->attribute( 'data_map' );

    eZContentObject::clearCache( $object->attribute( 'id' ) );
    $object->resetDataMap();

    if( $key % 100 === 0 ) {
        $memoryUsage = number_format( memory_get_usage( true ) / ( 1024 * 1024 ), 2 );
        $output = number_format( $key / $count * 100, 2 ) . '% (' . ( $key + 1 ) . '/' . $count . ')';
        $output .= ', Memory usage: ' . $memoryUsage . ' Mb';
        $cli->output( $output );
    }
}

And run the script:

$ php extension/nxc_test/bin/php/fetch_many_nodes_as_object_clear_memory.php
----------------------------------------------------------------
Starting script...
----------------------------------------------------------------
0.00% (1/1384), Memory usage: 22.50 Mb
7.23% (101/1384), Memory usage: 22.50 Mb
14.45% (201/1384), Memory usage: 22.50 Mb
21.68% (301/1384), Memory usage: 22.50 Mb
28.90% (401/1384), Memory usage: 22.50 Mb
36.13% (501/1384), Memory usage: 22.50 Mb
43.35% (601/1384), Memory usage: 22.50 Mb
50.58% (701/1384), Memory usage: 22.50 Mb
57.80% (801/1384), Memory usage: 22.50 Mb
65.03% (901/1384), Memory usage: 22.50 Mb
72.25% (1001/1384), Memory usage: 22.50 Mb
79.48% (1101/1384), Memory usage: 22.50 Mb
86.71% (1201/1384), Memory usage: 22.50 Mb
93.93% (1301/1384), Memory usage: 22.50 Mb
----------------------------------------------------------------
Script took 2.46 secs.
----------------------------------------------------------------

That’s much better than last time, but we can use even less memory. The $nodes array contains eZContentObjectTreeNode objects. If it contained plain arrays instead, storing it would take much less memory. The AsObject option lets us achieve this:

$nodes = eZContentObjectTreeNode::subTreeByNodeID(
    array(
        'Depth'       => false,
        'Limitation'  => array(),
        'LoadDataMap' => false,
        'AsObject'    => false
    ),
    1
);

$count = count( $nodes );
foreach( $nodes as $key => $node ) {
    // With AsObject disabled, $node is a plain array, so the content object
    // has to be fetched explicitly by its ID.
    $object = eZContentObject::fetch( $node['contentobject_id'] );
    if( $object instanceof eZContentObject === false ) {
        continue;
    }

    $dataMap = $object->attribute( 'data_map' );

    // Free the memory held for this object before moving on.
    eZContentObject::clearCache( $object->attribute( 'id' ) );
    $object->resetDataMap();

    if( $key % 100 === 0 ) {
        $memoryUsage = number_format( memory_get_usage( true ) / ( 1024 * 1024 ), 2 );
        $output = number_format( $key / $count * 100, 2 ) . '% (' . ( $key + 1 ) . '/' . $count . ')';
        $output .= ', Memory usage: ' . $memoryUsage . ' Mb';
        $cli->output( $output );
    }
}

Results:

$ php extension/nxc_test/bin/php/fetch_many_nodes_as_array_clear_memory.php
----------------------------------------------------------------
Starting script...
----------------------------------------------------------------
0.00% (1/1384), Memory usage: 16.50 Mb
7.23% (101/1384), Memory usage: 16.75 Mb
14.45% (201/1384), Memory usage: 16.75 Mb
21.68% (301/1384), Memory usage: 16.75 Mb
28.90% (401/1384), Memory usage: 16.75 Mb
36.13% (501/1384), Memory usage: 16.75 Mb
43.35% (601/1384), Memory usage: 16.75 Mb
50.58% (701/1384), Memory usage: 16.75 Mb
57.80% (801/1384), Memory usage: 16.75 Mb
65.03% (901/1384), Memory usage: 16.75 Mb
72.25% (1001/1384), Memory usage: 16.75 Mb
79.48% (1101/1384), Memory usage: 16.75 Mb
86.71% (1201/1384), Memory usage: 16.75 Mb
93.93% (1301/1384), Memory usage: 16.75 Mb
----------------------------------------------------------------
Script took 2.3 secs.
----------------------------------------------------------------

In this case the saving is modest (16.75 Mb instead of 22.50 Mb), but the more nodes you fetch, the bigger the difference becomes.

I hope this post was useful to you. Perhaps you can suggest other ways to optimize performance in the comments?

Endnotes:
  1. http://doc.ez.no/eZ-Publish/Technical-manual/4.x/Reference/Modules/content/Fetch-functions/list
  2. http://doc.ez.no/eZ-Publish/Technical-manual/4.x/Reference/Modules/content/Fetch-functions/tree
  3. https://github.com/ezsystems/ezpublish/blob/master/kernel/classes/ezcontentobjecttreenode.php#L1837
  15. https://github.com/ezsystems/ezpublish/blob/master/kernel/classes/ezcontentobject.php#L833

Source URL: http://blog.nxcgroup.com/2012/ez-publish-fetch-functions-optimization/