Hello folks,
Just wanted to share a few thoughts I have.
With a few big API Platform projects, I see some bad tendencies happening again and again that totally crash performance in both the dev and prod environments. In the dev environment it's quite difficult to live with.
1/ Custom normalizers rarely implement the CacheableSupportsMethodInterface interface.
Normalization: we have a fully hydrated entity, so it's easy to make the supports check cacheable with an instanceof against an interface.
Denormalization: we have an array of yet-unknown data and a string representing the class to hydrate. We use class_implements on the class-name string to check support, but it's not as nice.
If we are not able to use the cache on denormalization, it's better to split the normalizer and the denormalizer into two different classes so the normalizer can be cacheable even if the denormalizer is not.
It's really, really important that the normalizer be cacheable.
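To make the split concrete, here is a minimal sketch (TimestampableInterface and the payload shape are invented for the example): the normalizer's supports check depends only on the type, so it advertises itself as cacheable, while the denormalizer keeps its class_implements check and simply doesn't claim cacheability.

```php
<?php

use Symfony\Component\Serializer\Normalizer\CacheableSupportsMethodInterface;
use Symfony\Component\Serializer\Normalizer\DenormalizerInterface;
use Symfony\Component\Serializer\Normalizer\NormalizerInterface;

// Invented interface, just to have something to test against.
interface TimestampableInterface
{
    public function getUpdatedAt(): \DateTimeImmutable;

    public function setUpdatedAt(\DateTimeImmutable $updatedAt): void;
}

final class TimestampableNormalizer implements NormalizerInterface, CacheableSupportsMethodInterface
{
    public function normalize($object, $format = null, array $context = [])
    {
        return ['updatedAt' => $object->getUpdatedAt()->format(\DATE_ATOM)];
    }

    public function supportsNormalization($data, $format = null)
    {
        // The decision depends only on the type of $data, never on its
        // contents or the context, so caching it per type/format is safe.
        return $data instanceof TimestampableInterface;
    }

    public function hasCacheableSupportsMethod(): bool
    {
        return true;
    }
}

final class TimestampableDenormalizer implements DenormalizerInterface
{
    public function denormalize($data, $type, $format = null, array $context = [])
    {
        $object = new $type();
        $object->setUpdatedAt(new \DateTimeImmutable($data['updatedAt']));

        return $object;
    }

    public function supportsDenormalization($data, $type, $format = null)
    {
        // Only a class-name string and raw data here: class_implements is
        // the best we can do. This class does not claim cacheability, and
        // since it is separate it no longer penalizes the normalizer above.
        return class_exists($type)
            && \in_array(TimestampableInterface::class, class_implements($type), true);
    }
}
```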
On our application, one critical route produces 36,000 getNormalizer calls. 4 normalizers were not cacheable, so that's 36,000 × 4 supportsNormalization calls, even for simple stuff like native types.
Of course this route is cached in prod mode, but it's not in dev mode.
Suggestion: this is so critical that perhaps it should be visible somewhere in the profiler.
2/ IriConverter::getIriFromItem is costly when it's called.
What happens?
- on normalization, we arrive at ApiPlatform\Core\Serializer\ItemNormalizer, and then at ApiPlatform\Core\Serializer\AbstractItemNormalizer::normalize
- as $context['resources'] seems to always be set, it calls $resource = $context['iri'] ?? $this->iriConverter->getIriFromItem($object);
- it does two things:
- find the identifier (in our case "id" 100% of the time). This is the worst part: for each call, it goes straight to phpDocumentor on the file itself just to extract "hey, the identifier property is 'id'".
- generate the route via the router
As we have a React app making multiple HTTP calls to the API backend, this creates contention on the file reads. And since we are on macOS with NFS and very poor disk-access performance, every filesystem call costs us a lot.
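Roughly, the two steps boil down to something like this. It's a paraphrase of the behaviour described above, not the actual IriConverter code, and the function name is made up:

```php
<?php

use ApiPlatform\Core\Api\IdentifiersExtractorInterface;
use Symfony\Component\Routing\RouterInterface;

// Hypothetical paraphrase of what happens for every item being normalized.
function computeItemIri(
    $item,
    string $itemRouteName,
    IdentifiersExtractorInterface $identifiersExtractor,
    RouterInterface $router
): string {
    // Step 1 -- the expensive part in dev: resolve which property is the
    // identifier and read its value. With the metadata cache disabled this
    // ends up re-parsing the class (phpDocumentor & co.) on every call.
    $identifiers = $identifiersExtractor->getIdentifiersFromItem($item);

    // Step 2: generate the item route from those identifiers.
    return $router->generate($itemRouteName, $identifiers);
}
```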
There are multiple layers of caching, all disabled in test/dev mode via ApiPlatformExtension::registerCacheConfiguration. We can turn the cache back on by setting the api_platform.metadata_cache parameter to true in our app, but that enables the cache for every part of the stack. If we are in dev mode, it's because we touch the resource & serialization config a lot; but we NEVER change the identifier property. Losing the cache entirely is a very big all-or-nothing performance setback.
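For reference, flipping that parameter back on is just something like this sketch (the same thing works as a one-liner under parameters: in YAML), and as said above it re-enables every metadata cache at once:

```php
<?php
// config/services.php -- sketch of re-enabling the metadata cache in dev by
// defining the parameter mentioned above. It is all-or-nothing: every
// metadata/serialization cache comes back, not just the identifier one.

namespace Symfony\Component\DependencyInjection\Loader\Configurator;

return static function (ContainerConfigurator $configurator): void {
    $configurator->parameters()
        ->set('api_platform.metadata_cache', true);
};
```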
Furthermore, on my demo call, I have 320 different entities in my response, but some of them appear many times. The files get loaded and extracted with phpDocumentor 1590 times. I suppose avoiding that is the role of ApiPlatform\Core\Api\CachedIdentifiersExtractor::getKeys, but I didn't see it working. I will check.
patch: static cache on IriConverter::getIriFromItem
1590 => 320 calls
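The patch is essentially a per-request memoization. A rough sketch of the idea (MemoizedIriConverter and the delegation shape are mine, not the actual patch):

```php
<?php

use ApiPlatform\Core\Api\IriConverterInterface;

// Sketch of the "static cache" idea: remember the IRI already computed for a
// given object, so the fifth occurrence of the same entity in a response does
// not trigger identifier extraction + route generation again.
final class MemoizedIriConverter
{
    /** @var IriConverterInterface */
    private $decorated;

    /** @var \SplObjectStorage IRIs already computed, keyed by object identity */
    private $memo;

    public function __construct(IriConverterInterface $decorated)
    {
        $this->decorated = $decorated;
        $this->memo = new \SplObjectStorage();
    }

    public function getIriFromItem($item): string
    {
        // The real method also takes a reference-type argument, ignored here
        // to keep the sketch short.
        if (isset($this->memo[$item])) {
            return $this->memo[$item];
        }

        return $this->memo[$item] = $this->decorated->getIriFromItem($item);
    }

    // A real decorator would implement IriConverterInterface and delegate
    // the remaining methods to $this->decorated.
}
```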
3/ IriConverter::getIriFromItem is called ALL. THE. TIME. Funny thing is: there are no IRI routes in my response, so we compute it for nothing (in our case). Did I miss something where it's useful? Removing it gives us the exact same JSON response while giving us a real performance gain.
patch: remove the forced call to the IriConverter in ApiPlatform\Core\Serializer\AbstractItemNormalizer::normalize
320 calls => 19 (generated by cyclic calls and the max-depth config)
// Questions
- why is the IRI computed all the time? Can we safely remove it?
- can we set the identifier once and for all in the resource config itself, to avoid going through phpDocumentor?
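On that second question, what I have in mind is roughly this hedged sketch (Book is a made-up resource; whether an explicit @ApiProperty(identifier=true) actually lets the metadata layer skip the phpDocumentor pass is exactly what I'd like to know):

```php
<?php

use ApiPlatform\Core\Annotation\ApiProperty;
use ApiPlatform\Core\Annotation\ApiResource;

/**
 * @ApiResource
 */
class Book
{
    /**
     * Identifier declared explicitly in the resource mapping, so that
     * nothing has to be inferred from the file at runtime.
     *
     * @ApiProperty(identifier=true)
     */
    private $id;
}
```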
Do you have any advice / information / recommendation?
Thanks for reading.