I have to update my Doctrine entities to match records in a (potentially very large) XML file. I also have to update the ManyToMany associations according to the data in the XML. This is what I do inside a loop:
- get data from XML
- get entity from DB (if it does not exist, create a new one)
- set new entity properties
- get current entity associations (the getter returns an `ArrayCollection` object)
- clear all associations (by calling `ArrayCollection::clear()`)
- set new associations (by calling `ArrayCollection::add()` in a sub-loop)
- persist the entity via the EntityManager

After the loop I call `EntityManager::flush()`.
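The steps above can be sketched roughly as follows (a minimal sketch; `Product`, `Category`, the `code` field and the XML layout are hypothetical placeholders for the real entities and input):

```php
<?php
// Hypothetical import loop; $xml is a SimpleXMLElement, $em an EntityManager.
foreach ($xml->record as $record) {
    // 1. get data from XML
    $code = (string) $record->code;

    // 2. get entity from DB, or create a new one
    $product = $em->getRepository(Product::class)
                  ->findOneBy(['code' => $code]) ?? new Product();

    // 3. set new entity properties
    $product->setCode($code);
    $product->setName((string) $record->name);

    // 4.-6. replace the ManyToMany associations: clear, then re-add
    $product->getCategories()->clear();
    foreach ($record->categories->category as $categoryCode) {
        $category = $em->getRepository(Category::class)
                       ->findOneBy(['code' => (string) $categoryCode]);
        $product->getCategories()->add($category);
    }

    // 7. persist (a no-op for already-managed entities, required for new ones)
    $em->persist($product);
}

// one flush after the loop — this is where all the queries are issued
$em->flush();
```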
The problem is that flushing generates a large number of queries instead of updating/inserting/deleting multiple rows at once. For every entity the following queries are executed:
- SELECT to get entity from DB
- UPDATE to update entity properties (this is actually skipped now as no properties changed ... yet)
- DELETE to clear previous associations
- INSERT to insert new associations
So in total, for 305 records in the XML I get 915 queries (I guess it could go up to 1220 queries if all entities had changed), which makes the import very slow.
I could take advantage of the IdentityMap and pre-fetch the entities before the loop, but there are still the UPDATE/DELETE/INSERT queries.
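The pre-fetch idea would at least collapse the per-entity SELECTs into one query; a sketch, again assuming a hypothetical `Product` entity with a `code` field:

```php
<?php
// Hypothetical pre-fetch: one SELECT ... WHERE code IN (...) instead of
// 305 individual SELECTs. The loaded entities end up in Doctrine's
// identity map, so lookups inside the import loop hit no further queries.
$codes = [];
foreach ($xml->record as $record) {
    $codes[] = (string) $record->code;
}

$products = $em->createQuery(
        'SELECT p FROM App\Entity\Product p WHERE p.code IN (:codes)'
    )
    ->setParameter('codes', $codes)
    ->getResult();

// index by code for O(1) lookup inside the import loop
$productsByCode = [];
foreach ($products as $product) {
    $productsByCode[$product->getCode()] = $product;
}
```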
- Is there a way to make the flush method optimize the queries better (use multi-inserts, WHERE IN instead of multiple DELETE queries, etc.)?
- Is this normal behaviour of the flush method, or am I doing something wrong?
- Perhaps there is a problem in the way I update the entity's associations. Is there a better way to do this? (instead of the "get/clear/add" method)
- I am aware that Doctrine is not intended for mass batch processing, but I think using it for XML imports is the best way to avoid the DB inconsistencies which could appear with a non-ORM approach. Is that right?
- If the approach above is wrong, how should I solve the problem?