I’m facing an issue with my Symfony application. I have a ManyToMany relation which I need to persist and flush. However when the ManyToMany relation gets too big it becomes very slow on flushing the data to the database.
I’ve tried adding fetch="EXTRA_LAZY"
however, this seems to have no effect.
My models look like this (i’ve removed some stuff which does not matter):
Image model
<?php
/**
* @ORMEntity(repositoryClass=ImageRepository::class)
* @ORMHasLifecycleCallbacks
*/
class Image
{
/**
* @ORMManyToMany(targetEntity=Content::class, inversedBy="images", fetch="EXTRA_LAZY", cascade={"persist"})
*/
private $content;
public function __construct()
{
$this->content = new ArrayCollection();
}
/**
* @return Collection|Content[]
*/
public function getContent(): Collection
{
return $this->content;
}
public function addContent(Content $content): self
{
if (!$this->content->contains($content)) {
$this->content->add($content);
}
return $this;
}
public function removeContent(Content $content): self
{
$this->content->removeElement($content);
return $this;
}
}
Content model
<?php
/**
* @ORMEntity(repositoryClass=ContentRepository::class)
* @ORMHasLifecycleCallbacks
*/
class Content
{
/**
* @ORMManyToMany(targetEntity=Image::class, mappedBy="content", fetch="EXTRA_LAZY", cascade={"persist", "remove"})
*/
private $images;
/**
* @return Collection|Image[]
*/
public function getImages(): Collection
{
return $this->images;
}
public function setImages(ArrayCollection $images): void
{
$this->images = $images;
}
public function addImage(Image $image): self
{
if (!$this->images->contains($image)) {
$this->image->add($image);
$image->addContent($this);
}
return $this;
}
public function removeImage(Image $image): self
{
if ($this->images->removeElement($image)) {
$image->removeContent($this);
}
return $this;
}
}
The importing process becomes fast if I uncomment the $this->content[] = $content;
in the Image model. If I do not uncomment this line flushing data grows exponentially. I have replaced this with the preferred $this->content->add($content)
Doing this haven’t improved performance. (on both sides, for Image
and Content
).
We’re running Symfony version 5.3.16 and this application runs on a machine with SSD’s for de Database server, I assume this is not a hardware problem.
About the saving the content/image records.
We’re flushing after every insert. Every 150 records we do a:
$this->entityManager->clear();
gc_collect_cycles();
Does someone have an idea to push me in the right direction?
2
Answers
It happens because the calling method
$this->images->contains($image)
always load all of your collection to the ProxyObject to check the existence
https://www.doctrine-project.org/projects/doctrine-collections/en/stable/index.html#contains
https://www.doctrine-project.org/projects/doctrine-orm/en/2.14/tutorials/extra-lazy-associations.html
After calling
$em->clear()
your collection always emptytry construction like:
it will load only the keys instead of all Image objects
Also check SQL logs in these place
I’ll give you my view of your issue which can be discutable of course and i will be happy if people give counterview.
I’ve experienced such case in the past, and each time, the conclusion is the same.
To be honest, this is a typical example where the magic does not happen anymore when using Symfony / Doctrine.
And this is "normal".
Those tools are made to cover as much case as possible to improve developer efficiency to focus on making important things inside the codebase and not re-creating existing behavior. It lead to a 80-98% fast feature most of the time.
BUT
When performance needs to be improve even if code seems great, you will have to go back to more "manual" code. And yes using
Collection
feature on large dataset can lead to bad performance.From my point of view you have multiple solutions you can try :
1. Define the join table explicitly
Instead of having a ManyToMany. Create the JoinTable as an entity.
You wont have to change your database if you properly create it.
See : https://symfonycasts.com/screencast/doctrine-relations/complex-many-to-many
for improved explanation. It simple two OneToMany relationship on your Content and Image entity with a new entity ImageContent with two ManyToOne.
This way you will be able to directly remove a relation between image and content in a much easier way.
For example:
And when adding
This way you never have to play with the collection
Just be carefull to often clear entity manager while importing large dataset, look at my answer here : Getting a managed and dirty entity error on persist
for more precision about this, you fill some solutions to handle large dataset also.
2. Use pure SQL
If you do not want to try the upper solution. Just add or remove association in pure SQL called from repository. Something like that :
Then you can do the same for insert also.
Well i hope it helped. Don’t hesitate to ask question in comment.