skip to Main Content

I’m facing an issue with my Symfony application. I have a ManyToMany relation which I need to persist and flush. However when the ManyToMany relation gets too big it becomes very slow on flushing the data to the database.

I’ve tried adding fetch="EXTRA_LAZY" however, this seems to have no effect.

My models look like this (i’ve removed some stuff which does not matter):

Image model

<?php

/**
 * @ORMEntity(repositoryClass=ImageRepository::class)
 * @ORMHasLifecycleCallbacks
 */
class Image
{
    /**
     * @ORMManyToMany(targetEntity=Content::class, inversedBy="images", fetch="EXTRA_LAZY", cascade={"persist"})
     */
    private $content;

    public function __construct()
    {
        $this->content = new ArrayCollection();
    }
    /**
     * @return Collection|Content[]
     */
    public function getContent(): Collection
    {
        return $this->content;
    }
    public function addContent(Content $content): self
    {
        if (!$this->content->contains($content)) {
            $this->content->add($content);
        }

        return $this;
    }
    public function removeContent(Content $content): self
    {
        $this->content->removeElement($content);
        return $this;
    }
}

Content model

<?php
/**
 * @ORMEntity(repositoryClass=ContentRepository::class)
 * @ORMHasLifecycleCallbacks
 */
class Content
{
    /**
     * @ORMManyToMany(targetEntity=Image::class, mappedBy="content", fetch="EXTRA_LAZY", cascade={"persist", "remove"})
     */
    private $images;
    /**
     * @return Collection|Image[]
     */
    public function getImages(): Collection
    {
        return $this->images;
    }
    public function setImages(ArrayCollection $images): void
    {
        $this->images = $images;
    }
    public function addImage(Image $image): self
    {
        if (!$this->images->contains($image)) {
            $this->image->add($image);
            $image->addContent($this);
        }
        return $this;
    }
    public function removeImage(Image $image): self
    {
        if ($this->images->removeElement($image)) {
            $image->removeContent($this);
        }
        return $this;
    }
}

The importing process becomes fast if I uncomment the $this->content[] = $content; in the Image model. If I do not uncomment this line flushing data grows exponentially. I have replaced this with the preferred $this->content->add($content) Doing this haven’t improved performance. (on both sides, for Image and Content).

We’re running Symfony version 5.3.16 and this application runs on a machine with SSD’s for de Database server, I assume this is not a hardware problem.

About the saving the content/image records.
We’re flushing after every insert. Every 150 records we do a:

$this->entityManager->clear();
gc_collect_cycles();

Does someone have an idea to push me in the right direction?

2

Answers


  1. It happens because the calling method
    $this->images->contains($image)
    always load all of your collection to the ProxyObject to check the existence

    https://www.doctrine-project.org/projects/doctrine-collections/en/stable/index.html#contains

    https://www.doctrine-project.org/projects/doctrine-orm/en/2.14/tutorials/extra-lazy-associations.html

    After calling $em->clear() your collection always empty

    try construction like:

    $this->images->containsKey($image->getId())
    

    it will load only the keys instead of all Image objects

    Also check SQL logs in these place

    Login or Signup to reply.
  2. I’ll give you my view of your issue which can be discutable of course and i will be happy if people give counterview.
    I’ve experienced such case in the past, and each time, the conclusion is the same.

    To be honest, this is a typical example where the magic does not happen anymore when using Symfony / Doctrine.
    And this is "normal".

    Those tools are made to cover as much case as possible to improve developer efficiency to focus on making important things inside the codebase and not re-creating existing behavior. It lead to a 80-98% fast feature most of the time.

    BUT

    When performance needs to be improve even if code seems great, you will have to go back to more "manual" code. And yes using Collection feature on large dataset can lead to bad performance.

    From my point of view you have multiple solutions you can try :

    1. Define the join table explicitly

    Instead of having a ManyToMany. Create the JoinTable as an entity.
    You wont have to change your database if you properly create it.
    See : https://symfonycasts.com/screencast/doctrine-relations/complex-many-to-many
    for improved explanation. It simple two OneToMany relationship on your Content and Image entity with a new entity ImageContent with two ManyToOne.

    This way you will be able to directly remove a relation between image and content in a much easier way.
    For example:

    $imageContent = $this->contentRepository->findOneBy(['image' => $image, 'content' => $content]); //very fast query since it used index
    $this->entityManager->remove($imageContent)
    
    // or to avoid multiple query : 
    
     $qb = $this->createQueryBuilder('image_content');
     $qb->delete()->where('image_content.image = :image')
     ->andWhere('image_content.content = :content')
     ->setParameter('image', $image)
     ->setParameter('content', $content);
     
     $qb->getQuery()->execute();
    

    And when adding

    $imageContent = $this->contentRepository->findOneBy(['image' => $image, 'content' => $content]); //very fast query since it used index but you could do a count query to improve it more
    if($imageContent){
       return; // so no duplicate occur;
    }
    $imageContent = new ImageContent($image, $content);
    $this->entityManager->persist($imageContent);
    $this->entityManager->flush();
    

    This way you never have to play with the collection

    Just be carefull to often clear entity manager while importing large dataset, look at my answer here : Getting a managed and dirty entity error on persist
    for more precision about this, you fill some solutions to handle large dataset also.

    2. Use pure SQL

    If you do not want to try the upper solution. Just add or remove association in pure SQL called from repository. Something like that :

        $sql = "
        DELETE FROM image_content ic
        WHERE ic.image_id = :param1 AND ic.content_id = :param2
        ";
    
        $statement = $this->_em->getConnection()->prepare($sql);
    
        return $statement->executeQuery([
            'param1' => $param1,
            'param2' => $param2,
        ]);
    

    Then you can do the same for insert also.

    Well i hope it helped. Don’t hesitate to ask question in comment.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search