I have a huge amount of data to import into Typesense using the Laravel command scout:import. I need to be able to stop the import after some time and later continue where it left off.

I found this: https://laracasts.com/discuss/channels/laravel/skip-some-records-with-scoutimport but the import always reports "All [App\Models\User] records have been imported.", even after trying other ids and queries.

Example User model:

public function searchableAs(): string
{
    return 'users_index';
}

public function toSearchableArray()
{
    return array_merge($this->toArray(),[
        'id' => (string) $this->id,
        'firstName' => (string) $this->firstName,
        'lastName' => (string) $this->lastName,
        'middle_name' => (string) $this->middle_name,
        'active' => (boolean) $this->active,
    ]);
}

protected function makeAllSearchableUsing($query)
{
    return $query->where('id', '>', '51433das1e');
}

2 Answers


  1. You can create a cron job that continues the import process and pass it a record count as a parameter, for example with a custom Artisan command such as:

    php artisan scout:bulk-import --limit=500
    

    This takes 500 records on each run. To remember the position, add a column such as last_imported_id to a table of your choice (for example, the users table).

    On each run, the cron job continues after last_imported_id.
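
    As a rough illustration of the cursor idea, here is a minimal sketch in plain PHP so it runs without Laravel. The function name importNextBatch and both callables are hypothetical, not part of Scout; the in-memory table only stands in for the real users table.

    ```php
    <?php

    declare(strict_types=1);

    /**
     * Imports one batch of rows that come after the stored cursor and
     * returns the new cursor (the id of the last row handled).
     *
     * $fetchAfter(int $lastId, int $limit): array of rows ordered by id.
     * $index(array $rows): void, pushes the rows to the search engine.
     * Both callables are assumptions made for this sketch.
     */
    function importNextBatch(callable $fetchAfter, callable $index, int $lastImportedId, int $limit = 500): int
    {
        $rows = $fetchAfter($lastImportedId, $limit);

        if ($rows === []) {
            return $lastImportedId; // nothing left to import
        }

        $index($rows);

        // Rows are ordered by id, so the last one carries the new cursor.
        return (int) end($rows)['id'];
    }

    // In-memory stand-ins for the database table and the search index.
    $table = array_map(fn (int $id) => ['id' => $id], range(1, 1200));
    $indexed = [];

    $fetchAfter = fn (int $lastId, int $limit) => array_slice(
        array_filter($table, fn ($row) => $row['id'] > $lastId),
        0,
        $limit
    );
    $index = function (array $rows) use (&$indexed): void {
        array_push($indexed, ...$rows);
    };

    $cursor = 0;                                             // stored last_imported_id
    $cursor = importNextBatch($fetchAfter, $index, $cursor); // first cron run
    $cursor = importNextBatch($fetchAfter, $index, $cursor); // second cron run
    ```

    In a Laravel command, the fetch callable would presumably wrap a query like User::where('id', '>', $lastId)->orderBy('id')->limit($limit)->get(), the index callable would call searchable() on the result, and the cursor would be read from and written back to last_imported_id. The array handling would need adapting to collections.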

  2. makeAllSearchableUsing() may not be applicable when batch importing using a queue.

    The cleanest way to start a batch import of models starting from a specific position is by chaining searchable() after your desired query:

    User::where('id', '>', 500)->orderBy('id')->searchable()
    

    This will only import the models that are returned by the query. Note that an orderBy clause is required to properly return only the "next batch of models".

    If desired, you could create a Job or Command and make the query easily callable and configurable, e.g.:

    use App\Models\User;
    use Illuminate\Console\Command;
    
    class ContinueUserImportStartingFromId extends Command
    {
        protected $signature = 'user:import-continue {id}';
    
        public function handle(): int
        {
            // Id of the last model that was already imported.
            $id = $this->argument('id');
    
            // Make every model after that id searchable, in id order.
            User::where('id', '>', $id)->orderBy('id')->searchable();
    
            return 0;
        }
    }
    

    That’s the "straightforward" part. However, I noticed that your id is a hash-like string (51433das1e). A greater-than/less-than comparison on such an id (where('id', '>', ...)) compares the strings themselves rather than the order in which the models were imported, so you can’t use it to scope the query. If you have another column with unique, incrementing values, you can use that instead and you’re good to go.

    Otherwise, you could switch to the created_at datetime attribute, if it’s available on your model. Take note of the last imported model’s creation date. Warning: several models may share the same creation date, and the import may have stopped right in the middle of them. So I’d suggest restarting the import from a creation date one minute earlier. You’ll import some models again (which is fine), but this way you won’t miss any.
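
    To make the one-minute overlap concrete, here is a small sketch in plain PHP. The helper name resumeFrom and the sample timestamp are made up for illustration; only the idea of stepping back a minute comes from the suggestion above.

    ```php
    <?php

    declare(strict_types=1);

    /**
     * Returns the timestamp to resume from: the last imported model's
     * creation time minus a safety overlap (60 seconds by default).
     */
    function resumeFrom(string $lastImportedCreatedAt, int $overlapSeconds = 60): string
    {
        $resume = (new DateTimeImmutable($lastImportedCreatedAt))
            ->modify("-{$overlapSeconds} seconds");

        return $resume->format('Y-m-d H:i:s');
    }

    // Hypothetical creation date of the last model that was imported.
    $from = resumeFrom('2024-03-01 12:00:30');

    // In Laravel, $from could then scope the query, for example:
    // User::where('created_at', '>=', $from)->orderBy('created_at')->searchable();
    ```

    Using >= together with the earlier start time means models that share a timestamp with the last imported one are picked up again rather than skipped.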

    Hope this answers your question.
