I have a huge amount of data to import into my Typesense using the Laravel command scout:import, and I need to be able to stop after some time and continue where it left off.
I found this: https://laracasts.com/discuss/channels/laravel/skip-some-records-with-scoutimport but it always gives me "All [App\Models\User] records have been imported." even after trying other id values and queries.
Example User model:

    public function searchableAs(): string
    {
        return 'users_index';
    }

    public function toSearchableArray()
    {
        return array_merge($this->toArray(), [
            'id' => (string) $this->id,
            'firstName' => (string) $this->firstName,
            'lastName' => (string) $this->lastName,
            'middle_name' => (string) $this->middle_name,
            'active' => (boolean) $this->active,
        ]);
    }

    protected function makeAllSearchableUsing($query)
    {
        return $query->where('id', '>', '51433das1e');
    }
2 Answers
What you can do is create a cron job that continues the import process and takes a count as a parameter, so that it imports, for example, 500 records on every run. You can then add a column named last_imported_id to any table you like, for example the users table, and have each cron run continue after the last_imported_id.
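A minimal sketch of such a command, assuming the model has an auto-incrementing numeric id. The command name users:import-batch and the cache key users_last_imported_id are made up for illustration, and the cache only stands in for the last_imported_id column to keep the example short:

```php
<?php

namespace App\Console\Commands;

use App\Models\User;
use Illuminate\Console\Command;
use Illuminate\Support\Facades\Cache;

class ImportUsersBatch extends Command
{
    // Hypothetical command name; {count=500} is the batch size per run.
    protected $signature = 'users:import-batch {count=500}';

    protected $description = 'Import the next batch of users into the search index';

    public function handle(): int
    {
        $count = (int) $this->argument('count');

        // Where the previous run stopped (0 on the very first run).
        $lastId = Cache::get('users_last_imported_id', 0);

        $users = User::where('id', '>', $lastId)
            ->orderBy('id')
            ->limit($count)
            ->get();

        if ($users->isEmpty()) {
            $this->info('Nothing left to import.');

            return self::SUCCESS;
        }

        // Scout adds searchable() to collections of searchable models.
        $users->searchable();

        // Remember the position for the next run.
        Cache::forever('users_last_imported_id', $users->last()->id);

        return self::SUCCESS;
    }
}
```

It could then be scheduled, or run by hand with php artisan users:import-batch 1000, until it reports that nothing is left. A cache store that can be flushed is a weak place to keep the position, so the column suggested above is the safer choice in practice.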
makeAllSearchableUsing() may not be applicable when batch importing using a queue. The cleanest way to start a batch import of models from a specific position is to chain searchable() after your desired query. This will only import the models that are returned by the query. Note that an orderBy clause is required to properly return only the "next batch of models".
If desired, you could create a Job or Command to make the query easily callable and configurable.
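As a rough illustration of the chained call, assuming an incrementing id column; $lastImportedId is a placeholder for wherever the previous import stopped:

```php
<?php

use App\Models\User;

// Placeholder: id of the last model that made it into the index.
$lastImportedId = 51433;

// Indexes only the models matched by the query.
User::where('id', '>', $lastImportedId)
    ->orderBy('id')
    ->searchable();
```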
That’s the "straightforward" part. Now, I noticed that your id is a hash-like string (51433das1e), meaning that you won’t be able to scope your query with a greater/less than comparison on the id attribute (where('id', '>', ...)). If you have another column featuring (unique) incremental values, you could use that and you’re good to go.
Otherwise you could also switch to using the created_at datetime attribute if that’s available on your model. Take note of the last imported model’s creation date. Warning: multiple models may share the same creation date, and the import might be stopped right in the middle of them. So I’d suggest restarting the import with a creation date one minute earlier. You’ll import some models again (which is fine), but this way you also ensure you’re not missing any.
Hope this answers your question.
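The one-minute safety margin can be sketched in plain PHP. resumeFrom() is a hypothetical helper, not part of Scout, and the timestamp is an example value:

```php
<?php

// Hypothetical helper: step back from the last imported creation date
// so that models sharing that timestamp are imported again, not skipped.
function resumeFrom(DateTimeImmutable $lastImportedAt, int $overlapSeconds = 60): DateTimeImmutable
{
    return $lastImportedAt->modify("-{$overlapSeconds} seconds");
}

$resumeAt = resumeFrom(new DateTimeImmutable('2024-01-15 10:30:00'));
echo $resumeAt->format('Y-m-d H:i:s'), PHP_EOL; // 2024-01-15 10:29:00

// In a Laravel app with Scout, the value could then scope the import:
// User::where('created_at', '>=', $resumeAt)->searchable();
```

Re-importing the overlapping models is harmless because indexing an already indexed model simply updates its record.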