I am looking for a much faster way to split a string into an array of values. Unfortunately, the string uses ; as its delimiter. split seems to be quite slow when applied to a set of more than 1k strings in an in-memory solution.
my (@array) = split(';', $string);
Is there any way to speed up Perl here (some sort of workaround using unpack, etc.)?
Update:
Approx. 1500 @members take 0.6 to 0.8 seconds (measured within the foreach). With a dummy, invalid $ref (without split), it is practically real time. Maybe the $ref/$redis part is what takes up the time? (I am using RedisDB.)
Some code:
my $ref = $redis->execute("MGET", @members);
foreach my $i (@members) {
    $counter++;
    my @result = split(';', $ref->[$counter]);
    # approx. 30 comparisons/operations like:
    # if ($result[0] == 1 && $result[7] == 1) {...}
}
3 Answers
If you do not need all the fields, use LIMIT, as in: split /PATTERN/,EXPR,LIMIT. For example, you can split into 2 fields instead of as many as there are (the extra parentheses in my (@array) are also unnecessary).

Related to the above: according to perldoc -f split, one of the ways to make it faster is to split into only as many fields as needed (and to avoid splitting into an array without a LIMIT).

You could try implementing it in XS. I am not sure how much faster this will be yet.
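To illustrate the LIMIT suggestion above, here is a minimal sketch. The 30-field test string is made up for the example (the real values would come from Redis), and the choice of fields 0 and 7 simply mirrors the comparison shown in the question.

```perl
use strict;
use warnings;

# Hypothetical record with 30 fields; real data would come from Redis.
my $string = join ';', 1 .. 30;

# Full split: produces all 30 fields.
my @all = split /;/, $string;

# With LIMIT 2: the first field, plus the unsplit remainder.
my ($first, $rest) = split /;/, $string, 2;

# Only fields 0 and 7 are needed, so stop after 9 pieces
# (8 split fields plus the unsplit remainder).
my ($f0, $f7) = (split /;/, $string, 9)[0, 7];

print scalar(@all), "\n";
print "$first | $rest\n";
print "$f0 $f7\n";
```

Whether this helps depends on how many of the fields the roughly 30 comparisons actually touch. If the highest index used is close to the last field, the LIMIT saves very little.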
Using another delimiter is unlikely to help. Below is a benchmark that compares semicolon, blank, and the NULL character as delimiters. The speed is the same within the error of the benchmarking, regardless of the field delimiter used. Your speedup may have to come from the code used outside of the split.

Benchmark:
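A minimal sketch of such a comparison, using the core Benchmark module, might look like the following. The test data (1500 strings of 30 numeric fields each) and the helper split_all are assumptions made for this example, not the data or code behind the result stated above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 1500 strings of 30 fields per delimiter.
my %sep = (semicolon => ';', blank => ' ', nul => "\0");
my %data;
for my $name (keys %sep) {
    $data{$name} = [ map { join $sep{$name}, 1 .. 30 } 1 .. 1500 ];
}

# Hypothetical helper: split every string and return the total
# number of fields, so the result of each split is actually used.
sub split_all {
    my ($re, $strings) = @_;
    my $n = 0;
    for my $s (@$strings) {
        my @f = split $re, $s;
        $n += @f;
    }
    return $n;
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    semicolon => sub { split_all(qr/;/,  $data{semicolon}) },
    blank     => sub { split_all(qr/ /,  $data{blank}) },
    nul       => sub { split_all(qr/\0/, $data{nul}) },
});
```

The blank case uses qr/ / rather than the string ' ', because split ' ' is a special case that also skips leading whitespace and splits on runs of whitespace, which would make the comparison unequal.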