I am writing a regular expression to capture PostgreSQL connection strings. The format of the connection string is as follows:
(postgresql|postgres)://[userspec@][hostspec][/dbname][?paramspec]
where userspec is: user[:password]
and hostspec is: [host][:port][,...]
and paramspec is: name=value[&...]
Here, the userspec, port, dbname and paramspec are optional.
The examples of the connection strings are as follows:
postgresql://localhost
postgresql://localhost:5433
postgresql://localhost/mydb
postgresql://user@localhost
postgresql://user:secret@localhost
postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp
postgresql://host1:123,host2:456/somedb?application_name=myapp
postgres://myuser:[email protected]:5432/mydatabase
postgresql://192.168.0.100/mydb
I tried to formulate the below regular expression to capture the connection string and in addition capture the hostspec in a capturing group.
(postgresql|postgres)://((?:[^:@]*)(?::[^@]*)?@{0,1})?(?<server>[^/?]+)\b
However, the regex does not match correctly when the userspec is not present. The regex can be found here.
Can you please point out how to avoid the greedy evaluation of the userspec and find the hostspec in each line?
2 Answers
The following corrections make your regex work as you expected:
(?:[^:@]*) can be simplified to [^:@]*. You don't need to wrap it in parentheses and then make the group non-capturing with ?: if you aren't doing anything with it as a group. I also added \s inside the character class, giving [^:@\s]*, so it doesn't run on and consume newlines.
(?::[^@]*)? changed to (?::[^@\s]*)? to include \s, for the same reason.
@{0,1} changed to @, as it is required (a userspec must be followed by @). Also, @{0,1} can be written more simply as @?.
[^/?]+ changed to [^/?\s]+, again to include \s.
And with the above changes it seems to be working like you expected.
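As a rough check, the corrections above can be assembled into a single pattern and run against the sample strings from the question. This is only a sketch of how the pieces might fit together, not necessarily the exact regex behind the demo link. It assumes Python's re module, which spells named groups as (?P<server>...) rather than (?<server>...); the names CONN_RE, samples and servers are made up for illustration.

```python
import re

# Assumed assembly of the corrections listed above (not taken from the demo).
# Python's re module needs (?P<name>...) for named groups.
CONN_RE = re.compile(
    r"(postgresql|postgres)://"      # scheme
    r"(?:[^:@\s]*(?::[^@\s]*)?@)?"   # optional userspec, matched only if '@' follows
    r"(?P<server>[^/?\s]+)"          # hostspec: runs up to '/', '?' or whitespace
)

# Sample connection strings from the question, one per line.
samples = """\
postgresql://localhost
postgresql://localhost:5433
postgresql://localhost/mydb
postgresql://user@localhost
postgresql://user:secret@localhost
postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp
postgresql://host1:123,host2:456/somedb?application_name=myapp
postgresql://192.168.0.100/mydb
"""

# Collect the hostspec captured on each line.
servers = [m.group("server") for m in CONN_RE.finditer(samples)]
for server in servers:
    print(server)
```

Because the @ is now mandatory inside the optional userspec group, the group is skipped entirely when no @ follows, so the hostspec is no longer swallowed by the userspec part. Excluding \s keeps each match on its own line.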
Updated Regex Demo
Let me know if this works for you.
Another solution (regex101):