How to write a regex that splits a string into an array when https and a trailing space are found in JavaScript

WeAreDoomed
March 29, 2023
258 views
2 votes
3 Answers

I’m working with strings. The string often contains something like this:

"The site https://example.com contains all the information you need"

So far I have this regex, which splits the string wherever http:// or https:// is found:

const rgx_link = /(?=https?:\/\/)/gi;

So in this case it would split the string into:

arr = ["The site ", "https://example.com contains all the information you need"];

I would like to modify the regex so that it also splits the string when a space is found after the URL.

The desired result would look like this:

arr = ["The site ", "https://example.com", " contains all the information you need"];

The new regex "should look" something like this ((?=https?:\/\/)(\s)), but it doesn’t work.

Any help would be greatly appreciated. Thank you.

const text = "The site https://example.com contains all the information you need";
const rgx_link = /(?=https?:\/\/)/gi;
const result = text.split(rgx_link);
console.log(result)

Edit 1: Wiktor’s suggestion is correct. I didn’t notice that the leading part of the regex had changed, so it didn’t work for me at first.

const text = "The site https://example.com contains all the information you need";
const rgx_link = /(https?:\/\/\S*)/gi;
const result = text.split(rgx_link);
console.log(result)

Tags: javascript regex

Answers

- Gieto
- March 29, 2023 at 4:44 pm
- 0 votes
```
const text =
    "The site https://example.com contains all the information you need";
const rgx_link = /(?=https?:\/\/)/gi;
const result = text.split(rgx_link);

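// The URL is everything up to the first space in the second chunk
const url = result[1].split(" ")[0];
// Keep the leading space so the output matches the desired array
const rest = result[1].slice(url.length);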

const res = [result[0], url, rest];

console.log(res);
```
Is this what you need?

- WiktorStribiew
- March 29, 2023 at 5:02 pm
- 0 votes
You can use
```
text.split(/(https?:\/\/\S*)/i)
```
Note that the String#split method includes all captured substrings in its output when the regex contains capturing groups, so this regex, being entirely wrapped in a capturing group, basically tokenizes the string into URL and non-URL tokens.

Note the absence of the g flag, since String#split behavior by default is to split on all occurrences of the regex pattern.

Pattern details
- http – the literal string http
- s? – an optional s char
- :\/\/ – a :// substring (the slashes are escaped inside a regex literal)
- \S* – zero or more non-whitespace chars.
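As a rough sketch of how this behaves in practice, the snippet below wraps the split in a small helper. The name `splitOnUrls` and the extra sample URLs are assumptions made for illustration only. The `filter` step is an optional addition that drops the empty strings `split` produces when a URL sits at the very start or end of the text.

```javascript
// Hypothetical helper name; it only wraps the split from the answer above.
function splitOnUrls(text) {
  // The capturing group makes split() keep each matched URL in the output.
  // filter() removes empty strings that appear when a URL starts or ends the text.
  return text.split(/(https?:\/\/\S*)/i).filter((part) => part !== "");
}

const single = splitOnUrls(
  "The site https://example.com contains all the information you need"
);
console.log(single);
// ["The site ", "https://example.com", " contains all the information you need"]

// No g flag is needed: split() already handles every occurrence.
const multiple = splitOnUrls(
  "https://a.example and http://b.example/docs are both listed"
);
console.log(multiple);
// ["https://a.example", " and ", "http://b.example/docs", " are both listed"]
```

Because `\S*` is greedy, punctuation directly attached to a URL (for example a trailing comma) ends up inside the URL token, so a stricter pattern may be needed for free-form text.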

- GoldenLion
- March 30, 2023 at 11:11 pm
- 0 votes
In Python, I can search for a URL, then use the end position of the match’s span to split the string after the URL:
```
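import re

data = """The site https://example.com contains all the information you need"""

def find_url(text):
    # [^\s]* matches the rest of the URL up to the next whitespace character
    return re.search(r'https?://[^\s]*', text)

# span()[1] is the index just past the end of the matched URL
position = find_url(data).span()[1]

print(position)

# First the text through the end of the URL, then everything after it
print(data[0:position], data[position:])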
```
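Since the question is about JavaScript, the same position-based idea can be sketched there with `RegExp.prototype.exec`, which reports where the match starts via `match.index`. The function name `splitAfterFirstUrl` is made up for this example, and unlike the Python snippet it returns three parts (before, URL, after) and handles only the first URL in the text.

```javascript
// Hypothetical helper: splits around the first URL using match positions.
function splitAfterFirstUrl(text) {
  const match = /https?:\/\/\S*/i.exec(text);
  if (match === null) {
    // No URL found: return the text unchanged as a single element.
    return [text];
  }
  // End index of the URL, comparable to span()[1] in the Python version.
  const end = match.index + match[0].length;
  return [text.slice(0, match.index), match[0], text.slice(end)];
}

const parts = splitAfterFirstUrl(
  "The site https://example.com contains all the information you need"
);
console.log(parts);
// ["The site ", "https://example.com", " contains all the information you need"]
```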