
I’m trying to do something fun with the Twitter API:
I want to search Twitter for the #np (now playing) tag and split each tweet into song name and artist name.

Let’s assume it finds this tweet:

“Listen to It Will Happen by L.E.L #np on #SoundCloud”

I want to extract the song name and the artist name and bind them to variables.

A tweet can also look like this:

“just awesome 😀 #np Zombie (metal cover by Leo & Stine Moracchioli) https://youtu.be/4e4bAsQ4r30 via @YouTube”

I’m having trouble understanding regex, so can somebody show me a proper way to handle these 2 examples?

2

Answers


  1. There’s hardly a pattern to find in random user input.
    However, if most of it is generated by some source, then there’s often still a pattern that can be matched.

    It probably needs to be handled separately per source,
    since that makes it easier to deal with the capture groups.

    var tweetString = "Listen to It Will Happen by L.E.L #np on #SoundCloud";
    // Groups: 1 = song, 2 = artist, 3 = source (the hashtag after " on ")
    var myRegexp = /^(.*)(?: by (.*))#\w+.* on #(\w+)$/;
    var song = "";
    var artist = "";
    var messagesource = "";
    var match = myRegexp.exec(tweetString);
    if (match !== null) {
      song = match[1].trim();
      artist = match[2].trim(); // group 2 includes the space before "#np"
      messagesource = match[3];
      console.log("song: " + song);
      console.log("artist: " + artist);
      console.log("messagesource: " + messagesource);
    }

    var tweetString = "just awesome :D #np Zombie (metal cover by Leo & Stine Moracchioli) https://youtu.be/4e4bAsQ4r30 via @YouTube";
    // Groups: 1 = song, 2 = artist, 3 = source (the account after " via @")
    var myRegexp = /^.*#\w+ (.*?)\(.* by (.*)\).* via @(\w+)$/;
    var song = "";
    var artist = "";
    var messagesource = "";
    var match = myRegexp.exec(tweetString);
    if (match !== null) {
      song = match[1].trim(); // group 1 includes the space before "("
      artist = match[2];
      messagesource = match[3];
      console.log("song: " + song);
      console.log("artist: " + artist);
      console.log("messagesource: " + messagesource);
    }
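
    If the tweets really do need to be handled per source, one way to sketch that is a list of patterns that are tried in order. This is only an illustration, not a complete solution: the names `patterns` and `parseNowPlaying` are made up for this example, and the two patterns cover only the two tweet layouts from the question.

    ```javascript
    // Hypothetical helper: one pattern per known tweet layout, tried in order.
    // Each entry says which capture group holds which piece of information.
    var patterns = [
      // "<song> by <artist> #np on #<source>"
      { regex: /^(.*) by (.*)#\w+.* on #(\w+)$/, song: 1, artist: 2, source: 3 },
      // "... #np <song> (... by <artist>) ... via @<source>"
      { regex: /^.*#\w+ (.*?)\(.* by (.*)\).* via @(\w+)$/, song: 1, artist: 2, source: 3 }
    ];

    function parseNowPlaying(tweet) {
      for (var i = 0; i < patterns.length; i++) {
        var p = patterns[i];
        var match = p.regex.exec(tweet);
        if (match !== null) {
          return {
            song: match[p.song].trim(),     // trim the spaces the groups pick up
            artist: match[p.artist].trim(),
            source: match[p.source]
          };
        }
      }
      return null; // no known layout matched
    }

    console.log(parseNowPlaying("Listen to It Will Happen by L.E.L #np on #SoundCloud"));
    ```

    A tweet that fits none of the layouts returns null, so the caller can skip it instead of guessing.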
  2. An explanation of the second regex in @LukStorms’s answer:

    • ^ start of string
    • .* the . matches any character except new lines (\n). The * means the previous part should be there 0 or more times
    • # literal # character
    • \w+ the \w matches any letter, uppercase or lower case (or digits or underscores, but that usually doesn’t matter); + means the previous part (\w) should be there 1 or more times
    • (.*?) the brackets wrap around a capturing group (which you can actually access). In this case, the group would match 0 or more non-newline characters, but as few as possible because of the ?, so it stops at the first ( and captures the song name
    • \( a literal ( character. The \ “escapes” the next character, turning it special, or making it unspecial ;P
    • .* 0 or more non-newline characters
    • " by " literal text (including the spaces around it)
    • (.*) a capturing group containing 0 or more non-newline characters
    • \) literal )
    • .* 0 or more non-newline characters
    • " via @" literal text (including the leading space)
    • (\w+) a capturing group containing one or more letters, digits, or underscores
    • $ end of string
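
    To see why the ? in (.*?) matters, here is a small sketch that compares the lazy group with a greedy one. The sample string is invented for the illustration (a second pair of brackets was added to the tweet text), so treat it as an assumption rather than real data.

    ```javascript
    // Hypothetical sample text with two "(" characters in it.
    var text = "#np Zombie (metal cover by Leo & Stine Moracchioli) (live)";

    // Lazy: (.*?) grabs as little as possible, so it stops at the first "(".
    var lazy = /#\w+ (.*?)\(/.exec(text);

    // Greedy: (.*) grabs as much as possible, so it runs to the last "(".
    var greedy = /#\w+ (.*)\(/.exec(text);

    console.log(lazy[1]);   // "Zombie "
    console.log(greedy[1]); // "Zombie (metal cover by Leo & Stine Moracchioli) "
    ```

    Both captures end with a space because the space before the ( is inside the group, which is why calling .trim() on the result is handy.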

    Hope this helps. If you’re trying to figure this kind of stuff out, or the flow of logic in a regex, you could use regex101, as @LukStorms said. Or something I use is [regexper](https://regexper.com/#%5E.%23%5Cw%2B%20(.%3F)%5C(.%20by%20(.)%5C).*%20via%20%40(%5Cw%2B)%24).
