
I am working on a script using Twitter’s API and I am trying to find matches to exact phrases.

The API, however, doesn’t allow queries for exact phrases, so I’ve been trying to find a workaround. The results I get contain words from the phrases, but not the exact phrases themselves.

var search_terms = "buy now, look at this meme, how's the weather?";

let termSplit = search_terms.toLowerCase();
let termArray = termSplit.split(', ');
// ["buy now", "look at this meme", "how's the weather?"]

client.stream('statuses/filter', { track: search_terms }, function (stream) {
  console.log("Searching for tweets...");
  stream.on('data', function (tweet) {
    if (termArray.some(v => tweet.text.toLowerCase().includes(v.toLowerCase()))) {
      // if (tweet.text.indexOf(termArray) > 0)
      console.log(tweet);
    }
  });
});

I expect to get any tweet whose text contains one of the exact phrases somewhere.

Instead, I get tweets that contain individual words from a phrase, but not the exact phrase itself.

Example of a result I am getting:
"I don’t know why now my question has a close request but I don’t buy it."

Example of a result I am expecting:
"If you like it then buy now."

What am I doing wrong?

2 Answers


  1. You could try using regular expressions. Here’s an example of a regular expression search for a phrase. The search returns the index of the character where the match starts (zero or greater) if there is a match, and -1 otherwise. I return the whole phrase if there is a match.

    You can use quite sophisticated grammars for matching particular phrases of interest; I’m just using simple words in this example.
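
The regular-expression approach described above could look roughly like the following. This is a minimal sketch, not the answerer's original snippet: the helper names `escapeRegExp`, `buildPhraseRegExp`, and `findPhrase` are hypothetical, and the boundary rule (a phrase must not be directly attached to a letter or digit) is an assumption about what "exact phrase" should mean here.

```javascript
// Escape characters that have special meaning in a regular expression,
// so a phrase such as "how's the weather?" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one case-insensitive pattern that matches any of the phrases.
// The lookarounds reject a match that is directly attached to a letter or
// digit on either side, so "buy now" does not match inside "buy nowhere".
function buildPhraseRegExp(phrases) {
  const alternatives = phrases.map(escapeRegExp).join('|');
  return new RegExp(
    `(?<![\\p{L}\\p{N}])(?:${alternatives})(?![\\p{L}\\p{N}])`,
    'iu'
  );
}

// Hypothetical helper: returns the matched text and its index, or null.
function findPhrase(text, pattern) {
  const match = pattern.exec(text);
  return match ? { index: match.index, text: match[0] } : null;
}

const pattern = buildPhraseRegExp([
  'buy now',
  'look at this meme',
  "how's the weather?",
]);

console.log(findPhrase('If you like it then buy now.', pattern));
// → { index: 20, text: 'buy now' }
console.log(findPhrase("I don't buy it now.", pattern));
// → null
```

Inside the stream handler from the question, the check would then be something like `if (findPhrase(tweet.text, pattern)) console.log(tweet);`.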


  2. First, toward the future:

    Twitter is planning to deprecate the statuses/filter v1.1 endpoint:

    These features will be retired in six months on October 29, 2022.

    Additionally, beginning today, new client applications will not be able to gain access to v1.1 statuses/sample and v1.1 statuses/filter. Developers with client apps already using these endpoints will maintain access until the functionality is retired. We are not retiring v1.1 statuses/filter in 6-months, only the ability to retrieve compliance messages. We will retire the full endpoint eventually.

    So, now is a great time to start using the equivalent v2 API, Filtered Stream, which supports exact phrase matching, helping you avoid this entire scenario in your application code.
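
As a rough illustration of what exact phrase matching looks like in v2: a Filtered Stream rule treats text wrapped in double quotes as an exact phrase. The sketch below only builds the JSON body for adding rules and makes no network request. The helper names `toExactPhraseOperator` and `buildAddRulesBody` are hypothetical, and the backslash escaping of embedded double quotes is an assumption to verify against the rule-building documentation, as is the handling of punctuation in phrases such as "how's the weather?".

```javascript
// Wrap a phrase in double quotes so a v2 rule treats it as an exact phrase.
// Assumption: a literal double quote inside a phrase is escaped with a backslash.
function toExactPhraseOperator(phrase) {
  return `"${phrase.replace(/"/g, '\\"')}"`;
}

// Hypothetical helper: builds a request body that adds one rule per phrase.
// The tag is free text that makes it easier to tell which rule matched.
function buildAddRulesBody(phrases) {
  return {
    add: phrases.map((phrase) => ({
      value: toExactPhraseOperator(phrase),
      tag: `exact phrase: ${phrase}`,
    })),
  };
}

const body = buildAddRulesBody(['buy now', 'look at this meme']);
console.log(JSON.stringify(body, null, 2));
```

A body of this shape would be sent with an authenticated POST request to the Filtered Stream rules endpoint before connecting to the stream. With one rule per phrase, each streamed tweet should indicate which rule it matched, so the application code would not need to search the text again.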


    With that out of the way, below is a minimal, reproducible example that demonstrates how to match exact phrases in streamed tweets and extract additional useful information (such as which phrase matched and at what index within the tweet text). Inline comments explain it line by line:

    <script type="module">
    
    // Transform to lowercase, split on commas, and trim whitespace
    // on the ends of each phrase, removing empty phrases
    function getPhrasesFromTrackText (trackText) {
      return trackText.toLowerCase().split(',')
        .map(str => str.trim())
        .filter(Boolean);
    }
    
    const trackText = `buy now, look at this meme, how's the weather?`;
    const phrases = getPhrasesFromTrackText(trackText);
    
    // The callback closure which will be invoked with each matching tweet
    // from the streaming response data
    const handleTweet = (tweet) => {
      // Transform the tweet text once
      const lowerCaseText = tweet.text.toLowerCase();
    
      // Create a variable to store the first matching phrase that is found
      let firstMatchingPhrase;
      for (const phrase of phrases) {
        // Find the index of the phrase in the tweet text
        const index = lowerCaseText.indexOf(phrase);
        // If the phrase isn't found, immediately continue
        // to the next loop iteration, skipping the rest of the code block
        if (index === -1) continue;
        // Else, set the match variable
        firstMatchingPhrase = {
          index,
          text: phrase,
        };
        // And stop iterating the other phrases by breaking out of the loop
        break;
      }
    
      if (firstMatchingPhrase) {
        // There was a match; do something with the tweet and/or phrase
        console.log({
          firstMatchingPhrase,
          tweet,
        });
      }
    };
    
    // The Stack Overflow code snippet runs in a browser and doesn't have access to
    // the Node.js Twitter "client" in your question,
    // but you'd use the function like this:
    
    // client.stream('statuses/filter', {track: trackText}, function (stream) {
    //   console.log('Searching for tweets...');
    //   stream.on('data', handleTweet);
    // });
    
    // Instead, the function can be demonstrated by simulating the stream: iterating
    // over sample tweets. The tweets with a ✅ are the ones which
    // will be matched in the function and be logged to the console:
    
    const sampleTweets = [
      /* ❌ */ {text: `Now available: Buy this product!`},
      /* ✅ */ {text: `This product is available. Buy now!`},
      /* ✅ */ {text: `look at this meme 🤣`},
      /* ❌ */ {text: `Look at how this meme was created`},
      /* ❌ */ {text: `how's it going everyone? good weather?`},
      /* ✅ */ {text: `Just wondering: How's the weather?`},
      // etc...
    ];
    
    for (const tweet of sampleTweets) {
      handleTweet(tweet);
    }
    
    </script>