
How to get a numeric value from a string?

From here I can see how to pull numbers out of a string, but I also need the text that follows each number, up to the next number. I have a lot of text similar to the sample below, and I need to pull out every time stamp. The text comes from the YouTube API.

Information Technology- Lecture #1
June 4, 2015
Professor Vasarhelyi
Please visit our website at http://raw.rutgers.edu
Time Stamps:

00:00:28 What is ASEC?
00:02:59 Continuous Monitoring & Continuous Accounting
00:03:43 Assurance
00:07:25 Predictive v. Preventive (Traditional Audit)
00:10:36 Audit Data Standard (ADS)
00:16:37 XBRL and XML
00:20:13 How is technology changing our brains?
00:21:36 Singularity: Artificial Intelligence vs. Human Intelligence
00:37:57 Big Data
00:40:39 NSA Snooping
00:47:59 Internet Trends
00:59:58 E-Education: What will change?
01:08:42 What do you need to know in the age of Google?
01:13:45 Delivery, Assessment, and Granting
01:17:00 Automatic Student Learning Management System
01:20:49 A Degree’s Role in Society
01:23:02 Summary
01:28:52 Primary Priorities for Maintaining Relevance 
01:30:01 GAAP
Summary:
In this lecture, Professor Vasarhelyi introduces what the course will talk about in future sessions while reviewing key and basic concepts with the class.  He also discusses how the Internet changes the way that we think and whether or not robots will soon replace humans in the work force.
Please subscribe to our channel to get the latest updates on the RU Digital Library.

My current method is hitting limitations, so I was wondering whether that other method could be used to pull out only this information:

00:00:28 What is ASEC?
00:02:59 Continuous Monitoring & Continuous Accounting
00:03:43 Assurance
00:07:25 Predictive v. Preventive (Traditional Audit)
00:10:36 Audit Data Standard (ADS)
00:16:37 XBRL and XML
00:20:13 How is technology changing our brains?
00:21:36 Singularity: Artificial Intelligence vs. Human Intelligence
00:37:57 Big Data
00:40:39 NSA Snooping
00:47:59 Internet Trends
00:59:58 E-Education: What will change?
01:08:42 What do you need to know in the age of Google?
01:13:45 Delivery, Assessment, and Granting
01:17:00 Automatic Student Learning Management System
01:20:49 A Degree’s Role in Society
01:23:02 Summary
01:28:52 Primary Priorities for Maintaining Relevance 
01:30:01 GAAP

I would also need to wrap each time stamp line in a <span> tag, with the opening tag at the start of the line and the closing tag at the end. Expected output:

<span>00:00:28 What is ASEC?</span>
<span>00:02:59 Continuous Monitoring & Continuous Accounting</span>
<span>00:03:43 Assurance</span>
<span>00:07:25 Predictive v. Preventive (Traditional Audit)</span>
<span>00:10:36 Audit Data Standard (ADS)</span>
<span>00:16:37 XBRL and XML</span>
<span>00:20:13 How is technology changing our brains?</span>
<span>00:21:36 Singularity: Artificial Intelligence vs. Human Intelligence</span>
<span>00:37:57 Big Data</span>
<span>00:40:39 NSA Snooping</span>
<span>00:47:59 Internet Trends</span>
<span>00:59:58 E-Education: What will change?</span>
<span>01:08:42 What do you need to know in the age of Google?</span>
<span>01:13:45 Delivery, Assessment, and Granting</span>
<span>01:17:00 Automatic Student Learning Management System</span>
<span>01:20:49 A Degree’s Role in Society</span>
<span>01:23:02 Summary</span>
<span>01:28:52 Primary Priorities for Maintaining Relevance</span>
<span>01:30:01 GAAP</span>

2

Answers


  1. How’s this for some pseudo-ish code:

    // `text` is assumed to hold your full description as a single string
    var lines = text.split("\n");
    var events = [];
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i];
        var timestamp = line.split(" ")[0]; // everything before the first space
        var description = line.substring(timestamp.length + 1); // everything after the first space
        events.push({
            "timestamp": timestamp,
            "description": description
        });
    }
    

    This should fill the array events with objects that have the timestamp as a string (you said you know how to convert strings to numbers so I’ll let you take it from there) and the description as another string. Once you have that array, it should be easy to generate a bulleted list or just about any other HTML you want to use to display it; just make another for loop to generate the HTML markup. Does this solve your problem sufficiently?

  2. Here’s another method using a regex and String.match. Define one function to extract the timestamp lines from the text, and one to output them. The regex passed to the first function is /\n\d.*(?=\n)/g, which says: match a newline, then a digit, then the rest of that line, up to (but not including) the next newline, globally. See the snippet below for a demo.

    Note: If you could also get the date on the second line (June 4, 2015), you could even add a date property to your objects and construct a JavaScript Date (which is convertible to Unix timestamps, amongst others) by simply doing result[i].date = new Date('June 4, 2015' + ' ' + result[i].time) in the findTimestamps function.

    var text = document.getElementsByTagName('p')[0].textContent;
    
    // Turn every regex match into a { time, msg } object.
    function findTimestamps(regex, target) {
      var result = target.match(regex) || []; // match() returns null when nothing matches
      for (var i = 0; i < result.length; i++) {
        result[i] = { 
          // each match begins with the newline, so the time starts at index 1
          time: result[i].slice(1, result[i].indexOf(' ')),
          msg: result[i].slice(result[i].indexOf(' ') + 1)
        };
      }
      return result;
    }
    // Render one <p> per entry into the target element.
    function outputTimestamps(target, array) {
      var output = '';
      for (var i = 0; i < array.length; i++) {
        output += '<p><span>' + array[i].time + '</span>' + array[i].msg + '</p>';
      }
      target.innerHTML = output;
    }
    
    var r = findTimestamps(/\n\d.*(?=\n)/g, text);
    outputTimestamps(document.getElementsByTagName('div')[0], r);
    body>p { display: none; }
    div:last-child { white-space: pre; }
    span { margin-right: 20px; }
    <p>Information Technology- Lecture #1
    June 4, 2015
    Professor Vasarhelyi
    Please visit our website at http://raw.rutgers.edu
    Time Stamps:
    00:00:28 What is ASEC?
    00:02:59 Continuous Monitoring & Continuous Accounting
    00:03:43 Assurance
    00:07:25 Predictive v. Preventive (Traditional Audit)
    00:10:36 Audit Data Standard (ADS)
    00:16:37 XBRL and XML
    00:20:13 How is technology changing our brains?
    00:21:36 Singularity: Artificial Intelligence vs. Human Intelligence
    00:37:57 Big Data
    00:40:39 NSA Snooping
    00:47:59 Internet Trends
    00:59:58 E-Education: What will change?
    01:08:42 What do you need to know in the age of Google?
    01:13:45 Delivery, Assessment, and Granting
    01:17:00 Automatic Student Learning Management System
    01:20:49 A Degree’s Role in Society
    01:23:02 Summary
    01:28:52 Primary Priorities for Maintaining Relevance 
    01:30:01 GAAP
    Summary:
    In this lecture, Professor Vasarhelyi introduces what the course will talk about in future sessions while reviewing key and basic concepts with the class.  He also discusses how the Internet changes the way that we think and whether or not robots will soon replace humans in the work force.
    Please subsc</p>
    <div></div>
    <div></div>
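
For the exact output asked for in the question (each whole line wrapped in one <span>), the two answers can be combined into something like the sketch below. It is only a sketch under a few assumptions: the function names `extractTimestampLines` and `toSpans` are made up for illustration, every time stamp is taken to have the form HH:MM:SS at the very start of its line, and `sample` is a shortened stand-in for the real description text. The `seconds` property is an extra that covers the "numeric value" part of the title.

```javascript
// Hypothetical helper: keep only the lines that begin with HH:MM:SS.
function extractTimestampLines(description) {
  // Groups 1-3: hours, minutes, seconds. Group 4: the title, without trailing spaces.
  var pattern = /^(\d{2}):(\d{2}):(\d{2})\s+(.*\S)\s*$/;
  var entries = [];
  var lines = description.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var m = pattern.exec(lines[i]);
    if (m) {
      entries.push({
        time: m[1] + ':' + m[2] + ':' + m[3],
        // Offset into the video as a plain number of seconds.
        seconds: Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]),
        title: m[4]
      });
    }
  }
  return entries;
}

// Hypothetical helper: wrap each entry in <span>...</span>, one per line.
function toSpans(entries) {
  return entries.map(function (e) {
    return '<span>' + e.time + ' ' + e.title + '</span>';
  }).join('\n');
}

// Shortened stand-in for the text returned by the API.
var sample = [
  'Information Technology- Lecture #1',
  'June 4, 2015',
  'Time Stamps:',
  '',
  '00:00:28 What is ASEC?',
  '01:28:52 Primary Priorities for Maintaining Relevance ',
  '01:30:01 GAAP',
  'Summary:'
].join('\n');

var entries = extractTimestampLines(sample);
console.log(toSpans(entries));
```

Unlike the /\n\d.*(?=\n)/g approach, testing line by line does not depend on a newline before the first time stamp or after the last one. Titles are inserted as-is here to match the expected output in the question; if the result is assigned to innerHTML, characters such as & and < in the titles should probably be escaped first.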