
I have been programming for years, yet I cannot find a Node library that handles scenarios like this. I want to be able to parse potentially malformed dates. Is my only option really to parse with a regex and then reconstruct the date, as in the following examples?

// invalid date
new Date('Apr 21, 2023,06:51 pm EDT');

// invalid date
new Date('Apr 21, 2023 06:51pm EDT');

// works
new Date('Apr 21, 2023 06:51 pm EDT'); 

// invalid date
new Date('7th Feb 2023'); // error

I have a super complicated regex pulling out the day, month, year, hour, minute, am/pm, and timezone, but is there a simpler approach to this?

2

Answers


  1. Have you tried https://www.npmjs.com/package/any-date-parser ?

    Parse a wide range of date formats including human-input dates.

    any-date-parser has an addFormat() function to add a custom parser.

    Supported formats
    24 hour time
    12 hour time
    timezone offsets
    timezone abbreviations
    year month day
    year monthname day
    month day year
    monthname day year
    day month year
    day monthname year
    +/-/ago periods
    now/today/yesterday/tomorrow
    Twitter

  2. The best would be to have the date formatted in a standard way (ECMAScript’s Date Time String Format) at the source. If not possible, then indeed you’ll need to parse the input (or have a library do that for you).
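As a small illustration of why the standard format is worth asking for at the source, the sketch below parses two strings in ECMAScript's Date Time String Format and shows that they denote the same instant. The variable names are only for illustration.

```javascript
// Strings in the ECMAScript Date Time String Format need no custom parsing.
const withOffset = new Date("2023-04-21T18:51:00-04:00"); // explicit UTC offset
const utc = new Date("2023-04-21T22:51:00Z");             // same instant, in UTC

console.log(withOffset.toISOString());               // 2023-04-21T22:51:00.000Z
console.log(withOffset.getTime() === utc.getTime()); // true

// A source system written in JavaScript can emit this format with toISOString().
const stamp = new Date(Date.UTC(2023, 3, 21, 22, 51)).toISOString(); // month 3 = April
console.log(stamp);                                  // 2023-04-21T22:51:00.000Z
```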

    I have a super complicated regex …

    Maybe you can build it up in steps, so it stays manageable? Or — if you decided to have the regex perform numerical validations — you could drop those validations and leave that for the Date constructor to deal with.
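To illustrate the second suggestion, here is a minimal sketch in which the regex only checks the shape of the input and Date.parse rejects out-of-range components. The pattern and the parseLoose helper are hypothetical and cover just one simple input shape. Note that engines may differ on calendar-impossible days such as 30 February, so this should not be treated as strict validation.

```javascript
// Loose pattern: captures digits only and performs no range checks.
const loose = /^(\d{4})-(\d{1,2})-(\d{1,2})$/;

// Hypothetical helper: returns a Date, or null if the input is rejected.
function parseLoose(s) {
    const match = s.match(loose);
    if (!match) return null; // the shape is wrong
    const [, y, m, d] = match;
    const iso = `${y}-${m.padStart(2, "0")}-${d.padStart(2, "0")}T00:00:00Z`;
    const time = Date.parse(iso); // NaN when the string is not a valid date time
    return Number.isNaN(time) ? null : new Date(time);
}

console.log(parseLoose("2023-4-21").toISOString()); // 2023-04-21T00:00:00.000Z
console.log(parseLoose("2023-13-21"));              // null (month 13 is out of range)
console.log(parseLoose("21/4/2023"));               // null (does not match the pattern)
```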

    Here is a possible implementation with the following characteristics:

    • Does not focus on validation; it generously matches numbers that are out of range.
    • Allows any punctuation between the components (anything that matches \W+)
    • Uses a lookup object to map known timezone codes to timezone offsets
    • Produces a string that complies with ECMAScript’s Date Time String Format on the condition that the input components were valid (in range).
    • Leaves it for the caller to pass that date time string to the Date constructor or to Date.parse.
    const zones = {aoe:'-12:00',y:'-12:00',nut:'-11:00',sst:'-11:00',x:'-11:00',ckt:'-10:00',hst:'-10:00',taht:'-10:00',w:'-10:00',mart:'-09:30',akst:'-09:00',gamt:'-09:00',hdt:'-09:00',v:'-09:00',akdt:'-08:00',pst:'-08:00',pst:'-08:00',u:'-08:00',mst:'-07:00',pdt:'-07:00',t:'-07:00',cst:'-06:00',east:'-06:00',galt:'-06:00',mdt:'-06:00',s:'-06:00',act:'-05:00',cdt:'-05:00',cist:'-05:00',cot:'-05:00',cst:'-05:00',easst:'-05:00',ect:'-05:00',est:'-05:00',pet:'-05:00',r:'-05:00',amt:'-04:00',ast:'-04:00',bot:'-04:00',cdt:'-04:00',cidst:'-04:00',clt:'-04:00',edt:'-04:00',fkt:'-04:00',gyt:'-04:00',pyt:'-04:00',q:'-04:00',vet:'-04:00',nst:'-03:30',adt:'-03:00',amst:'-03:00',art:'-03:00',brt:'-03:00',clst:'-03:00',fkst:'-03:00',gft:'-03:00',p:'-03:00',pmst:'-03:00',pyst:'-03:00',rott:'-03:00',srt:'-03:00',uyt:'-03:00',warst:'-03:00',wgt:'-03:00',ndt:'-02:30',brst:'-02:00',fnt:'-02:00',gst:'-02:00',o:'-02:00',pmdt:'-02:00',uyst:'-02:00',wgst:'-02:00',azot:'-01:00',cvt:'-01:00',egt:'-01:00',n:'-01:00',
        utc:'+00:00',azost:'+00:00',egst:'+00:00',gmt:'+00:00',wet:'+00:00',wt:'+00:00',z:'+00:00',a:'+01:00',bst:'+01:00',cet:'+01:00',ist:'+01:00',wat:'+01:00',west:'+01:00',wst:'+01:00',b:'+02:00',cat:'+02:00',cest:'+02:00',eet:'+02:00',ist:'+02:00',sast:'+02:00',wast:'+02:00',ast:'+03:00',c:'+03:00',eat:'+03:00',eest:'+03:00',fet:'+03:00',idt:'+03:00',msk:'+03:00',syot:'+03:00',trt:'+03:00',irst:'+03:30',adt:'+04:00',amt:'+04:00',azt:'+04:00',d:'+04:00',get:'+04:00',gst:'+04:00',kuyt:'+04:00',msd:'+04:00',mut:'+04:00',ret:'+04:00',samt:'+04:00',sct:'+04:00',aft:'+04:30',irdt:'+04:30',amst:'+05:00',aqtt:'+05:00',azst:'+05:00',e:'+05:00',mawt:'+05:00',mvt:'+05:00',orat:'+05:00',pkt:'+05:00',tft:'+05:00',tjt:'+05:00',tmt:'+05:00',uzt:'+05:00',yekt:'+05:00',ist:'+05:30',npt:'+05:45',
        almt:'+06:00',bst:'+06:00',btt:'+06:00',f:'+06:00',iot:'+06:00',kgt:'+06:00',omst:'+06:00',qyzt:'+06:00',vost:'+06:00',yekst:'+06:00',cct:'+06:30',mmt:'+06:30',cxt:'+07:00',davt:'+07:00',g:'+07:00',hovt:'+07:00',ict:'+07:00',krat:'+07:00',novst:'+07:00',novt:'+07:00',omsst:'+07:00',wib:'+07:00',awst:'+08:00',bnt:'+08:00',cast:'+08:00',chot:'+08:00',cst:'+08:00',h:'+08:00',hkt:'+08:00',hovst:'+08:00',irkt:'+08:00',krast:'+08:00',myt:'+08:00',pht:'+08:00',sgt:'+08:00',ulat:'+08:00',wita:'+08:00',pyt:'+08:30',acwst:'+08:45',awdt:'+09:00',chost:'+09:00',i:'+09:00',irkst:'+09:00',jst:'+09:00',kst:'+09:00',pwt:'+09:00',tlt:'+09:00',ulast:'+09:00',wit:'+09:00',yakt:'+09:00',acst:'+09:30',aest:'+10:00',chut:'+10:00',chst:'+10:00',ddut:'+10:00',k:'+10:00',pgt:'+10:00',vlat:'+10:00',yakst:'+10:00',yapt:'+10:00',acdt:'+10:30',lhst:'+10:30',aedt:'+11:00',bst:'+11:00',kost:'+11:00',l:'+11:00',lhdt:'+11:00',magt:'+11:00',nct:'+11:00',nft:'+11:00',pont:'+11:00',sakt:'+11:00',sbt:'+11:00',sret:'+11:00',vlast:'+11:00',vut:'+11:00',anast:'+12:00',anat:'+12:00',fjt:'+12:00',gilt:'+12:00',m:'+12:00',magst:'+12:00',mht:'+12:00',nfdt:'+12:00',nrt:'+12:00',nzst:'+12:00',petst:'+12:00',pett:'+12:00',tvt:'+12:00',wakt:'+12:00',wft:'+12:00',chast:'+12:45',fjst:'+13:00',nzdt:'+13:00',phot:'+13:00',tkt:'+13:00',tot:'+13:00',wst:'+13:00',chadt:'+13:45',lint:'+14:00',tost:'+14:00'};
    // The patterns are built from strings, so every regex backslash is doubled.
    const monthRe = "(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)";
    const dayRe = "(\\d\\d?)[stnrdh]{0,2}"; // day number with optional ordinal suffix
    const dateRe = `(?:${monthRe}\\W+${dayRe}|${dayRe}\\W+${monthRe})\\W+(\\d{4})`;
    const timeRe = "(\\d\\d?):(\\d\\d)(?:\\W*([ap])\\.?m)?(?:\\W*([a-z]{1,5}))?";
    const regex = RegExp(`^\\W*${dateRe}(?:\\W+${timeRe})?\\W*$`, "i");
    
    function toDateTimeStringFormat(s) {
        const match = s.toLowerCase().match(regex);
        if (!match) return; // undefined when the input has an unsupported shape
        let [, m1, day, d2, m2, year, hour, minute, pm, zone] = match;
        const month = 1 + (monthRe.indexOf(m1 ?? m2) >> 2); // month name to number
        day ??= d2;
        zone = zones[zone] ?? ""; // timezone code to offset
        hour ??= "0";
        minute ??= "0";
        if (pm) hour = String((+hour % 12) + 12 * (pm == "p")); // to 24h range
        return `${year}-${month}-${day}T${hour}:${minute}${zone}`
               .replace(/(?<!\d)\d(?!\d)/g, "0$&"); // pad single digit numbers
    }
    
    const tests = [
        'Apr 21, 2023,06:51 pm EDT',
        'Apr 21, 2023 06:51pm EDT',
        'Apr 21, 2023 06:51 pm EDT',
        '7th Feb 2023',
        'Dec 6th, 2022 19:56 CET',
        'Jan 3rd, 2021; 0:04(IST)',
    ];
    
    for (const test of tests) console.log(toDateTimeStringFormat(test));

    Of course, this is limited, and would need extension if more input formats need to be supported.
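One limitation worth knowing about is that timezone abbreviations are ambiguous. The zones object above lists several codes more than once (cst, ist and bst, for example), and in a JavaScript object literal the last duplicate silently wins. The sketch below shows that behaviour on a small table and one possible way to pin the meanings you expect. The preferred map and the offsetFor helper are hypothetical and not part of the answer's code.

```javascript
// Duplicate keys in an object literal are legal; the last one wins.
const table = { cst: "-06:00", cst: "-05:00", cst: "+08:00" };
console.log(table.cst); // +08:00

// Hypothetical override layer: pin the meanings your own inputs use.
const preferred = { cst: "-06:00", ist: "+05:30" };

// Hypothetical helper: look in the overrides first, then in the general table.
function offsetFor(code, zones) {
    const key = code.toLowerCase();
    return preferred[key] ?? zones[key] ?? ""; // "" when the code is unknown
}

console.log(offsetFor("CST", table)); // -06:00
console.log(offsetFor("XYZ", table) === ""); // true
```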
