skip to Main Content

I would like to convert a list of H1 through H6 tags from a markdown file into a Javascript hierarchy for use in a Table of Contents.

Currently the list is generated by AstroJS in this format [{depth: 1, text: 'I am a H1'}, {depth: 2: 'I am a H2}].

Caveats

  • The markdown is created by end-users.
  • This list may have a single root heading (H1 -> H2 -> H3), but
  • It may have multiple root headings (H2 -> H3, H2 -> H3) or
  • It may have a non conventional list of headings (H3, H2 -> H3)
  • It may skip nesting levels (H1 -> H3 -> H6)

Looking for a Javascript or Typescript example.

The following three scenarios are based on some Markdown content that is being processed by an AstroJS website.

Single root heading

This standard SEO friendly set of headings has a single H1 followed by other headings

As markdown

# Main heading
## Sub heading 1
### More info
## Sub heading 2
### Even more info
## Sub heading 3 (edge case)
##### Deep nesting

As flat javascript array

headings = [
  { depth: 1, text: 'Main heading' },
  { depth: 2, text: 'Sub heading 1' },
  { depth: 3, text: 'More info' },
  { depth: 2, text: 'Sub heading 2' },
  { depth: 3, text: 'Even more info' },
  { depth: 2, text: 'Sub heading 3 (edge case)' },
  { depth: 6, text: 'Deep nesting' },
]

As javascript hierarchy

list_of_heading_heirachies = [
  { text: 'Main heading', headings: [
    { text: 'Sub heading 1', headings: [
      { text: 'More info', headings: [] },
    ] },
    { text: 'Sub heading 2', headings: [
      { text: 'Even more info', headings: [] },
    ] },
    { text: 'Sub heading 3 (edge case)', headings: [
      { text: 'Deep nesting', headings: [] },
    ] }
  ]}
]

console.log(list_of_heading_heirachies.length);
// => 1

Multiple root headings

This markdown (common to listicle pages) does not have a single root node like above, instead it has multiple H2s

As markdown

## Why is it done
### Why abc
### Why xyz
## How is it done
### How reason 1
### How reason 2
#### More info
## Conclusion

As flat javascript array

headings = [
  { depth: 2, 'Why is it done' },
  { depth: 3, 'Why abc' },
  { depth: 3, 'Why xyz' },
  { depth: 2, 'How is it done' },
  { depth: 3, 'How reason 1' },
  { depth: 3, 'How reason 2' },
  { depth: 4, 'More info' },
  { depth: 2, 'Conclusion' }
]

As javascript hierarchy

list_of_heading_heirachies = [
  { text: 'Why is it done', headings: [
    { text: 'Why abc', headings: [] },
    { text: 'Why xyz', headings: [] },
  ] },
  { text: 'How is it done', headings: [
    { text: 'How reason 1', headings: [] },
    { text: 'How reason 2', headings: [
      { text: 'More info', headings: [] },
    ] },
  ] },
  { text: 'Conclusion', headings: [] }
]

console.log(list_of_heading_heirachies.length);
// => 3

Non-conventional headings list

This non-conventional headings list happens when there is meta data or breadcrumb data before the general content headings

#### Home -> Blog -> Some Articles
### By Ben Hurr
#### 24th, Sep, 2022
# Some cool Article
## Why abc
### info on why
### more info on why
## How
### How we did it
## Conclusion

As flat javascript array

headings = [
  { depth: 4, text: 'Home -> Blog -> Some Articles' },
  { depth: 3, text: 'By Ben Hurr' },
  { depth: 4, text: '24th, Sep, 2022' },
  { depth: 1, text: 'Some cool Article' },
  { depth: 2, text: 'Why abc' },
  { depth: 3, text: 'info on why' },
  { depth: 3, text: 'more info on why' },
  { depth: 2, text: 'How' },
  { depth: 3, text: 'How we did it' },
  { depth: 2, text: 'Conclusion' },
]

As javascript hierarchy

list_of_heading_heirachies = [
  { text: 'Home -> Blog -> Some Articles', headings: [] },
  { text: 'By Ben Hurr', headings: [
    { text: '24th, Sep, 2022', headings: [] },
  ] },
  { text: 'Some cool Article', headings: [
    { text: 'Why abc', headings: [
      { text: 'info on why', headings: [] },
      { text: 'more info on why', headings: [] },
    ] },
    { text: 'How', headings: [
      { text: 'How we did it', headings: [] },
    ] },
    { text: 'Conclusion', headings: [] },
  ] },
]

console.log(list_of_heading_heirachies.length);
// => 3

2

Answers


  1. Chosen as BEST ANSWER

    I was able to solve this problem in Typescript.

    I am not experienced in this language and while it works fine, someone with more experience may be able to improve this code.

    My goal was to take any list of H1..6 and turn it into a Hierarchy and then filter to final Hierarchy to produce a useful Table of Content.

    Output Screen Shot

    enter image description here

    Heading.ts

    export default class Heading {
      text: string;
      slug: string;
      depth: number;
      sequence: number;
    
      parent: Heading | null;
      headings: Heading[] | null;
    
      constructor(text: string, slug: string, depth: number, sequence: number) {
        this.text = text;
        this.slug = slug;
        this.depth = depth;
        this.sequence = sequence;
    
        this.parent = null;
        this.headings = null;
      }
    
      add_heading(heading: Heading) {
        heading.parent = this;
        if (this.headings === null) {
          this.headings = [];
        }
        this.headings.push(heading);
      }
    
      root() {
        return this.depth === 0;
      }
    
      valid() {
        return this.depth > 0;
      }
    
      data() {
        let result: any = {
          text: this.text,
          slug: this.slug,
          depth: this.depth,
          sequence: this.sequence,
          headings: this.headings?.map((h) => h.data()),
        };
        return result;
      }
    }
    

    CreateToc.ts

    import Heading from './heading';
    
    export default class CreateToc {
      toc: Heading;
      current_heading: Heading;
    
      constructor() {
        this.toc = new Heading("*", '-', 0, -1);
        this.current_heading = this.toc;
      }
    
      process(astro_headings: any[]) {
        astro_headings.forEach((astro_heading, index) => {
          let heading = new Heading(
            astro_heading.text,
            astro_heading.slug,
            astro_heading.depth,
            index
          );
          this.process_heading(heading);
        });
        return this.data();
      }
    
      debug() {
        console.log(JSON.stringify(this.data(), null, 2));
      }
    
      private data() {
        return (this.toc.headings || []).map((h) => h.data());
      }
    
      private process_heading(heading: Heading) {
        if (heading.depth > this.current_heading.depth) {
          this.add_child_heading(heading);
        } else if (heading.depth === this.current_heading.depth) {
          this.add_sibling_heading(heading);
        } else if (heading.depth < this.current_heading.depth) {
          this.add_upstream_heading(heading);
        }
      }
    
      private add_child_heading(heading: Heading) {
        this.current_heading.add_heading(heading);
        this.current_heading = heading;
      }
    
      private add_sibling_heading(heading: Heading) {
        if (this.current_heading!.parent === null) {
          return;
        }
        this.current_heading = this.current_heading.parent;
        this.add_child_heading(heading);
      }
    
      private add_upstream_heading(heading: Heading) {
        while (this.current_heading.valid() && heading.depth < this.current_heading.depth) {
          if (this.current_heading.parent === null) {
            throw new Error("current_heading.parent is null");
          }
    
          this.current_heading = this.current_heading.parent;
        }
    
        // There is no upstream heading to attach to, so just add a new root node
        if (this.current_heading.root()) {
          return this.add_child_heading(heading);
        }
    
        this.add_sibling_heading(heading);
      }
    }
    

    Example Usage

    let filterHeadings = headings
      .filter((heading) => heading.depth === 2 || heading.depth === 3)
    
    let tocHeadings = (new CreateToc()).process(filterHeadings);
    

    Sample Data

    [
      {
        "text": "Traits for data variation",
        "slug": "traits-for-data-variation",
        "depth": 2,
        "sequence": 0,
        "headings": [
          {
            "text": "User with role flag",
            "slug": "user-with-role-flag",
            "depth": 3,
            "sequence": 1
          },
          {
            "text": "Blog article in draft mode",
            "slug": "blog-article-in-draft-mode",
            "depth": 3,
            "sequence": 2
          }
        ]
      },
      {
        "text": "Enum Traits (non ActiveRecord)",
        "slug": "enum-traits-non-activerecord",
        "depth": 2,
        "sequence": 3,
        "headings": [
          {
            "text": "#traits_for_enums",
            "slug": "traits_for_enums",
            "depth": 3,
            "sequence": 4
          }
        ]
      },
      {
        "text": "Get a list of available traits",
        "slug": "get-a-list-of-available-traits",
        "depth": 2,
        "sequence": 5,
        "headings": [
          {
            "text": "#defined_traits",
            "slug": "defined_traits",
            "depth": 3,
            "sequence": 6
          }
        ]
      },
      {
        "text": "Global traits",
        "slug": "global-traits",
        "depth": 2,
        "sequence": 7
      }
    ]
    

  2. That seems like a lot of code for the problem.

    I would fold the list of nodes into an array of nodes still available to add children to, sorted by depth. We start with an initial node of depth 0. At the end we can simply extract the children of the first node.

    If we don’t mind keeping depth properties on your node, that’s all it takes. If we really don’t want them, then this can be wrapped in a simple function that recursively removes these properties.

    Here’s one version:

    const removeDepth = (xs) => xs .map (
      ({headings = [], depth, ...rest}) => ({...rest, headings: removeDepth(headings)})
    )
    const makeTree = (xs) => removeDepth (xs .reduce (
      (hs, {depth, text}) => {
        const idx = hs .findLastIndex (x => x .depth < depth)
        const node = {text, headings: [], depth}
        hs [idx] .headings .push (node)
        return [...hs.slice (0, idx + 1), node]
      },
      [{text: null, headings: [], depth: 0}]
    ) [0] .headings)
    
    
    const singleRoot = [{depth: 1, text: 'Main heading'}, {depth: 2, text: 'Sub heading 1'}, {depth: 3, text: 'More info'}, {depth: 2, text: 'Sub heading 2'}, {depth: 3, text: 'Even more info'}, {depth: 2, text: 'Sub heading 3 (edge case)'}, {depth: 6, text: 'Deep nesting'}]
    const multipleRoot = [{depth: 2, text: 'Why is it done'}, {depth: 3, text: 'Why abc'}, {depth: 3, text: 'Why xyz'}, {depth: 2, text: 'How is it done'}, {depth: 3, text: 'How reason 1'}, {depth: 3, text: 'How reason 2'}, {depth: 4, text: 'More info'}, {depth: 2, text: 'Conclusion'}]
    const nonCon = [{depth: 4, text: 'Home -> Blog -> Some Articles'}, {depth: 3, text: 'By Ben Hurr'}, {depth: 4, text: '24th, Sep, 2022'}, {depth: 1, text: 'Some cool Article'}, {depth: 2, text: 'Why abc'}, {depth: 3, text: 'info on why'}, {depth: 3, text: 'more info on why'}, {depth: 2, text: 'How'}, {depth: 3, text: 'How we did it'}, {depth: 2, text: 'Conclusion'}]
    
    console .log ('Single root: ', makeTree (singleRoot))
    console .log ('Multiple root: ', makeTree (multipleRoot))
    console .log ('Non-conventional headings list: ', makeTree (nonCon))
    .as-console-wrapper {max-height: 100% !important; top: 0}

    The real trick is maintaining the current list of open arrays. We maintain it by finding the last node in the list with a lower depth than our current node, adding our current node as a child of that one, cutting the list at that point, and adding our current node to it. Here’s how that array goes as we add node after node from one of your examples:

    //Initial list: 
    [
      {depth: 0, headings: [], text: null}
    ]
    
    // After breadcrumbs (i.e. "Home -> Blog -> Some Articles"): (4)
    [
      {depth: 0, headings: [<breadcrumbs>], text: null}
      {depth: 4, headings: [], text: "Home -> Blog -> Some Articles"}
    ]
    
    // After "By Ben Hurr": (3):
    [
      {depth: 0, headings: [<breadcrumbs>, <By Ben>], text: null}
      {depth: 3, headings: [], text: "By Ben Hurr"}
    ]
    
    
    // After "24th Sept" (4):
    [
      {depth: 0, headings: [<breadcrumbs>, <By Ben>], text: null}
      {depth: 3, headings: [<24th Sept>], text: "By Ben Hurr"}
      {depth: 4, headings: [], text: "24th, Sep, 2022"}
    ]
    
    
    // After "Some Cool Article" (1):
    [
      {depth: 0, headings: [<breadcrumbs>, <By Ben>, <Some Cool>], text: null}
      {depth: 1, headings: [], text: 'Some Cool Article'}
    ]
    
    // After "Why ABC" (2):
    [
      {depth: 0, headings: [<breadcrumbs>, <By Ben>, <Some Cool>], text: null}
      {depth: 1, headings: [<Why abc>], text: 'Some Cool Article'}
      {depth: 2, headings: [], text: 'Why abc'}
    ]
    
    // ...
    
    // Finally
    
    // After "Conclusion" (2):
    [
      {depth: 0, headings: [<breadcrumbs>, <By Ben>, <Some Cool>], text: null}
      {depth: 1, headings: [<Why abc>, <How>, <Conclusion>], text: 'Some Cool Article'}
      {depth: 2, headings: [], text: 'Conclusion'}
    ]
    

    I think this is significantly simpler than your approach, but I must admit to not having read your approach carefully; it simply seemed too much code for an interesting, but not overly complex, problem.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search