
The Dataview That Pulled Everything


Date: December 26, 2023
Messages: 52
Issue: Extract text from subheadings across journal entries
Result: A query that aggregates content dynamically


The Goal

I wanted to pull specific content from my journal entries. Not the whole entry, just the text under a specific heading called “Person.”

Structure of journal entries:

Journaling/
├── 2023/
│   ├── 12/
│   │   ├── 2023-12-01.md
│   │   ├── 2023-12-02.md
│   │   └── ...

Each entry had sections:

## Work

...

## Person

Thoughts about someone specific today.

## Reflections

...

I wanted a page that showed all “Person” sections, chronologically, across all entries.


The First Attempts

Attempt 1: Basic LIST

LIST
FROM "Journaling"

Returns filenames. Not content. Useless.

Attempt 2: TABLE with file content

TABLE file.content
FROM "Journaling"

Returns entire files. Way too much.

Attempt 3: Searching for heading

LIST
FROM "Journaling"
WHERE contains(file.content, "## Person")

Returns files that contain the heading. Still not the content itself.


The Problem

Dataview doesn’t natively extract text under a specific heading. It can see frontmatter fields. It can see the whole file. But “give me just the content under ## Person” isn’t built in.

The solution: DataviewJS.


The Working Query

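// All pages under the Journaling folder, oldest first (filenames are dates)
const pages = dv.pages('"Journaling"')
  .sort(p => p.file.name, 'asc');

for (let page of pages) {
  // dv.io.load is async and resolves to the file's raw text
  const content = await dv.io.load(page.file.path);
  const personSection = extractSection(content, "Person");

  // Skip entries that have no "Person" section
  if (personSection) {
    dv.header(3, page.file.link);
    dv.paragraph(personSection);
    dv.paragraph("---");
  }
}

// Return the text under "## <heading>", up to the next "## " heading or end of file
function extractSection(content, heading) {
  const regex = new RegExp(`## ${heading}\\n([\\s\\S]*?)(?=\\n## |$)`, 'i');
  const match = content.match(regex);
  return match ? match[1].trim() : null;
}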

How It Works

  1. Get all pages from the Journaling folder
  2. Sort by filename (which is the date)
  3. Load each file’s content as text
  4. Extract the section between “## Person” and the next ## heading
  5. Display it with the file link as a header

The regex does the heavy lifting:

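  • ## ${heading}\\n - Match the heading line and its newline (backslashes are doubled because the pattern lives in a template string)
  • ([\\s\\S]*?) - Capture everything after it, newlines included, but as little as possible
  • (?=\\n## |$) - Stop just before the next ## heading, or at the end of the file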
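To see the regex on its own, here is a minimal sketch that runs the same helper against an in-memory string instead of a file loaded through Dataview. The sample entry text is invented for illustration.

```javascript
// Same helper as in the query, applied to a plain string.
function extractSection(content, heading) {
  const regex = new RegExp(`## ${heading}\\n([\\s\\S]*?)(?=\\n## |$)`, "i");
  const match = content.match(regex);
  return match ? match[1].trim() : null;
}

// Hypothetical journal entry, made up for this example.
const sampleEntry = [
  "## Work",
  "",
  "Shipped the backup script.",
  "",
  "## Person",
  "",
  "Thoughts about someone specific today.",
  "Second line of the same section.",
  "",
  "## Reflections",
  "",
  "Slept badly.",
].join("\n");

// Middle section: capture stops just before "\n## Reflections".
console.log(extractSection(sampleEntry, "Person"));
// Last section: capture runs to the end of the string.
console.log(extractSection(sampleEntry, "Reflections")); // "Slept badly."
// Heading that does not exist in the entry.
console.log(extractSection(sampleEntry, "Health")); // null
```

The first call prints both lines of the "Person" section and nothing from "Reflections", which is the behavior the lookahead is there to guarantee.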

The 52-Message Journey

  • Messages 1-10: “How do I get content from a heading?”
  • Messages 11-20: “Dataview doesn’t do that natively”
  • Messages 21-30: “Try DataviewJS”
  • Messages 31-40: “The regex isn’t matching”
  • Messages 41-50: “It matches but includes the next heading”
  • Message 51: “Add a non-greedy quantifier”
  • Message 52: “IT WORKS”
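The jump from "includes the next heading" to "IT WORKS" comes down to one character. The sketch below compares a greedy and a non-greedy version of the capture group on a made-up two-section entry. The exact intermediate regex from the chat is not recorded here, so the greedy pattern is an assumption about what it looked like.

```javascript
// Hypothetical entry with two sections.
const entry = "## Person\n\nCalled an old friend.\n\n## Reflections\n\nLong day.";

// Greedy: [\s\S]* takes as much as it can, so it runs to the end of the file
// and the "$" alternative in the lookahead is what finally succeeds.
const greedy = /## Person\n([\s\S]*)(?=\n## |$)/i;

// Non-greedy: [\s\S]*? grows one character at a time and stops at the first
// position where the lookahead sees "\n## ".
const lazy = /## Person\n([\s\S]*?)(?=\n## |$)/i;

const greedyText = entry.match(greedy)[1].trim();
const lazyText = entry.match(lazy)[1].trim();

console.log(greedyText.includes("## Reflections")); // true: next heading leaked in
console.log(lazyText); // "Called an old friend."
```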


The Result

A single page that shows every “Person” section I’ve ever written, in order. Scroll through months of thoughts. See patterns. Remember moments.

The query updates automatically. Write a new journal entry with a “Person” section, and it appears in the aggregation.


Variations

Pull “Technical” sections:

// Same code, change "Person" to "Technical"
const personSection = extractSection(content, "Technical");
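If you want several headings from the same entry, one option is a small wrapper around the helper. This is a sketch: `extractSections` is a hypothetical name, not part of the original query, and the sample entry is invented.

```javascript
function extractSection(content, heading) {
  const regex = new RegExp(`## ${heading}\\n([\\s\\S]*?)(?=\\n## |$)`, "i");
  const match = content.match(regex);
  return match ? match[1].trim() : null;
}

// Hypothetical helper: collect several sections into one object,
// leaving out headings that are missing or empty.
function extractSections(content, headings) {
  const result = {};
  for (const heading of headings) {
    const text = extractSection(content, heading);
    if (text) result[heading] = text;
  }
  return result;
}

// Made-up entry for the example.
const entry = [
  "## Work",
  "",
  "Fixed the DNS issue.",
  "",
  "## Technical",
  "",
  "Learned about lookaheads.",
  "",
  "## Person",
  "",
  "Coffee with a friend.",
].join("\n");

const sections = extractSections(entry, ["Technical", "Person", "Health"]);
console.log(sections);
// { Technical: "Learned about lookaheads.", Person: "Coffee with a friend." }
```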

Filter by date range (here, one month, by narrowing the folder path):

const pages = dv.pages('"Journaling/2023/12"');
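Narrowing the folder only works at month or year granularity. For an arbitrary range, one approach is to compare the filenames directly, since YYYY-MM-DD strings sort in date order. The sketch below is plain JavaScript on a list of names. `inDateRange` is a hypothetical helper, and the DataviewJS wiring in the last comment is an untested assumption.

```javascript
// Hypothetical helper: true when fileName looks like YYYY-MM-DD and falls
// inside [start, end], inclusive. ISO dates compare correctly as strings.
function inDateRange(fileName, start, end) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(fileName)) return false;
  return fileName >= start && fileName <= end;
}

// Made-up file names, including one that is not a date.
const names = [
  "2023-11-30",
  "2023-12-01",
  "2023-12-15",
  "2023-12-26",
  "2024-01-02",
  "README",
];

const selected = names.filter((n) => inDateRange(n, "2023-12-01", "2023-12-26"));
console.log(selected); // ["2023-12-01", "2023-12-15", "2023-12-26"]

// In DataviewJS the same predicate could presumably be applied with
// dv.pages('"Journaling"').where(p => inDateRange(p.file.name, start, end));
// that line is a sketch and is not executed here.
```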

Only entries where the section has real content:

if (personSection && personSection.length > 10) {
  // Only show if there's actual content
}
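One more variation, offered as a sketch and not something from the original chat: the working regex will also match "## Person" inside a "### Person" line, and a heading containing regex characters such as parentheses will not match literally. The version below anchors the heading to the start of a line and escapes it first. `escapeRegExp` and `extractSectionStrict` are hypothetical names.

```javascript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Stricter variant: the heading must start a line, may have trailing spaces,
// and Windows line endings (\r\n) are tolerated.
function extractSectionStrict(content, heading) {
  const pattern = new RegExp(
    `(?:^|\\n)## ${escapeRegExp(heading)}[ \\t]*\\r?\\n([\\s\\S]*?)(?=\\r?\\n## |$)`,
    "i"
  );
  const match = content.match(pattern);
  return match ? match[1].trim() : null;
}

// A level-3 "Person" heading comes first; only the level-2 one should match.
const nested = "### Person\n\nnested note\n\n## Person\n\nreal note\n";
console.log(extractSectionStrict(nested, "Person")); // "real note"

// A heading with regex metacharacters is matched literally.
const special = "## Q&A (work)\n\nnotes";
console.log(extractSectionStrict(special, "Q&A (work)")); // "notes"
```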

What I Learned

Dataview is declarative. DataviewJS is imperative. When the declarative query language can't express what you want, DataviewJS lets you write the logic yourself in JavaScript.

Regex for content extraction. The [\s\S]*? pattern matches across newlines. Critical for multi-line sections.
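A quick illustration of why the dot is not enough: without the `s` flag, `.` in JavaScript does not match a newline, so a multi-line section never reaches the lookahead. The sample text is invented.

```javascript
// Hypothetical two-line section followed by another heading.
const section = "## Person\nFirst line.\nSecond line.\n## Reflections\nDone.";

// "." stops at the first newline, the lookahead fails there, and no match is found.
const dotPattern = /## Person\n(.*?)(?=\n## |$)/;

// [\s\S] matches any character, newline included.
const anyCharPattern = /## Person\n([\s\S]*?)(?=\n## |$)/;

const dotMatch = section.match(dotPattern);
const anyCharMatch = section.match(anyCharPattern);

console.log(dotMatch); // null
console.log(anyCharMatch[1]); // "First line.\nSecond line."
```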

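File loading is async. dv.io.load() returns a Promise, so the await is necessary. Forget it and content is a Promise instead of a string, which means content.match() fails and nothing works.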
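The effect of a missing await can be reproduced outside Obsidian. In this sketch `fakeLoad` is a hypothetical stand-in for dv.io.load(); it only assumes that the real call resolves to the file's text.

```javascript
// Hypothetical stand-in for dv.io.load(): resolves to some file text.
async function fakeLoad(path) {
  return `## Person\n\nNotes loaded from ${path}`;
}

// Without await: the value is a Promise, which has no .match method.
const notAwaited = fakeLoad("Journaling/2023/12/2023-12-01.md");
console.log(notAwaited instanceof Promise); // true
console.log(typeof notAwaited.match); // "undefined"

// With await: the value is the string itself, so string methods work.
async function main() {
  const content = await fakeLoad("Journaling/2023/12/2023-12-01.md");
  console.log(typeof content); // "string"
  console.log(content.includes("## Person")); // true
  return content;
}

const done = main();
```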

52 messages isn’t failure. It’s learning. Each wrong attempt taught something about how Dataview works.


The Bigger Picture

This query was about more than extracting text. It was about:

  • Aggregating scattered thoughts into patterns
  • Seeing how feelings about someone evolved over time
  • Building a second brain that can answer complex questions

“What have I written about [person] over the last year?” is now a one-click answer.


When the built-in tools don’t do what you need, you write code. That’s what the JS variant is for.