The Dataview That Pulled Everything
Date: December 26, 2023
Messages: 52
Issue: Extract text from subheadings across journal entries
Result: A query that aggregates content dynamically
The Goal
I wanted to pull specific content from my journal entries. Not the whole entry - just the text under a specific heading called “Person.”
Structure of journal entries:
Journaling/
├── 2023/
│ ├── 12/
│ │ ├── 2023-12-01.md
│ │ ├── 2023-12-02.md
│ │ └── ...
Each entry had sections:
## Work
...
## Person
Thoughts about someone specific today.
## Reflections
...
I wanted a page that showed all “Person” sections, chronologically, across all entries.
The First Attempts
Attempt 1: Basic LIST
LIST
FROM "Journaling"
Returns filenames. Not content. Useless.
Attempt 2: TABLE with file content
TABLE file.content
FROM "Journaling"
Returns entire files. Way too much.
Attempt 3: Searching for heading
LIST
FROM "Journaling"
WHERE contains(file.content, "## Person")
Returns files that contain the heading. Still not the content itself.
The Problem
Dataview doesn’t natively extract text under a specific heading. It can see frontmatter fields. It can see the whole file. But “give me just the content under ## Person” isn’t built in.
The solution: DataviewJS.
The Working Query
// Collect every journal page, oldest first (filenames are ISO dates).
const pages = dv.pages('"Journaling"')
    .sort(p => p.file.name, 'asc');

for (let page of pages) {
    // dv.io.load is async and resolves to the file's raw text.
    const content = await dv.io.load(page.file.path);
    const personSection = extractSection(content, "Person");
    if (personSection) {
        dv.header(3, page.file.link);
        dv.paragraph(personSection);
        dv.paragraph("---");
    }
}

// Return the text under "## <heading>", up to the next "## " heading or end of file.
function extractSection(content, heading) {
    const regex = new RegExp(`## ${heading}\\n([\\s\\S]*?)(?=\\n## |$)`, 'i');
    const match = content.match(regex);
    return match ? match[1].trim() : null;
}
How It Works
- Get all pages from the Journaling folder
- Sort by filename (which is the date)
- Load each file’s content as text
- Extract the section between “## Person” and the next heading
- Display it with the file link as a header
The regex does the heavy lifting:
- ## ${heading}\\n : Match the heading
- ([\\s\\S]*?) : Capture everything (including newlines), as little as possible
- (?=\\n## |$) : Until the next heading or end of file
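The regex above works for my entries, but it has a few soft spots: it also matches inside a “### Person” subheading, it treats special characters in the heading as regex syntax, it misses files with Windows line endings, and an empty section swallows the one after it. Here is a minimal sketch of a stricter variant. The names escapeRegExp and extractSectionStrict are my own for illustration, not part of Dataview, and the function is plain JavaScript so it can be tried outside Obsidian.

```javascript
// Escape characters that have special meaning in a regular expression,
// so a heading such as "Q&A (work)" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical stricter variant of extractSection (not a Dataview API).
function extractSectionStrict(content, heading) {
  // Normalize Windows line endings so "\n" is the only line separator.
  const normalized = content.replace(/\r\n/g, "\n");
  const regex = new RegExp(
    // The heading must start a line, and the whole line must be the heading.
    `(?:^|\\n)## ${escapeRegExp(heading)}[ \\t]*(?=\\n|$)` +
      // Lazily capture up to the next "## " heading or the end of the text.
      `([\\s\\S]*?)(?=\\n## |$)`,
    "i"
  );
  const match = normalized.match(regex);
  return match ? match[1].trim() : null;
}

const entry = [
  "## Work",
  "Shipped the report.",
  "## Person",
  "Alice called.",
  "Long talk.",
  "## Reflections",
  "Tired.",
].join("\n");

console.log(extractSectionStrict(entry, "Person"));
```

An empty section comes back as an empty string, which is falsy, so the existing if (personSection) check still skips it. Subheadings such as “### Details” inside the section stay in the captured text, the same as with the original regex.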
The 52-Message Journey
- Message 1-10: “How do I get content from a heading?”
- Message 11-20: “Dataview doesn’t do that natively”
- Message 21-30: “Try DataviewJS”
- Message 31-40: “The regex isn’t matching”
- Message 41-50: “It matches but includes the next heading”
- Message 51: “Add a non-greedy quantifier”
- Message 52: “IT WORKS”
The Result
A single page that shows every “Person” section I’ve ever written, in order. Scroll through months of thoughts. See patterns. Remember moments.
The query updates automatically. Write a new journal entry with a “Person” section, and it appears in the aggregation.
Variations
Pull “Technical” sections:
// Same code, change "Person" to "Technical"
const personSection = extractSection(content, "Technical");
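If more than one heading is interesting, the heading name can become a parameter instead of being edited by hand. A rough sketch, assuming the same extractSection as in the working query; extractSections is a hypothetical helper of mine, not something Dataview provides.

```javascript
// Same regex as in the working query.
function extractSection(content, heading) {
  const regex = new RegExp(`## ${heading}\\n([\\s\\S]*?)(?=\\n## |$)`, "i");
  const match = content.match(regex);
  return match ? match[1].trim() : null;
}

// Hypothetical helper: collect several sections from one entry.
// Headings that are missing or empty are left out of the result.
function extractSections(content, headings) {
  const found = {};
  for (const heading of headings) {
    const text = extractSection(content, heading);
    if (text) found[heading] = text;
  }
  return found;
}

const sample = "## Work\nMeetings.\n## Person\nCoffee with Sam.\n## Technical\nFixed the regex.";
const sections = extractSections(sample, ["Person", "Technical", "Missing"]);
console.log(sections);
```

Inside the DataviewJS loop, the returned object could be walked with Object.entries to render one paragraph per heading.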
Filter by date range (here, one month’s folder):
const pages = dv.pages('"Journaling/2023/12"');
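Narrowing the folder path only works when the range lines up with a folder. For an arbitrary range, one option is to compare the filenames themselves, since YYYY-MM-DD names sort the same way as the dates they represent. A small sketch, with inDateRange as a hypothetical helper name:

```javascript
// Hypothetical helper: true when a filename like "2023-12-05"
// falls within [start, end], both given as YYYY-MM-DD strings.
function inDateRange(fileName, start, end) {
  // Skip files whose names are not ISO dates (templates, index notes, ...).
  if (!/^\d{4}-\d{2}-\d{2}$/.test(fileName)) return false;
  // ISO dates compare correctly as plain strings.
  return fileName >= start && fileName <= end;
}

const names = ["2023-11-30", "2023-12-01", "2023-12-15", "2024-01-02", "README"];
const lateDecember = names.filter((n) => inDateRange(n, "2023-12-10", "2023-12-31"));
console.log(lateDecember);

// In DataviewJS this would be used roughly as:
// dv.pages('"Journaling"').where(p => inDateRange(p.file.name, "2023-12-10", "2023-12-31"))
```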
Only entries with real content in the section:
if (personSection && personSection.length > 10) {
    // Only show if there's actual content
}
What I Learned
Dataview is declarative. DataviewJS is imperative. When the declarative approach fails, JavaScript can do anything.
Regex for content extraction. The [\s\S]*? pattern matches across newlines. Critical for multi-line sections.
File loading is async. The await dv.io.load() is necessary. Forget it and content is a Promise instead of a string, so content.match() throws and nothing renders.
52 messages isn’t failure. It’s learning. Each wrong attempt taught something about how Dataview works.
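On the async point, the failure is easy to reproduce outside Obsidian. This is a sketch only: fakeLoad is a stand-in I made up to mimic an async loader that resolves to file text, not the real dv.io.load.

```javascript
// Stand-in for an async file loader (hypothetical, for illustration).
async function fakeLoad(path) {
  return `## Person\nNotes from ${path}`;
}

// Without await: the result is a Promise, not a string.
const pending = fakeLoad("2023-12-01.md");
console.log(pending instanceof Promise); // true
console.log(typeof pending.match);       // "undefined", so pending.match(...) would throw

// With await (inside an async function): the text is available.
async function main() {
  const content = await fakeLoad("2023-12-01.md");
  console.log(typeof content.match);     // "function"
}
main();
```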
The Bigger Picture
This query was about more than extracting text. It was about:
- Aggregating scattered thoughts into patterns
- Seeing how feelings about someone evolved over time
- Building a second brain that can answer complex questions
“What have I written about [person] over the last year?” is now a one-click answer.
When the built-in tools don’t do what you need, you write code. That’s what the JS variant is for.