You're researching a topic and want the canonical one-paragraph definition — the lede that Wikipedia editors fight over for months. Manually clicking through 20 articles to copy each intro takes longer than the research itself. This template grabs the title and the first body paragraph from any Wikipedia URL, in any language edition, and logs them so you can paste a clean batch into your notes. Useful for academic literature reviews, building a glossary, or feeding LLM context windows with vetted definitions.
How this workflow works
Six blocks in a straight chain. The structure is deliberately minimal so it stays robust across Wikipedia's 300+ language editions, which all share the same DOM structure.
- manual_trigger: Sidepanel Run button. Exposes one input called article of type url, defaulting to the Service Worker article. Override per run by typing or pasting any other Wikipedia URL.
- navigate: Opens {{vars.input.article}}. Because targetTab is "new", this happens in a fresh tab.
- wait_for: Waits for the #firstHeading element to render. That's the ID Wikipedia uses for the article title, consistent across desktop and mobile skins, and across all language editions.
- get_text (first one): Reads the text of #firstHeading. Captured as $('Capture article title').text.
- get_text (second one): Reads the first body paragraph using the selector #mw-content-text .mw-parser-output > p:not(.mw-empty-elt). The :not(.mw-empty-elt) filter is critical: Wikipedia includes invisible spacer paragraphs at the top of some articles, and without that filter you'd get an empty string. matchFirst: true makes sure only the first matching paragraph is captured.
- log_data: Concatenates "{{title}} — {{first paragraph}}" and writes it to the workflow log with the label wikipedia. You'll see this in the sidepanel Run history.
The whole flow runs in 1-2 seconds on a reasonable connection.
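To see what the two get_text selectors actually pick, here is a minimal Python sketch that mimics them outside the browser using only the standard library. It is not how the workflow itself runs: LedeParser, extract_title_and_lede, and the SAMPLE markup are hypothetical names and placeholder HTML made up for illustration, and the parser assumes well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LedeParser(HTMLParser):
    """Hypothetical helper: captures #firstHeading and the first lede <p>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []          # (tag, id, classes) per open element
        self.capture = None      # None, "title", or "lede"
        self.capture_depth = 0   # stack depth of the element being captured
        self.done = set()        # captures that have already finished
        self.parts = {"title": [], "lede": []}

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr = dict(attrs)
        classes = set((attr.get("class") or "").split())
        parent = self.stack[-1] if self.stack else None
        inside_content = any(el[1] == "mw-content-text" for el in self.stack)
        self.stack.append((tag, attr.get("id"), classes))
        if self.capture is not None:
            return
        if attr.get("id") == "firstHeading" and "title" not in self.done:
            self.capture = "title"
        elif (tag == "p" and "lede" not in self.done and inside_content
              and parent is not None and "mw-parser-output" in parent[2]
              and "mw-empty-elt" not in classes):
            # Mirrors: #mw-content-text .mw-parser-output > p:not(.mw-empty-elt)
            self.capture = "lede"
        if self.capture is not None:
            self.capture_depth = len(self.stack)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Pop back to the nearest matching open element, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break
        if self.capture is not None and len(self.stack) < self.capture_depth:
            self.done.add(self.capture)  # like matchFirst: stop after one
            self.capture = None

    def handle_data(self, data):
        if self.capture is not None:
            self.parts[self.capture].append(data)


def extract_title_and_lede(html: str) -> tuple[str, str]:
    parser = LedeParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace the way a rendered page would.
    title = " ".join("".join(parser.parts["title"]).split())
    lede = " ".join("".join(parser.parts["lede"]).split())
    return title, lede


# Placeholder markup, not copied from a real article.
SAMPLE = """
<h1 id="firstHeading"><span>Service worker</span></h1>
<div id="mw-content-text">
  <div class="mw-parser-output">
    <div class="hatnote">Example hatnote.</div>
    <p class="mw-empty-elt"></p>
    <table class="infobox"><tr><td><p>Sidebar text</p></td></tr></table>
    <p>A <b>service worker</b> is placeholder text <br>for this sketch.</p>
    <p>Second paragraph.</p>
  </div>
</div>
"""

if __name__ == "__main__":
    print(extract_title_and_lede(SAMPLE))
```

The spacer paragraph is skipped because of its mw-empty-elt class, and the paragraph inside the infobox is skipped because it is not a direct child of .mw-parser-output, which is the same reason the child combinator (>) matters in the workflow's selector.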
Customising it for your case
The template is a starting point. Here are a few directions you can take it in.
- Capture more sections. Add another get_text block targeting #toc to grab the table of contents (on skins that render it inside the article body), or .infobox to grab the data sidebar (Wikipedia's coloured box with key facts on the right side of an article).
- Switch language editions. No change needed. The default URL points to en.wikipedia.org, but pasting de.wikipedia.org/wiki/Service-Worker or ja.wikipedia.org/wiki/... works without selector changes, because the #firstHeading and .mw-parser-output selectors are global.
- Save instead of log. Replace the final log_data block with save_file. Set source to the same concatenated string and filename to "{{$('Capture article title').text}}.txt", and you get one text file per article.
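One thing to watch when a title becomes a filename: some article titles contain characters that file systems reject (a slash, a question mark). Whether save_file cleans these up on its own is not stated here, so the following is only a sketch of the kind of clean-up you might apply first. The function name safe_filename and the "untitled" fallback are assumptions made for illustration.

```python
import re


def safe_filename(title: str, ext: str = ".txt") -> str:
    """Hypothetical helper: turn an article title into a portable filename."""
    # Replace characters that Windows, macOS, or Linux refuse in file names.
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", title)
    # Leading/trailing spaces and dots cause trouble on some platforms.
    cleaned = cleaned.strip(" .")
    return (cleaned or "untitled") + ext


if __name__ == "__main__":
    for t in ["Service worker", "AC/DC", ""]:
        print(repr(t), "->", safe_filename(t))
```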
Common gotchas
Wikipedia's article DOM is extremely stable, but a few edge cases trip people up. Disambiguation pages (e.g. /wiki/Apple) have a different structure: the first paragraph is usually a one-line "Apple may refer to..." note instead of a real definition. Articles marked as stubs may have only one short sentence. On mobile redirects (en.m.wikipedia.org), the wait_for still works but performance is slower because the mobile skin loads more JavaScript. If a captured paragraph is empty, your article probably has a .hatnote element pushing the real lede down. In that case, change the end of the paragraph selector to > p:not(.mw-empty-elt):not(.hatnote).
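If mobile links keep slipping into your URL list, one option is to rewrite them to the desktop host before they reach the navigate block. The sketch below does that in Python with the standard library; to_desktop_url is a hypothetical name, and the rule it applies (drop the "m" label from a four-part wikipedia.org host) is an assumption based on the en.m.wikipedia.org pattern mentioned above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_desktop_url(url: str) -> str:
    """Hypothetical helper: en.m.wikipedia.org -> en.wikipedia.org."""
    parts = urlsplit(url)
    labels = parts.netloc.split(".")
    # Only touch hosts shaped like <lang>.m.wikipedia.org.
    if len(labels) == 4 and labels[1] == "m" and labels[2:] == ["wikipedia", "org"]:
        labels.pop(1)
    return urlunsplit(parts._replace(netloc=".".join(labels)))


if __name__ == "__main__":
    print(to_desktop_url("https://en.m.wikipedia.org/wiki/Service_worker"))
```

URLs that are already on a desktop host, or on another site entirely, pass through unchanged.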
FAQ
Do I need an API key? No. Wikipedia is fully open. There's an official API at en.wikipedia.org/api/rest_v1/page/summary/<title> that returns the article's intro as JSON if you'd rather skip the DOM scrape. To use it, replace this template's navigate, wait_for, and get_text blocks with one http_request block.
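For reference, here is roughly what that request looks like outside the workflow, as a Python sketch using only the standard library. The function names and the User-Agent string are made up for illustration; the sketch assumes the JSON response carries the article title in a title field and the intro text in an extract field.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def summary_url(title: str, lang: str = "en") -> str:
    """Build the page-summary URL for a title in a given language edition."""
    # Spaces become underscores; everything else (including "/") is escaped.
    slug = quote(title.replace(" ", "_"), safe="")
    return f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{slug}"


def parse_summary(payload: dict) -> tuple[str, str]:
    """Pull (title, intro) out of a decoded summary response."""
    return payload.get("title", ""), payload.get("extract", "")


def fetch_summary(title: str, lang: str = "en") -> tuple[str, str]:
    """Performs a real network request; not exercised in the demo below."""
    # Placeholder User-Agent: replace with something that identifies you.
    req = Request(summary_url(title, lang),
                  headers={"User-Agent": "lede-notes-sketch/0.1 (you@example.com)"})
    with urlopen(req, timeout=10) as resp:
        return parse_summary(json.load(resp))


if __name__ == "__main__":
    print(summary_url("Service worker"))
```

In the workflow, the http_request block would call the same URL and you would read the two fields from its JSON result instead of running two get_text blocks.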
Does it work on Wikimedia Commons or Wiktionary? Partially. Both run MediaWiki, so #firstHeading should be present on each, but the first body paragraph there is often not a definition. Test before relying on it.
How is this different from copy-paste? You can run it in a loop. Chain it with a loop block iterating over an array of article URLs and, at 1-2 seconds per article, you get a 100-article extract in a few minutes. Automa and Browserflow both have equivalent loop blocks; the selectors port directly.
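The loop itself is simple. As a rough Python sketch of the control flow (not the loop block's actual implementation), the function below runs one extraction per URL and records a failure instead of aborting the batch. extract_batch and its extract_one callback are hypothetical names; the callback stands in for the navigate, wait_for, and get_text chain.

```python
import time
from typing import Callable, Iterable


def extract_batch(
    urls: Iterable[str],
    extract_one: Callable[[str], tuple[str, str]],
    delay_s: float = 0.0,
) -> list[tuple[str, str, str]]:
    """Run extract_one on each URL; return (url, title, paragraph) rows."""
    rows = []
    for url in urls:
        try:
            title, paragraph = extract_one(url)
        except Exception as exc:
            # Keep going: one bad article should not lose the whole batch.
            rows.append((url, "", f"error: {exc}"))
            continue
        rows.append((url, title, paragraph))
        if delay_s:
            time.sleep(delay_s)  # optional pause between requests
    return rows


if __name__ == "__main__":
    # Stand-in extractor so the sketch runs without a browser or network.
    def fake_extract(url: str) -> tuple[str, str]:
        return url.rsplit("/", 1)[-1], "placeholder paragraph"

    for row in extract_batch(["https://en.wikipedia.org/wiki/Service_worker"], fake_extract):
        print(row)
```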
Will it break on a Wikipedia redesign? Unlikely. Wikipedia rolled out a new desktop skin (Vector 2022) without breaking these selectors. The IDs #firstHeading and #mw-content-text are part of MediaWiki's core HTML output, not theme-specific.