Support indexing by an LLM #4545

Closed
4 tasks done
BryanHuntNV opened this issue Feb 5, 2025 · 5 comments · Fixed by #4635
Labels
need more info Further information is requested

Comments

@BryanHuntNV

Is your feature request related to a problem? Please describe.

Our documentation needs to be indexed by an LLM to support natural language searching across multiple tools. The dynamic nature of routing between pages makes this challenging.

Describe the solution you'd like

It would be great if VitePress could somehow make indexing available to external LLMs.

Describe alternatives you've considered

The alternative we are considering is using MkDocs with a plugin for publishing the content to Confluence.

Additional context

No response

@brc-dd
Member

brc-dd commented Feb 5, 2025

I can't tell without knowing what kind of dynamic content you have, but here are a few options:

  • Just index the markdown source.
  • Enable the sitemap and scrape each URL with cheerio, or with a headless browser driven by Playwright or Puppeteer. A simple fetch will work if most of the content is static.
  • There are many tools and packages that can generate LLM-friendly structured data, for example https://github.com/egoist/sitefetch
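The sitemap option can be sketched roughly as follows, assuming the built site is mostly static so a plain fetch is enough. The function names (`extractSitemapUrls`, `htmlToText`, `scrapeSite`) are hypothetical, and the regex-based tag stripping is a crude stand-in for a real HTML parser such as cheerio. It also assumes a runtime with a global `fetch` (Node 18+).

```typescript
// Pull every <loc> entry out of a sitemap.xml string.
// Assumption: URLs contain no XML entities that need decoding.
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = []
  for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
    urls.push(match[1])
  }
  return urls
}

// Crude HTML-to-text conversion: drop <script>/<style> blocks, strip the
// remaining tags, and collapse whitespace. Not a substitute for a parser.
function htmlToText(html: string): string {
  return html
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
}

// Fetch the sitemap, then fetch each listed page and keep its plain text.
// Pages are fetched one at a time to stay gentle on the server.
async function scrapeSite(sitemapUrl: string): Promise<Map<string, string>> {
  const xml = await fetch(sitemapUrl).then(res => res.text())
  const pages = new Map<string, string>()
  for (const url of extractSitemapUrls(xml)) {
    const html = await fetch(url).then(res => res.text())
    pages.set(url, htmlToText(html))
  }
  return pages
}
```

Calling `scrapeSite` with the URL of the generated sitemap.xml would give a URL-to-text map that can be handed to whatever indexing pipeline is in use. Content rendered only on the client would still need a headless browser.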

Is there something specific that you need help with?

brc-dd added the "need more info" (Further information is requested) label on Feb 15, 2025
@okineadev
Contributor

Related issue: vitejs/vite#19400

@Barbapapazes
Contributor

Barbapapazes commented Mar 13, 2025

Hey 👋,

You can easily achieve this in VitePress with the createContentLoader helper.

Here's the code in .vitepress/generators/llms.ts that I use for my website, soubiran.dev:

import type { ContentData, SiteConfig } from 'vitepress'
import { writeFileSync } from 'node:fs'
import path from 'node:path'
import { joinURL, withoutTrailingSlash } from 'ufo'
import { createContentLoader } from 'vitepress'
import { URL } from '../constants'

export async function genLlms(config: SiteConfig) {
  let llms = '# Estéban Soubiran'

  llms += `\n\n`

  llms += await createSection({
    title: 'English Content',
    postsTitle: 'Posts',
    seriesTitle: 'Series',
  }).then(content => content.trim())

  llms += `\n\n`

  llms += await createSection({
    title: 'Contenu en Français',
    postsTitle: 'Articles',
    seriesTitle: 'Séries',
    prefix: 'fr',
  }).then(content => content.trim())

  writeFileSync(path.join(config.outDir, 'llms.txt'), llms.trim())
}

interface CreateSectionOptions {
  title: string
  postsTitle: string
  seriesTitle: string
  prefix?: string
}
async function createSection(options: CreateSectionOptions) {
  let content = `## ${options.title}\n\n`

  // Load the posts for this locale, highest frontmatter id first.
  const posts = await createContentLoader(joinURL(options.prefix ?? '', 'posts/**/*.md'))
    .load()
    .then(items => items.sort((a, b) => b.frontmatter.id - a.frontmatter.id))

  content += `### ${options.postsTitle}\n\n`
  content += `${createList(posts).trim()}\n\n`

  // Load the series index pages for this locale, highest frontmatter id first.
  const series = await createContentLoader(joinURL(options.prefix ?? '', 'series/*.md'))
    .load()
    .then(items => items.sort((a, b) => b.frontmatter.id - a.frontmatter.id))

  content += `### ${options.seriesTitle}\n\n`
  for (const seriesItem of series) {
    content += `${createListItem(seriesItem)}\n`

    const seriesPosts = await createContentLoader(`${seriesItem.url}/*.md`).load().then(posts => posts.sort((a, b) => b.frontmatter.id - a.frontmatter.id))

    if (seriesPosts.length === 0) {
      continue
    }

    content += createList(seriesPosts, '  ')
  }

  return content
}

function createList(items: ContentData[], prefix = '') {
  let list = ''
  for (const item of items) {
    list += `${prefix}${createListItem(item)}\n`
  }
  return list
}

function createListItem(item: ContentData) {
  return `- [${item.frontmatter.title}](${withoutTrailingSlash(joinURL(URL, item.url))})`
}

Then you can call this function from .vitepress/config.ts:

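// SiteConfig is needed for the buildEnd parameter annotation below.
import type { SiteConfig } from 'vitepress'
import { defineConfig } from 'vitepress'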
import { genLlms } from './generators/llms.js'

export default defineConfig({
  buildEnd: async (config: SiteConfig) => {
    await genLlms(config)
  },
})
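For a sense of what the generator above writes to llms.txt, here is a dependency-free sketch of the list-building step. The `Page` shape, the `renderItem` and `renderSection` names, and the example.com base URL are illustrative assumptions. The original code relies on `ufo` helpers and VitePress's `ContentData` instead.

```typescript
// Hypothetical minimal page record; the original uses VitePress's ContentData.
interface Page {
  title: string
  url: string
}

// Render one Markdown bullet linking to the page.
// Joins base and path with a single slash and trims any trailing slash,
// approximating joinURL + withoutTrailingSlash from ufo.
function renderItem(base: string, page: Page): string {
  const joined = `${base.replace(/\/+$/, '')}/${page.url.replace(/^\/+/, '')}`
  return `- [${page.title}](${joined.replace(/\/+$/, '')})`
}

// Render a titled section followed by one bullet per page.
// `indent` lets nested lists (posts under a series) be indented.
function renderSection(title: string, base: string, pages: Page[], indent = ''): string {
  const lines = pages.map(page => `${indent}${renderItem(base, page)}`)
  return [`## ${title}`, '', ...lines].join('\n')
}

const sample = renderSection('English Content', 'https://example.com/', [
  { title: 'Hello', url: '/posts/hello/' },
  { title: 'Second post', url: '/posts/second' },
])
console.log(sample)
```

The result is a plain Markdown outline of headings and links, which is the shape llms.txt consumers generally expect.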

@okineadev
Contributor

Hi 🙌

I created a prototype of a VitePress plugin that generates documentation for LLMs: https://github.com/okineadev/vitepress-plugin-llms. It integrates easily with VitePress and lets you make LLM-friendly documentation available on any site built with it.

I'm a little sick right now, so I'll finish it later. PRs are welcome!
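For readers who want to try the plugin, registration looks roughly like the following. This is a sketch based on the plugin's README as best recalled. The default export name `llmstxt` and the zero-argument call are assumptions that should be checked against the repository, since the project was still a prototype at this point in the thread.

```typescript
// .vitepress/config.ts
import { defineConfig } from 'vitepress'
// Assumption: the plugin exposes a default export that returns a Vite plugin.
import llmstxt from 'vitepress-plugin-llms'

export default defineConfig({
  vite: {
    // VitePress forwards this block to Vite, so the plugin runs during the build.
    plugins: [llmstxt()],
  },
})
```

Because it hooks in as a Vite plugin, no separate buildEnd hook is needed for this route.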

@brc-dd
Member

brc-dd commented Apr 11, 2025

For now, vitepress-plugin-llms looks like a good enough solution. I'm keeping #4590 open, though, to discuss whether llms.txt generation should be included in core. Please track that issue instead.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 19, 2025
4 participants