Add automatic website rebuilding to this repository #601

Open: tekktrik wants to merge 4 commits into base: lektor

@tekktrik tekktrik commented Mar 6, 2025

PR Checklist:

  • [ x ] All new features have been tested
  • All new features have been documented
  • [ x ] I have read the CONTRIBUTING.md file
  • [ x ] I will abide by the code of conduct

Partially adds functionality where opening, editing, labeling, closing, or deleting an issue in a given repository triggers a rebuild of the website. It works by:

  • Adding a repository dispatch trigger to this repository's publish workflow, which allows it to be triggered by other repositories. The concurrency setting is used to make sure that only one instance is running at any time, and that at most one workflow is pending completion of the current one. This ensures the website keeps updating even under heavy load from multi-issue updates. If another dispatch comes in while one is pending, the older pending run is cancelled and the new triggering workflow takes its place.
  • Any sending repository must include the new trigger_republish.yml, which is the workflow that those issue events trigger. It sends the repository dispatch event to beeware.github.io. I have added the one this repository would use, but adding this functionality to repositories such as toga and briefcase will be necessary eventually. A secret named REBUILD_TOKEN must be added to the source repository with the repo or public_repo scope.
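
For illustration, here is a minimal Python sketch of the request that a sending repository would make. It uses GitHub's REST endpoint for creating a repository dispatch event. The owner name `beeware` and the event type `rebuild-website` are assumptions made for this example and are not taken from the PR; the token would be the value of the REBUILD_TOKEN secret. The function only builds the request and does not send it.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_dispatch_request(owner, repo, token, event_type):
    """Build (but do not send) a repository dispatch request.

    `event_type` is a free-form name chosen by the sender; the receiving
    workflow decides which event types it reacts to.
    """
    url = f"{GITHUB_API}/repos/{owner}/{repo}/dispatches"
    body = json.dumps({"event_type": event_type}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values: the owner and event type are assumptions for this sketch.
request = build_dispatch_request(
    "beeware", "beeware.github.io", "<REBUILD_TOKEN value>", "rebuild-website"
)
# Sending it would be: urllib.request.urlopen(request)
```

In the PR itself this step lives in a workflow file rather than a script, so treat the sketch as a description of the HTTP call, not of trigger_republish.yml.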

This is difficult to test until it's implemented, but I use this exact method myself: a repository that stores software code triggers a pull request in the overall project repository, where it is a submodule, to update it.

This is the first step to creating an auto-updating issues page for the website!

@tekktrik tekktrik changed the title Dev/add website auto rebuild Add automatic website rebuilding to this repository Mar 6, 2025
Member

@freakboy3742 freakboy3742 left a comment

I think I understand the approach you're taking here... but if I'm understanding it correctly, I have some major reservations.

As I understand it, you're proposing that any new, edited or closed issue on a BeeWare repository would potentially trigger a republish of the entire BeeWare website. We would be able to control which repositories installed this trigger; but presumably the trigger would be installed on all the high profile project repos at the very least.

That strikes me as an approach that will lead to a lot of unnecessary complete website rebuilds. GitHub Actions CPU time isn't an unlimited resource - so we need to be careful about triggering unnecessary work.

If this is being done in service of maintaining a list of active "good first issue" issues - are there any alternative approaches worth exploring? Off the top of my head:

  • Only triggering on a subset of actions - most notably, adding/removing a label, and closing a ticket? Just opening an issue doesn't mean there's a new first-timer issue.
  • Only triggering a subset of the build - if we only need to update a single page - can we just update that one page?
  • Using an approach other than a website rebuild - for example, using a JS query to retrieve a live list of issues matching a query?

@tekktrik
Author

Sorry for the delayed response, I've been traveling.

This PR was created with the intention of keeping an updated list of all issues, but things change if the intention is for it to only track "good first issues". I agree that in that case there's no need to burn more GitHub Actions runtime than necessary. In response to your suggestions:

  • Only triggering on a subset of actions - This is for sure something that would need to be changed, and I think might be enough to address the runtime concern. I think then that labeled, unlabeled, closed, and deleted is likely enough. Adding reopened and transferred is an additional measure for making sure the list is updated, but likely not required or as common. edited would make sure the title is updated, but the link the page will use wouldn't change so the URL wouldn't be broken at least. opened is overkill in this case.
  • Only triggering a subset of the build - I am unfamiliar with lektor and how the build and deployment process works. I'd be happy to look into if partial builds are possible and using that. Using the full build was definitely just the simplest option, so this is definitely an option if there's a way.
  • Using an approach other than a website rebuild - I don't think a JS query is the best solution, as the rate limit for unauthenticated GitHub API requests is 60 requests/hr, which isn't nothing, but someone continually refreshing the browser might cause them to get rate limited pretty quickly. You could maybe cache the results, but then it might not be fully updated during a sprint (unless you use JS to rate limit calls to once every 1 or 2 minutes). This is why I relied on making sure the updates were server-side.
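
To make the first point concrete, here is a small Python sketch of the filter described above: rebuild only for labeled, unlabeled, closed, and deleted, with reopened and transferred as an opt-in extra. The function and constant names are made up for this example. In the actual workflow the same restriction would more likely be expressed in the trigger configuration than in code.

```python
# Issue actions that can change which issues belong on the list.
REQUIRED_ACTIONS = frozenset({"labeled", "unlabeled", "closed", "deleted"})

# Extra actions mentioned above as a belt-and-braces measure.
OPTIONAL_ACTIONS = frozenset({"reopened", "transferred"})


def should_trigger_rebuild(action, include_optional=False):
    """Return True if an issue event with this action should request a rebuild."""
    if include_optional:
        allowed = REQUIRED_ACTIONS | OPTIONAL_ACTIONS
    else:
        allowed = REQUIRED_ACTIONS
    return action in allowed


# "opened" and "edited" are deliberately ignored in both modes.
for action in ("opened", "edited", "labeled", "closed", "reopened"):
    print(action, should_trigger_rebuild(action))
```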

Additionally, I just want to reiterate that since public repositories aren't usage limited by GitHub Actions, this, in combination with the use of a concurrency group, makes sure only a single instance of the website build workflow is ever running at once, which prevents things like hitting the maximum number of workflows running in a repository at once, as well as any overlapping (and possibly botched?) website deployments. I am not sure if there is another constraint you were referring to when you said GitHub Actions CPU time isn't unlimited (or if there are limits if you really push it), but I just wanted to make sure I understood any other limitations.

Happy to make sure I understand and make changes accordingly! For reference, the following change should be PR'd shortly, and in its current iteration it just performs the issue querying at build time and bakes it as HTML into a new webpage, with filtering on tags other than "good first issue" happening client-side via JS.

@freakboy3742
Member

>   • Only triggering a subset of the build - I am unfamiliar with lektor and how the build and deployment process works. I'd be happy to look into if partial builds are possible and using that. Using the full build was definitely just the simplest option, so this is definitely an option if there's a way.

I'm not aware of any way to do a single-page build either. I was thinking more of a "bare metal" approach - putting something into the markup of the page so that it can be rewritten directly on the published gh-pages branch (i.e., the full pipeline writes <!-- generated content goes here --> sentinels, and publishes a version of that content; but a "tagging change" update reads the current gh-pages content, rewrites the content between those sentinels, and pushes the update to the gh-pages branch).
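
As a rough sketch of what that rewrite step could look like, here it is in Python. The paired BEGIN/END marker strings are an assumption for this example (the comment above only mentions a generic sentinel comment), and the function name is made up. Fetching the page from gh-pages and pushing the result back are left out.

```python
import re

# Hypothetical marker strings; the real ones would be whatever the full build emits.
BEGIN = "<!-- BEGIN generated content -->"
END = "<!-- END generated content -->"


def replace_between_sentinels(html, new_content, begin=BEGIN, end=END):
    """Return `html` with the text between the two sentinels replaced.

    The sentinels themselves are kept, so the same page can be rewritten
    again on the next update.
    """
    pattern = re.compile(re.escape(begin) + r".*?" + re.escape(end), re.DOTALL)
    replacement = f"{begin}\n{new_content}\n{end}"
    # A function is used as the replacement so that backslashes in
    # new_content are not interpreted as regex escapes.
    updated, count = pattern.subn(lambda _match: replacement, html, count=1)
    if count == 0:
        raise ValueError("sentinel markers not found in page")
    return updated


page = (
    "<h1>Issues</h1>\n"
    + BEGIN + "\n<li>old issue</li>\n" + END
    + "\n<footer></footer>"
)
print(replace_between_sentinels(page, "<li>new issue</li>"))
```

Because only the text between the markers changes, a step like this would not need to run the lektor build at all, which is where the time saving would come from.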

>   • Using an approach other than a website rebuild - I don't think a JS query is the best solution, as the rate limit for unauthenticated GitHub API requests is 60 requests/hr, which isn't nothing, but someone continually refreshing the browser might cause them to get rate limited pretty quickly. You could maybe cache the results, but then it might not be fully updated during a sprint (unless you use JS to rate limit calls to once every 1 or 2 minutes). This is why I relied on making sure the updates were server-side.

Understood - that's a reasonable argument for why JS won't work.

> Additionally, I just want to reiterate that since public repositories aren't usage limited by GitHub Actions, this, in combination with the use of a concurrency group, ...

Oh - sure - we use concurrency groups extensively in our CI configs. However, even with those in place: consider that in a sprint situation, we're already pushing the limit of concurrent CI tasks... and every user finishing a PR will result in a rebuild of the website. A full website rebuild takes about 5 minutes at present; we could easily end up with a situation where we have 1 actions worker tied up full time just rebuilding the website - and that means 1 worker that isn't available to run the 40+ jobs needed to do a CI pass on Toga, or the 50+ jobs needed to test a PR on Briefcase. When most of the website isn't changing, I'm wary of adding to that load just to keep a list of tickets up to date.
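
To put rough numbers on this, here is a small Python simulation of the queueing behaviour described in the PR: one build running, at most one pending, and a newer request replacing the pending one. It is a simplified model under those assumptions, not a description of how GitHub schedules jobs; the 5 minute build time comes from the figure above, and the function name is made up.

```python
def simulate_rebuilds(event_times, build_minutes=5.0):
    """Simulate rebuild requests arriving at `event_times` (in minutes).

    Returns (start times of builds that actually ran, number of cancelled
    pending requests).
    """
    runs = []
    cancelled = 0
    busy_until = float("-inf")  # when the currently running build finishes
    pending = False             # is a request waiting behind the running build?

    for t in sorted(event_times):
        # If the running build finished before this event, the pending
        # request (if any) started as soon as the worker became free.
        if pending and busy_until <= t:
            runs.append(busy_until)
            busy_until += build_minutes
            pending = False

        if busy_until <= t:
            # Nothing running: start a build immediately.
            runs.append(t)
            busy_until = t + build_minutes
        elif pending:
            # Something is already waiting: the older request is cancelled
            # and this one takes its place.
            cancelled += 1
        else:
            pending = True

    if pending:
        runs.append(busy_until)
    return runs, cancelled


# One issue event per minute for half an hour, as might happen during a sprint.
runs, cancelled = simulate_rebuilds(range(30))
print(len(runs), "builds ran;", cancelled, "pending requests were replaced")
```

In this model, 30 events one minute apart produce 7 back-to-back builds starting at minutes 0, 5, ..., 30, so the worker is busy for the whole 35 minute window even though 23 of the requests never ran.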

> Happy to make sure I understand and make changes accordingly! For reference, the following change should be PR'd shortly, and in its current iteration it just performs the issue querying at build time and bakes it as HTML into a new webpage, with filtering on tags other than "good first issue" happening client-side via JS.

I appreciate that you've put a bunch of work into this already - and first off, thank you for that work. This is definitely some sophisticated GitHub Actions-fu, and if nothing else, it's useful to have a "this is the sort of complexity we're talking about" proof of concept to start a discussion.

However, my concern at this point is that the complexity and overhead required to implement the underlying feature isn't obviously worth it. At the end of the day, we're talking about implementing a version of this query. While it might be nice to have a "cleaned up" version of that query on BeeWare's own page... will that actually improve the experience of users? Will it make it easier to drive users to relevant contributions during a sprint? If it were a simple drop-in JavaScript thing, or something that was a trivial one-page addition to the CMS... maybe; but if it's going to need a bunch of complex CI triggers across multiple projects, and require a complete website rebuild every time someone tags a PR... I'm not so sure.

If we're able to crack the "build runtime" issue so that this is a 5 second task rather than a 5 minute task, that might change the math.
