Problems crawling buildcanada.com #755

Open
mveytsman opened this issue Feb 5, 2025 · 0 comments

Comments

@mveytsman

I archived the page with ArchiveWeb.page, and it replays successfully both within the app and in ReplayWeb.page. When I crawl it with Browsertrix, however, the page does not replay successfully:

Here's a screenshot of what I expect at https://buildcanada.com/memos
Image

Here's what I see in the replay:

Image

I also experienced some text problems on a page that does load (https://buildcanada.com):

Image

You can see some of these in QA, in the extracted text difference on this crawl.

I ran this by @Shrinks99, and he ran a crawl using the beta channel. It delivered improved results, but the replay was still not as successful as with ArchiveWeb.page.

Labels
None yet
Projects
Status: Triage