Unable to load new page, browser needs restart #766

Closed
zlodejpapiru opened this issue Feb 11, 2025 · 6 comments · Fixed by #773

zlodejpapiru commented Feb 11, 2025

Hi, since version 1.5 the crawler keeps crashing. The save state is written, but no WARCs are generated. Here is my config, followed by an excerpt from the log (version 1.5.3).

workers: 2
limit: 200000
generateWACZ: true
behaviors:
  - autoscroll
  - siteSpecific
  - autoplay
  - autofetch
#behaviorTimeout: 0
#timeout: 36000
saveState: always
warcinfo:
  operator: xxxx
generateCdx: true
#pageLoadTimeout: 90
scopeType: page-spa
extraHops: 50000
#waitUntil: true
#sitemap: true
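
For anyone triaging a log like the one in this report, here is a minimal sketch that counts the warn and error entries by context and message. It assumes only the fields visible in the excerpt (`logLevel`, `context`, `message`); the helper names `iter_log_entries` and `summarize_problems` are made up for illustration and are not part of the crawler. It tolerates entries that are concatenated on one line and stops at a truncated final entry.

```python
import json
from collections import Counter


def iter_log_entries(text):
    """Yield JSON objects from text holding one or more concatenated log entries."""
    # strict=False tolerates raw line breaks inside strings (e.g. a wrapped paste).
    decoder = json.JSONDecoder(strict=False)
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        try:
            entry, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            break  # truncated or malformed tail: stop instead of guessing
        yield entry


def summarize_problems(text, levels=("warn", "error")):
    """Count entries at the given log levels, keyed by (level, context, message)."""
    counts = Counter()
    for entry in iter_log_entries(text):
        if isinstance(entry, dict) and entry.get("logLevel") in levels:
            key = (entry.get("logLevel"), entry.get("context"), entry.get("message"))
            counts[key] += 1
    return counts


# Entries copied from the log in this report; the last one is cut off on purpose.
sample = (
    '{"timestamp":"2025-02-11T10:49:29.219Z","logLevel":"warn","context":"worker",'
    '"message":"Page Worker Timeout","details":{"seconds":190,'
    '"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} '
    '{"timestamp":"2025-02-11T10:49:29.222Z","logLevel":"info","context":"pageStatus",'
    '"message":"Page Finished","details":{"loadState":2,'
    '"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} '
    '{"timestamp":"2025-02-11T10:49:57.011Z","logLevel":"warn","context":"worker",'
    '"message":"New Window Timed Out","details":{"seconds":20,"workerid":1}} '
    '{"timestamp":"2025-02-11T10:50:17.525Z","logLevel":"warn","context":"worker",'
    '"message":"New Window Timed Out","details":{"seconds":20,"workerid":1}} '
    '{"timestamp":"2025-02-11T10:50:17.526Z","logLevel":"warn","context":"worker",'
    '"message":"Error getting'
)

if __name__ == "__main__":
    for (level, context, message), n in summarize_problems(sample).most_common():
        print(f"{n:3d}  {level:5s}  {context:12s}  {message}")
```

On the full log this makes it easier to see which message starts repeating before the crawler stalls; it does not diagnose the cause.
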

{"timestamp":"2025-02-11T10:44:37.364Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://twitter.com/zenisek_m","frameId":"EAD6CFF294DEABBEC9EA03D736541A43"}} {"timestamp":"2025-02-11T10:44:37.594Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://x.com/zenisek_m","frameId":"EAD6CFF294DEABBEC9EA03D736541A43"}} {"timestamp":"2025-02-11T10:44:37.899Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://abs.twimg.com/responsive-web/client-web/vendor.c4b9145a.js","frameId":"EAD6CFF294DEABBEC9EA03D736541A43"}} {"timestamp":"2025-02-11T10:44:37.903Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://abs.twimg.com/responsive-web/client-web/i18n/en.08723eda.js","frameId":"EAD6CFF294DEABBEC9EA03D736541A43"}} {"timestamp":"2025-02-11T10:44:37.905Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://abs.twimg.com/responsive-web/client-web/main.2b2a6b7a.js","frameId":"EAD6CFF294DEABBEC9EA03D736541A43"}} {"timestamp":"2025-02-11T10:44:40.593Z","logLevel":"info","context":"general","message":"Seed page redirected, adding redirected seed","details":{"origUrl":"https://twitter.com/zcrkvenjas","newUrl":"https://x.com/zcrkvenjas","seedId":3051}} {"timestamp":"2025-02-11T10:44:41.982Z","logLevel":"info","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://x.com/zcrkvenjas"],"page":"https://twitter.com/zcrkvenjas","workerid":0}} {"timestamp":"2025-02-11T10:44:41.983Z","logLevel":"info","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://x.com/zcrkvenjas","page":"https://twitter.com/zcrkvenjas","workerid":0}} {"timestamp":"2025-02-11T10:44:42.025Z","logLevel":"info","context":"behaviorScript","message":"Behavior 
log","details":{"state":{"tweets":0,"images":0,"videos":0,"threads":1},"msg":"Capturing thread: https://x.com/zcrkvenjas","page":"https://x.com/zcrkvenjas","workerid":0}} {"timestamp":"2025-02-11T10:44:43.154Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":1,"images":0,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1877760280739958931","page":"https://x.com/ZCrkvenjas/status/1877760280739958931","workerid":0}} {"timestamp":"2025-02-11T10:44:46.890Z","logLevel":"info","context":"general","message":"Seed page redirected, adding redirected seed","details":{"origUrl":"https://twitter.com/zenisek_m","newUrl":"https://x.com/zenisek_m","seedId":3052}} {"timestamp":"2025-02-11T10:44:47.861Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":2,"images":0,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1871278889764294806","page":"https://x.com/ZCrkvenjas/status/1871278889764294806","workerid":0}} {"timestamp":"2025-02-11T10:44:48.070Z","logLevel":"info","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://x.com/zenisek_m"],"page":"https://twitter.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:44:48.070Z","logLevel":"info","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://x.com/zenisek_m","page":"https://twitter.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:44:48.111Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":0,"images":0,"videos":0,"threads":1},"msg":"Capturing thread: https://x.com/zenisek_m","page":"https://x.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:44:51.906Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":3,"images":0,"videos":0,"threads":1},"msg":"Capturing Tweet: 
https://x.com/ZCrkvenjas/status/1856696923924668452","page":"https://x.com/ZCrkvenjas/status/1856696923924668452","workerid":0}} {"timestamp":"2025-02-11T10:44:55.487Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":4,"images":0,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1844010288305373344","page":"https://x.com/ZCrkvenjas/status/1844010288305373344","workerid":0}} {"timestamp":"2025-02-11T10:44:59.374Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":5,"images":0,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1842180388984094815","page":"https://x.com/ZCrkvenjas/status/1842180388984094815","workerid":0}} {"timestamp":"2025-02-11T10:45:09.070Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":0,"images":1,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1889258330859753765/photo/1","page":"https://x.com/zenisek_m/status/1889258330859753765/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:10.674Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":1,"images":1,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1889258330859753765","page":"https://x.com/zenisek_m/status/1889258330859753765","workerid":1}} {"timestamp":"2025-02-11T10:45:14.181Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":1,"images":2,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1888944992963436970/photo/1","page":"https://x.com/zenisek_m/status/1888944992963436970/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:15.825Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":2,"images":2,"videos":0,"threads":1},"msg":"Capturing Tweet: 
https://x.com/NLIsrael/status/1888951514472853827","page":"https://x.com/NLIsrael/status/1888951514472853827","workerid":1}} {"timestamp":"2025-02-11T10:45:19.241Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":3,"images":2,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1889029112074752253","page":"https://x.com/zenisek_m/status/1889029112074752253","workerid":1}} {"timestamp":"2025-02-11T10:45:23.079Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":3,"images":3,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1888944992963436970/photo/1","page":"https://x.com/zenisek_m/status/1888944992963436970/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:24.655Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":4,"images":3,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1888944992963436970","page":"https://x.com/zenisek_m/status/1888944992963436970","workerid":1}} {"timestamp":"2025-02-11T10:45:26.049Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":5,"images":1,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/ZCrkvenjas/status/1839256302305816620/photo/1","page":"https://x.com/ZCrkvenjas/status/1839256302305816620/photo/1","workerid":0}} {"timestamp":"2025-02-11T10:45:27.983Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":6,"images":1,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1839256302305816620","page":"https://x.com/ZCrkvenjas/status/1839256302305816620","workerid":0}} {"timestamp":"2025-02-11T10:45:28.348Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":4,"images":4,"videos":0,"threads":1},"msg":"Loading Image: 
https://x.com/zenisek_m/status/1888557687320752160/photo/1","page":"https://x.com/zenisek_m/status/1888557687320752160/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:30.238Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":5,"images":4,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1888557687320752160","page":"https://x.com/zenisek_m/status/1888557687320752160","workerid":1}} {"timestamp":"2025-02-11T10:45:33.030Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":7,"images":1,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1786412081643696130","page":"https://x.com/ZCrkvenjas/status/1786412081643696130","workerid":0}} {"timestamp":"2025-02-11T10:45:37.681Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":5,"images":5,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1888533728399417411/photo/1","page":"https://x.com/zenisek_m/status/1888533728399417411/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:37.855Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":7,"images":2,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/ZCrkvenjas/status/1780603901999529985/photo/1","page":"https://x.com/ZCrkvenjas/status/1780603901999529985/photo/1","workerid":0}} {"timestamp":"2025-02-11T10:45:39.853Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":6,"images":5,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1888533728399417411","page":"https://x.com/zenisek_m/status/1888533728399417411","workerid":1}} {"timestamp":"2025-02-11T10:45:40.220Z","logLevel":"info","context":"behaviorScript","message":"Behavior 
log","details":{"state":{"tweets":8,"images":2,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1780603901999529985","page":"https://x.com/ZCrkvenjas/status/1780603901999529985","workerid":0}} {"timestamp":"2025-02-11T10:45:43.931Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":6,"images":6,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1888515168050597981/photo/1","page":"https://x.com/zenisek_m/status/1888515168050597981/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:45.035Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":9,"images":2,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1778042879711002852","page":"https://x.com/ZCrkvenjas/status/1778042879711002852","workerid":0}} {"timestamp":"2025-02-11T10:45:46.402Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":7,"images":6,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1888515168050597981","page":"https://x.com/zenisek_m/status/1888515168050597981","workerid":1}} {"timestamp":"2025-02-11T10:45:50.631Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":10,"images":2,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1755623928079282469","page":"https://x.com/ZCrkvenjas/status/1755623928079282469","workerid":0}} {"timestamp":"2025-02-11T10:45:51.166Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":7,"images":7,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1888292274775175357/photo/1","page":"https://x.com/zenisek_m/status/1888292274775175357/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:53.116Z","logLevel":"info","context":"behaviorScript","message":"Behavior 
log","details":{"state":{"tweets":8,"images":7,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1888292274775175357","page":"https://x.com/zenisek_m/status/1888292274775175357","workerid":1}} {"timestamp":"2025-02-11T10:45:55.625Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":11,"images":2,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1755623930407092291","page":"https://x.com/ZCrkvenjas/status/1755623930407092291","workerid":0}} {"timestamp":"2025-02-11T10:45:57.045Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":8,"images":8,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1888137042611102147/photo/1","page":"https://x.com/zenisek_m/status/1888137042611102147/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:45:59.385Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":9,"images":8,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1888137042611102147","page":"https://x.com/zenisek_m/status/1888137042611102147","workerid":1}} {"timestamp":"2025-02-11T10:46:01.124Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":11,"images":3,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/ZCrkvenjas/status/1755248224124686701/photo/1","page":"https://x.com/ZCrkvenjas/status/1755248224124686701/photo/1","workerid":0}} {"timestamp":"2025-02-11T10:46:04.063Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":12,"images":3,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1755248224124686701","page":"https://x.com/ZCrkvenjas/status/1755248224124686701","workerid":0}} {"timestamp":"2025-02-11T10:46:04.181Z","logLevel":"info","context":"behaviorScript","message":"Behavior 
log","details":{"state":{"tweets":9,"images":9,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/zenisek_m/status/1887913841503162407/photo/1","page":"https://x.com/zenisek_m/status/1887913841503162407/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:46:06.798Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":10,"images":9,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1887913841503162407","page":"https://x.com/zenisek_m/status/1887913841503162407","workerid":1}} {"timestamp":"2025-02-11T10:46:08.928Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":13,"images":3,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1754537151402824093","page":"https://x.com/ZCrkvenjas/status/1754537151402824093","workerid":0}} {"timestamp":"2025-02-11T10:46:11.728Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":10,"images":10,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/CzechEmbassyIL/status/1887784467495817405/photo/1","page":"https://x.com/CzechEmbassyIL/status/1887784467495817405/photo/1","workerid":1}} {"timestamp":"2025-02-11T10:46:11.984Z","logLevel":"warn","context":"behavior","message":"Behaviors timed out","details":{"seconds":90,"page":"https://twitter.com/zcrkvenjas","workerid":0}} {"timestamp":"2025-02-11T10:46:13.303Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":13,"images":4,"videos":0,"threads":1},"msg":"Loading Image: https://x.com/ZCrkvenjas/status/1704559764284260813/photo/1","page":"https://x.com/ZCrkvenjas/status/1704559764284260813/photo/1","workerid":0}} {"timestamp":"2025-02-11T10:46:13.571Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":11,"images":10,"videos":0,"threads":1},"msg":"Capturing Tweet: 
https://x.com/CzechEmbassyIL/status/1887784467495817405","page":"https://x.com/CzechEmbassyIL/status/1887784467495817405","workerid":1}} {"timestamp":"2025-02-11T10:46:15.561Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":14,"images":4,"videos":0,"threads":1},"msg":"Capturing Tweet: https://x.com/ZCrkvenjas/status/1704559764284260813","page":"https://x.com/ZCrkvenjas/status/1704559764284260813","workerid":0}} {"timestamp":"2025-02-11T10:46:17.238Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":11,"images":10,"videos":1,"threads":1},"msg":"Loading video for https://x.com/zenisek_m/status/1887855184291713198","page":"https://x.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:46:17.537Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"tweets":12,"images":10,"videos":1,"threads":1},"msg":"Capturing Tweet: https://x.com/zenisek_m/status/1887855184291713198","page":"https://x.com/zenisek_m/status/1887855184291713198","workerid":1}} {"timestamp":"2025-02-11T10:46:18.070Z","logLevel":"warn","context":"behavior","message":"Behaviors timed out","details":{"seconds":90,"page":"https://twitter.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:46:18.567Z","logLevel":"info","context":"pageStatus","message":"Page Finished","details":{"loadState":3,"page":"https://twitter.com/zcrkvenjas","workerid":0}} {"timestamp":"2025-02-11T10:46:18.887Z","logLevel":"warn","context":"behavior","message":"Behavior run partially failed","details":{"reason":{"type":"exception","message":"Protocol error (Runtime.evaluate): Target closed","stack":"TargetCloseError: Protocol error (Runtime.evaluate): Target closed\n at CallbackRegistry.clear (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/common/CallbackRegistry.js:77:36)\n at CdpCDPSession._onClosed 
(file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/CDPSession.js:106:25)\n at Connection.onMessage (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Connection.js:130:25)\n at WebSocket.<anonymous> (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/node/NodeWebSocketTransport.js:38:32)\n at callListener (/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:290:14)\n at WebSocket.onMessage (/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:209:9)\n at WebSocket.emit (node:events:518:28)\n at Receiver.receiverOnMessage (/app/node_modules/puppeteer-core/node_modules/ws/lib/websocket.js:1220:20)\n at Receiver.emit (node:events:518:28)\n at Immediate.<anonymous> (/app/node_modules/puppeteer-core/node_modules/ws/lib/receiver.js:601:16)"},"page":"https://twitter.com/zcrkvenjas","workerid":0}} {"timestamp":"2025-02-11T10:46:18.888Z","logLevel":"info","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://twitter.com/zcrkvenjas","workerid":0}} {"timestamp":"2025-02-11T10:46:19.089Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://api.x.com/live_pipeline/events?topic=%2Ftweet_engagement%2F1877760280739958931","frameId":"C5A03C1AF917E746FF41794DA3D77588"}} {"timestamp":"2025-02-11T10:46:19.198Z","logLevel":"info","context":"worker","message":"Starting page","details":{"workerid":0,"page":"https://www.facebook.com/AdamecIvan/"}} {"timestamp":"2025-02-11T10:46:19.198Z","logLevel":"info","context":"crawlStatus","message":"Crawl 
statistics","details":{"crawled":1715,"total":31163,"pending":2,"failed":0,"limit":{"max":200000,"hit":false},"pendingPages":["{\"seedId\":1758,\"started\":\"2025-02-11T10:46:18.781Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/AdamecIvan\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}","{\"seedId\":1756,\"started\":\"2025-02-11T10:44:36.075Z\",\"extraHops\":0,\"url\":\"https:\\/\\/twitter.com\\/zenisek_m\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}"]}} {"timestamp":"2025-02-11T10:46:19.535Z","logLevel":"info","context":"general","message":"Awaiting page load","details":{"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:46:20.107Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://www.facebook.com/AdamecIvan/","frameId":"8C779155D4F94C00814AE60789EA831A"}} {"timestamp":"2025-02-11T10:46:20.552Z","logLevel":"info","context":"pageStatus","message":"Page Finished","details":{"loadState":3,"page":"https://twitter.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:46:20.741Z","logLevel":"warn","context":"behavior","message":"Behavior run partially failed","details":{"reason":{"type":"exception","message":"Protocol error (Runtime.evaluate): Target closed","stack":"TargetCloseError: Protocol error (Runtime.evaluate): Target closed\n at CallbackRegistry.clear (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/common/CallbackRegistry.js:77:36)\n at CdpCDPSession._onClosed (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/CDPSession.js:106:25)\n at Connection.onMessage (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Connection.js:130:25)\n at WebSocket.<anonymous> (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/node/NodeWebSocketTransport.js:38:32)\n at callListener (/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:290:14)\n at WebSocket.onMessage 
(/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:209:9)\n at WebSocket.emit (node:events:518:28)\n at Receiver.receiverOnMessage (/app/node_modules/puppeteer-core/node_modules/ws/lib/websocket.js:1220:20)\n at Receiver.emit (node:events:518:28)\n at Immediate.<anonymous> (/app/node_modules/puppeteer-core/node_modules/ws/lib/receiver.js:601:16)"},"page":"https://twitter.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:46:20.746Z","logLevel":"info","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://twitter.com/zenisek_m","workerid":1}} {"timestamp":"2025-02-11T10:46:20.876Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://api.x.com/live_pipeline/events?topic=%2Ftweet_engagement%2F1889258330859753765","frameId":"EAD6CFF294DEABBEC9EA03D736541A43"}} {"timestamp":"2025-02-11T10:46:20.958Z","logLevel":"info","context":"worker","message":"Starting page","details":{"workerid":1,"page":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/"}} {"timestamp":"2025-02-11T10:46:20.959Z","logLevel":"info","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":1716,"total":31163,"pending":2,"failed":0,"limit":{"max":200000,"hit":false},"pendingPages":["{\"seedId\":1758,\"started\":\"2025-02-11T10:46:18.781Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/AdamecIvan\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}","{\"seedId\":1760,\"started\":\"2025-02-11T10:46:20.678Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/Andrea-Babi%C5%A1ov%C3%A1-464071084360774\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}"]}} {"timestamp":"2025-02-11T10:46:21.171Z","logLevel":"info","context":"general","message":"Awaiting page load","details":{"page":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","workerid":1}} 
{"timestamp":"2025-02-11T10:46:21.348Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","frameId":"DE370944A7CA8D9085202498621544DC"}} {"timestamp":"2025-02-11T10:46:21.621Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://www.facebook.com/andrea.babisova.ano","frameId":"DE370944A7CA8D9085202498621544DC"}} {"timestamp":"2025-02-11T10:46:25.859Z","logLevel":"info","context":"general","message":"Seed page redirected, adding redirected seed","details":{"origUrl":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","newUrl":"https://www.facebook.com/andrea.babisova.ano","seedId":3053}} {"timestamp":"2025-02-11T10:46:27.588Z","logLevel":"info","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://www.facebook.com/andrea.babisova.ano"],"page":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","workerid":1}} {"timestamp":"2025-02-11T10:46:27.600Z","logLevel":"info","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://www.facebook.com/andrea.babisova.ano","page":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","workerid":1}} {"timestamp":"2025-02-11T10:46:28.247Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"segments":1},"msg":"Skipping autoscroll, page seems to not be responsive to scrolling events","page":"https://www.facebook.com/andrea.babisova.ano","workerid":1}} {"timestamp":"2025-02-11T10:46:28.248Z","logLevel":"info","context":"behaviorScript","message":"Behavior log","details":{"state":{"segments":1},"msg":"done!","page":"https://www.facebook.com/andrea.babisova.ano","workerid":1}} {"timestamp":"2025-02-11T10:46:28.248Z","logLevel":"info","context":"behavior","message":"Run Script 
Finished","details":{"frameUrl":"https://www.facebook.com/andrea.babisova.ano","page":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","workerid":1}} {"timestamp":"2025-02-11T10:46:28.249Z","logLevel":"info","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","workerid":1}} {"timestamp":"2025-02-11T10:46:28.740Z","logLevel":"warn","context":"recorder","message":"Skipping URL from unknown frame","details":{"url":"https://www.fbsbx.com/maw_proxy_page/?__cci=FQARERESFQIZ9X4CGiQuQEZISkxOUFJYXF5gYmRqbHR4eoIBhAGGAYgBlAGcAZ4BoAGkAaoBuAHOAd4B4AHiAeoB7AHuAfAB9AH%2BAYAChgKWApoCoAKwAgQGCgwOEBIWGBweICImKCosMDI2ODo8qAKyAkJEZm5wdr4CfI4BkAGSAZYBmAGaAaIBpgK6AqgBrAGuAbABsgG0AboBvgHAAcIBxgHIAcoBzAHQAdQB2AHkAegB%2BAH6AfwBigKMAo4CkAKYAqICVFZygAGKAYwBGBB3d3cuZmFjZWJvb2suY29tAA%3D%3D.ARZqRS2RzhkyeGtkoVxgSNX_laCuUFg1OGZYBbhwfHlNjmi5","frameId":"2F2CDB1BDB11D87B26D60CFF0643998A"}} {"timestamp":"2025-02-11T10:46:30.405Z","logLevel":"info","context":"pageStatus","message":"Page Finished","details":{"loadState":4,"page":"https://www.facebook.com/Andrea-Babi%C5%A1ov%C3%A1-464071084360774/","workerid":1}} {"timestamp":"2025-02-11T10:46:30.661Z","logLevel":"info","context":"worker","message":"Starting page","details":{"workerid":1,"page":"https://www.facebook.com/AndrejBabis/"}} {"timestamp":"2025-02-11T10:46:30.662Z","logLevel":"info","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":1717,"total":31191,"pending":2,"failed":0,"limit":{"max":200000,"hit":false},"pendingPages":["{\"seedId\":1758,\"started\":\"2025-02-11T10:46:18.781Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/AdamecIvan\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}","{\"seedId\":1761,\"started\":\"2025-02-11T10:46:30.660Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/AndrejBabis\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}"]}} 
{"timestamp":"2025-02-11T10:46:30.857Z","logLevel":"info","context":"general","message":"Awaiting page load","details":{"page":"https://www.facebook.com/AndrejBabis/","workerid":1}} {"timestamp":"2025-02-11T10:46:32.968Z","logLevel":"error","context":"general","message":"Custom page load check timed out","details":{"seconds":5,"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:46:35.330Z","logLevel":"info","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://www.facebook.com/AndrejBabis/"],"page":"https://www.facebook.com/AndrejBabis/","workerid":1}} {"timestamp":"2025-02-11T10:46:35.330Z","logLevel":"info","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://www.facebook.com/AndrejBabis/","page":"https://www.facebook.com/AndrejBabis/","workerid":1}} {"timestamp":"2025-02-11T10:46:37.970Z","logLevel":"error","context":"general","message":"Link extraction timed out","details":{"seconds":5,"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:46:42.974Z","logLevel":"error","context":"general","message":"Timed out getting page title, something is likely wrong","details":{"seconds":5,"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:48:05.356Z","logLevel":"warn","context":"behavior","message":"Behaviors timed out","details":{"seconds":90,"page":"https://www.facebook.com/AndrejBabis/","workerid":1}} {"timestamp":"2025-02-11T10:48:06.361Z","logLevel":"info","context":"pageStatus","message":"Page Finished","details":{"loadState":3,"page":"https://www.facebook.com/AndrejBabis/","workerid":1}} {"timestamp":"2025-02-11T10:48:06.526Z","logLevel":"info","context":"worker","message":"Starting page","details":{"workerid":1,"page":"https://www.facebook.com/Belor.Roman.STAN/"}} {"timestamp":"2025-02-11T10:48:06.526Z","logLevel":"info","context":"crawlStatus","message":"Crawl 
statistics","details":{"crawled":1718,"total":31201,"pending":2,"failed":0,"limit":{"max":200000,"hit":false},"pendingPages":["{\"seedId\":1758,\"started\":\"2025-02-11T10:46:18.781Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/AdamecIvan\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}","{\"seedId\":1757,\"started\":\"2025-02-11T10:48:06.525Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/Belor.Roman.STAN\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}"]}} {"timestamp":"2025-02-11T10:48:06.746Z","logLevel":"info","context":"general","message":"Awaiting page load","details":{"page":"https://www.facebook.com/Belor.Roman.STAN/","workerid":1}} {"timestamp":"2025-02-11T10:49:29.219Z","logLevel":"warn","context":"worker","message":"Page Worker Timeout","details":{"seconds":190,"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:49:29.222Z","logLevel":"info","context":"pageStatus","message":"Page Finished","details":{"loadState":2,"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:49:29.820Z","logLevel":"info","context":"general","message":"Saving crawl state to: /crawls/collections/2502_month_social/crawls/crawl-20250211104929-b7828328f8b4.yaml","details":{}} {"timestamp":"2025-02-11T10:49:29.983Z","logLevel":"info","context":"general","message":"Removing old save-state: /crawls/collections/2502_month_social/crawls/crawl-20250211102154-b7828328f8b4.yaml","details":{}} {"timestamp":"2025-02-11T10:49:30.085Z","logLevel":"info","context":"worker","message":"Starting page","details":{"workerid":0,"page":"https://www.facebook.com/adamkovavera/"}} {"timestamp":"2025-02-11T10:49:30.086Z","logLevel":"info","context":"crawlStatus","message":"Crawl 
statistics","details":{"crawled":1719,"total":31201,"pending":2,"failed":0,"limit":{"max":200000,"hit":false},"pendingPages":["{\"seedId\":1759,\"started\":\"2025-02-11T10:49:30.085Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/adamkovavera\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}","{\"seedId\":1757,\"started\":\"2025-02-11T10:48:06.525Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/Belor.Roman.STAN\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}"]}} {"timestamp":"2025-02-11T10:49:30.304Z","logLevel":"info","context":"general","message":"Awaiting page load","details":{"page":"https://www.facebook.com/adamkovavera/","workerid":0}} {"timestamp":"2025-02-11T10:49:36.752Z","logLevel":"warn","context":"pageStatus","message":"Page Load Failed: will retry","details":{"retry":0,"retries":2,"msg":"Navigation timeout of 90000 ms exceeded","url":"https://www.facebook.com/Belor.Roman.STAN/","loadState":0,"page":"https://www.facebook.com/Belor.Roman.STAN/","workerid":1}} {"timestamp":"2025-02-11T10:49:36.766Z","logLevel":"warn","context":"behavior","message":"Behavior run partially failed","details":{"reason":{"type":"exception","message":"Protocol error (Runtime.evaluate): Target closed","stack":"TargetCloseError: Protocol error (Runtime.evaluate): Target closed\n at CallbackRegistry.clear (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/common/CallbackRegistry.js:77:36)\n at CdpCDPSession._onClosed (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/CDPSession.js:106:25)\n at Connection.onMessage (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Connection.js:130:25)\n at WebSocket.<anonymous> (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/node/NodeWebSocketTransport.js:38:32)\n at callListener (/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:290:14)\n at WebSocket.onMessage (/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:209:9)\n at 
WebSocket.emit (node:events:518:28)\n at Receiver.receiverOnMessage (/app/node_modules/puppeteer-core/node_modules/ws/lib/websocket.js:1220:20)\n at Receiver.emit (node:events:518:28)\n at Immediate.<anonymous> (/app/node_modules/puppeteer-core/node_modules/ws/lib/receiver.js:601:16)"},"page":"https://www.facebook.com/AndrejBabis/","workerid":1}} {"timestamp":"2025-02-11T10:49:36.766Z","logLevel":"info","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://www.facebook.com/AndrejBabis/","workerid":1}} {"timestamp":"2025-02-11T10:49:57.011Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":1}} {"timestamp":"2025-02-11T10:49:57.012Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":1,"type":"exception","message":"timed out","stack":"Error: timed out\n at PageWorker.initPage (file:///app/dist/util/worker.js:95:27)\n at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 1)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1"}} {"timestamp":"2025-02-11T10:49:57.513Z","logLevel":"warn","context":"worker","message":"Retrying getting new page","details":{"page":"https://www.facebook.com/Belor.Roman.STAN/","workerid":1}} {"timestamp":"2025-02-11T10:50:17.525Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":1}} {"timestamp":"2025-02-11T10:50:17.526Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":1,"type":"exception","message":"timed out","stack":"Error: timed out\n at PageWorker.initPage (file:///app/dist/util/worker.js:95:27)\n at async 
PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 1)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1"}} {"timestamp":"2025-02-11T10:50:18.027Z","logLevel":"warn","context":"worker","message":"Retrying getting new page","details":{"page":"https://www.facebook.com/Belor.Roman.STAN/","workerid":1}} {"timestamp":"2025-02-11T10:50:38.029Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":1}} {"timestamp":"2025-02-11T10:50:38.030Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":1,"type":"exception","message":"timed out","stack":"Error: timed out\n at PageWorker.initPage (file:///app/dist/util/worker.js:95:27)\n at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 1)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1"}} {"timestamp":"2025-02-11T10:50:38.531Z","logLevel":"warn","context":"worker","message":"Retrying getting new page","details":{"page":"https://www.facebook.com/Belor.Roman.STAN/","workerid":1}} {"timestamp":"2025-02-11T10:50:58.553Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":1}} {"timestamp":"2025-02-11T10:50:58.553Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":1,"type":"exception","message":"timed out","stack":"Error: timed out\n at PageWorker.initPage 
(file:///app/dist/util/worker.js:95:27)\n at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 1)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1"}} {"timestamp":"2025-02-11T10:50:59.054Z","logLevel":"warn","context":"worker","message":"Retrying getting new page","details":{"page":"https://www.facebook.com/Belor.Roman.STAN/","workerid":1}} {"timestamp":"2025-02-11T10:51:00.307Z","logLevel":"warn","context":"pageStatus","message":"Page Load Failed: will retry","details":{"retry":0,"retries":2,"msg":"Navigation timeout of 90000 ms exceeded","url":"https://www.facebook.com/adamkovavera/","loadState":0,"page":"https://www.facebook.com/adamkovavera/","workerid":0}} {"timestamp":"2025-02-11T10:51:00.325Z","logLevel":"warn","context":"general","message":"Link Extraction failed in frame","details":{"page":"https://www.facebook.com/AdamecIvan/","workerid":0,"type":"exception","message":"Protocol error (Runtime.callFunctionOn): Target closed","stack":"TargetCloseError: Protocol error (Runtime.callFunctionOn): Target closed\n at CallbackRegistry.clear (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/common/CallbackRegistry.js:77:36)\n at CdpCDPSession._onClosed (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/CDPSession.js:106:25)\n at Connection.onMessage (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Connection.js:130:25)\n at WebSocket.<anonymous> (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/node/NodeWebSocketTransport.js:38:32)\n at callListener (/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:290:14)\n at WebSocket.onMessage (/app/node_modules/puppeteer-core/node_modules/ws/lib/event-target.js:209:9)\n at 
WebSocket.emit (node:events:518:28)\n at Receiver.receiverOnMessage (/app/node_modules/puppeteer-core/node_modules/ws/lib/websocket.js:1220:20)\n at Receiver.emit (node:events:518:28)\n at Immediate.<anonymous> (/app/node_modules/puppeteer-core/node_modules/ws/lib/receiver.js:601:16)"}} {"timestamp":"2025-02-11T10:51:00.332Z","logLevel":"info","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://www.facebook.com/AdamecIvan/"],"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:51:00.332Z","logLevel":"info","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://www.facebook.com/AdamecIvan/","page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:51:00.336Z","logLevel":"warn","context":"behavior","message":"Behavior run partially failed","details":{"reason":{"type":"exception","message":"Protocol error (Runtime.evaluate): Session closed. Most likely the page has been closed.","stack":"TargetCloseError: Protocol error (Runtime.evaluate): Session closed. 
Most likely the page has been closed.\n at CdpCDPSession.send (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/CDPSession.js:64:35)\n at Browser.evaluateWithCLI (file:///app/dist/util/browser.js:210:56)\n at file:///app/dist/crawler.js:820:89\n at Array.map (<anonymous>)\n at Crawler.runBehaviors (file:///app/dist/crawler.js:820:61)\n at Crawler.doPostLoadActions (file:///app/dist/crawler.js:734:49)\n at async Crawler.crawlPage (file:///app/dist/crawler.js:688:9)\n at async PageWorker.crawlPage (file:///app/dist/util/worker.js:159:21)"},"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:51:00.336Z","logLevel":"info","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://www.facebook.com/AdamecIvan/","workerid":0}} {"timestamp":"2025-02-11T10:51:19.068Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":1}} {"timestamp":"2025-02-11T10:51:19.069Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":1,"type":"exception","message":"timed out","stack":"Error: timed out\n at PageWorker.initPage (file:///app/dist/util/worker.js:95:27)\n at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 1)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1"}} {"timestamp":"2025-02-11T10:51:19.069Z","logLevel":"error","context":"worker","message":"Worker error, exiting","details":{"type":"exception","message":"Unable to load new page, browser needs restart","stack":"Error: Unable to load new page, browser needs restart\n at PageWorker.initPage (file:///app/dist/util/worker.js:147:27)\n at async PageWorker.runLoop 
(file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 1)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1","workerid":1}} {"timestamp":"2025-02-11T10:51:20.427Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":0}} {"timestamp":"2025-02-11T10:51:20.427Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":0,"type":"exception","message":"timed out","stack":"Error: timed out\n at PageWorker.initPage (file:///app/dist/util/worker.js:95:27)\n at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 0)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1"}} {"timestamp":"2025-02-11T10:51:20.928Z","logLevel":"warn","context":"worker","message":"Retrying getting new page","details":{"page":"https://www.facebook.com/adamkovavera/","workerid":0}} {"timestamp":"2025-02-11T10:51:20.929Z","logLevel":"error","context":"worker","message":"Worker error, exiting","details":{"type":"exception","message":"no page available, shouldn't get here","stack":"Error: no page available, shouldn't get here\n at PageWorker.initPage (file:///app/dist/util/worker.js:156:15)\n at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n at async Promise.allSettled (index 0)\n at async runWorkers (file:///app/dist/util/worker.js:278:5)\n at async Crawler.crawl 
(file:///app/dist/crawler.js:1053:9)\n at async Crawler.run (file:///app/dist/crawler.js:359:13)\n at async file:///app/dist/main.js:58:1","workerid":0}} {"timestamp":"2025-02-11T10:51:21.608Z","logLevel":"info","context":"general","message":"Saving crawl state to: /crawls/collections/2502_month_social/crawls/crawl-20250211105120-b7828328f8b4.yaml","details":{}} {"timestamp":"2025-02-11T10:51:21.773Z","logLevel":"info","context":"general","message":"Removing old save-state: /crawls/collections/2502_month_social/crawls/crawl-20250211102758-b7828328f8b4.yaml","details":{}} {"timestamp":"2025-02-11T10:51:21.795Z","logLevel":"info","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":1719,"total":31201,"pending":2,"failed":0,"limit":{"max":200000,"hit":false},"pendingPages":["{\"seedId\":1763,\"started\":\"2025-02-11T10:51:00.423Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/bacikovajana\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}","{\"seedId\":1762,\"started\":\"2025-02-11T10:49:36.991Z\",\"extraHops\":0,\"url\":\"https:\\/\\/www.facebook.com\\/babkaondrej\\/\",\"added\":\"2025-02-03T16:28:15.472Z\",\"depth\":0}"]}} {"timestamp":"2025-02-11T10:51:21.796Z","logLevel":"info","context":"general","message":"Crawling done","details":{}} {"timestamp":"2025-02-11T10:51:21.796Z","logLevel":"info","context":"general","message":"Merging CDX","details":{}} {"timestamp":"2025-02-11T10:51:39.399Z","logLevel":"info","context":"general","message":"Exiting, Crawl status: interrupted","details":{}}

@ikreymer
Member

Are you sure there are no WARCs generated? The crawler is interrupted, but should still have WARC files on disk (but not WACZ since it hasn't finished yet).

Yes, with the latest version of the browser we've been seeing this error more often, and it results in having to restart the crawler. However, the data should still be there in the ./crawls/collections//archives directory, and you should be able to restart.

We're looking into why it's happening; perhaps something changed in the browser.
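
To decide when a restart is needed, the crawler log can be scanned for the fatal worker error shown in the excerpt above. The following is only a minimal sketch, assuming the log holds one JSON object per line and that the fatal text appears under `details.message` as it does in that excerpt; the function name `needs_restart` is hypothetical and not part of the crawler.

```python
import json

# Fatal text as it appears in the log excerpt, under details.message.
FATAL_MESSAGE = "Unable to load new page, browser needs restart"


def needs_restart(log_lines):
    """Return True if any JSON log line reports the fatal worker error."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip truncated or non-JSON lines instead of failing.
            continue
        if not isinstance(entry, dict):
            continue
        details = entry.get("details")
        if not isinstance(details, dict):
            continue
        if entry.get("logLevel") == "error" and details.get("message") == FATAL_MESSAGE:
            return True
    return False


# Two shortened entries modeled on the log above (stack traces omitted).
sample = [
    '{"timestamp":"2025-02-11T10:51:19.068Z","logLevel":"warn","context":"worker",'
    '"message":"New Window Timed Out","details":{"seconds":20,"workerid":1}}',
    '{"timestamp":"2025-02-11T10:51:19.069Z","logLevel":"error","context":"worker",'
    '"message":"Worker error, exiting","details":{"type":"exception",'
    '"message":"Unable to load new page, browser needs restart","workerid":1}}',
]
print(needs_restart(sample))
```

Matching on the nested message rather than on "Worker error, exiting" keeps the check specific to this failure, since other worker errors share the same top-level message.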

@ldko

ldko commented Feb 12, 2025

I am also seeing issues that seem related to the browser. WARCs are generated (but no WACZ).

I was trying to run another set of EOT crawls similar to ones I ran in early January, where I launch 1 crawl at a time (2 workers, 6 second delay on page requests) from a bash script that contains 11 docker crawl commands. Only 2 of the crawls actually produced a WACZ. I had also pulled another image from a couple of days ago, where I similarly saw the crawls end with an "Exiting, Crawl status: interrupted" message. Toward the end of the logs for these crawls I start to see messages like:

{"timestamp":"2025-02-11T22:44:28.805Z","logLevel":"warn","context":"worker","message":"Retrying getting new page","details":{"page":"https://www.acf.hhs.gov/ecd?page=1","workerid":0}}
{"timestamp":"2025-02-11T22:44:28.806Z","logLevel":"debug","context":"worker","message":"Getting page in new window","details":{"workerid":0}}
{"timestamp":"2025-02-11T22:44:29.405Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":1}}
{"timestamp":"2025-02-11T22:44:29.406Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":1,"type":"exception","message":"timed out","stack":"Error: timed out\n    at PageWorker.initPage (file:///app/dist/util/worker.js:95:27)\n    at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n    at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n    at async Promise.allSettled (index 1)\n    at async runWorkers (file:///app/dist/util/worker.js:278:5)\n    at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n    at async Crawler.run (file:///app/dist/crawler.js:359:13)\n    at async file:///app/dist/main.js:58:1"}}
{"timestamp":"2025-02-11T22:44:29.906Z","logLevel":"warn","context":"worker","message":"Retrying getting new page","details":{"page":"https://www.acf.hhs.gov/ecd?page=2","workerid":1}}
{"timestamp":"2025-02-11T22:44:29.907Z","logLevel":"debug","context":"worker","message":"Getting page in new window","details":{"workerid":1}}
{"timestamp":"2025-02-11T22:44:48.825Z","logLevel":"warn","context":"worker","message":"New Window Timed Out","details":{"seconds":20,"workerid":0}}
{"timestamp":"2025-02-11T22:44:48.826Z","logLevel":"warn","context":"worker","message":"Error getting new page","details":{"workerid":0,"type":"exception","message":"timed out","stack":"Error: timed out\n    at PageWorker.initPage (file:///app/dist/util/worker.js:95:27)\n    at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n    at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n    at async Promise.allSettled (index 0)\n    at async runWorkers (file:///app/dist/util/worker.js:278:5)\n    at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n    at async Crawler.run (file:///app/dist/crawler.js:359:13)\n    at async file:///app/dist/main.js:58:1"}}
{"timestamp":"2025-02-11T22:44:48.826Z","logLevel":"error","context":"worker","message":"Worker error, exiting","details":{"type":"exception","message":"Unable to load new page, browser needs restart","stack":"Error: Unable to load new page, browser needs restart\n    at PageWorker.initPage (file:///app/dist/util/worker.js:147:27)\n    at async PageWorker.runLoop (file:///app/dist/util/worker.js:228:30)\n    at async PageWorker.run (file:///app/dist/util/worker.js:201:13)\n    at async Promise.allSettled (index 0)\n    at async runWorkers (file:///app/dist/util/worker.js:278:5)\n    at async Crawler.crawl (file:///app/dist/crawler.js:1053:9)\n    at async Crawler.run (file:///app/dist/crawler.js:359:13)\n    at async file:///app/dist/main.js:58:1","workerid":0}}
{"timestamp":"2025-02-11T22:44:49.909Z","logLevel":"warn","conte 

@zlodejpapiru
Author

@ikreymer You are right: the WARCs are there, the WACZ is not. I tried to rerun the crawl, but it keeps crashing.

@gitreich
Contributor

Same here, but it only happens with large crawls that run longer than 24 hours and get stuck on seeds after at least 500 seeds have been crawled. I thought it depended on unavailable websites, but that does not seem to be the case: it may also happen when the page is available and responds correctly. On our system, too, WARCs are written until the crash.
I use networkidle0 and thought of changing it.
The YAML contains 20,000 seeds built from 5000 domains, each with http://, https://, http://www. and https://www.
Here is an example of the docker start command:

docker run -d --name ONB_Btrix_dc_at_test5_BULK_2_20250211013617 -e NODE_OPTIONS='--max-old-space-size=32768' -p 42555:42555 -p 33391:33391 -v /data/browsertrix/crawls/:/crawls/ webrecorder/browsertrix-crawler:1.5.3 crawl --screencastPort 42555 --healthCheckPort 33391 --scopeType domain --headless --delay 0 --behaviorTimeout 60 --pageLoadTimeout 60 --waitUntil networkidle0 --saveState always --logging stats,info --config /crawls/config/dc_at_test5_BULK_2_20250211013617.yaml --depth 0 --workers 1 --warcInfo ONB_CRAWL_dc_at_test5_BULK_2_20250211013617_Depth_0_20250211013618 --userAgentSuffix +ONB_Bot_Btrix_1.5.3, [email protected] --crawlId id_ONB_CRAWL_dc_at_test5_BULK_2_20250211013617_Depth_0_20250211013618 --collection dc_at_test5_BULK_2_20250211013617

As mentioned, with the same version there are about 100 other crawls daily that do not reach this point, but they all have limits set and end when those limits are reached.
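
For anyone trying to reproduce this, the seed list described above (5000 domains, four URL variants each, 20,000 seeds in total) could be built roughly as follows. This is only a sketch: the four prefixes are an assumption about what the variants are, and `expand_seeds` and the example domains are hypothetical names used for illustration.

```python
# Assumed URL variants per domain; 4 prefixes x 5000 domains = 20,000 seeds.
PREFIXES = ("http://", "https://", "http://www.", "https://www.")


def expand_seeds(domains):
    """Return one seed URL per (domain, prefix) pair, grouped by domain."""
    return [prefix + domain for domain in domains for prefix in PREFIXES]


# Placeholder domains, not taken from the actual crawl.
seeds = expand_seeds(["example.com", "example.org"])
for seed in seeds:
    print(seed)
```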

@ikreymer
Member

Thanks for the reproducible tests!

I did some testing and can confirm this is happening with Chrome 133, but not with Chrome 134 (Beta) or Chrome 135 (Canary).
As soon as there's a Brave release based on Chromium 134, we can try that.

ikreymer added a commit that referenced this issue Feb 19, 2025
should fix browser timing out on new window, fixes #766
bump to 1.5.4
@ikreymer
Member

Looks like there's a release of Brave with Chromium 134, going to test that and release 1.5.4. On initial testing, it looks like this issue disappears in Chromium 134.
