How to scrape an infinitely long JavaScript-rendered page (dynamic page content) #498

Closed Answered by mllife
mllife asked this question in Forums - Q&A

I was able to do it by enabling `scan_full_page` with a scroll delay:
```python
import asyncio
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode

async def scrape_ecb_publications():
    # Configure the browser settings
    browser_config = BrowserConfig(headless=True, java_script_enabled=True)

    # Set run configurations, including cache mode and enabling full-page scan
    crawl_config = CrawlerRunConfig(
        cache_mode=CacheMode.BYPASS,
        scan_full_page=True,
        scroll_delay=1.5,
    )

    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(
            url="https://www.ecb.europa.eu/press/pubbydate/html/index.en.html?year=2024",
            config=crawl_config
        )

        # Output the extracted…
```
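
For anyone wondering why the full-page scan helps here: as I understand it, the idea is to keep scrolling, wait a bit, and stop once the page stops growing. The snippet below is only a rough sketch of that idea against a simulated page, not crawl4ai's actual implementation. `FakePage`, `scroll_until_stable`, and `run_demo` are hypothetical names made up for illustration, and the "height" is just a count of loaded items.

```python
import asyncio


class FakePage:
    """Hypothetical stand-in for a browser page that lazy-loads items on scroll."""

    def __init__(self, total_items: int, batch_size: int) -> None:
        self.total_items = total_items
        self.batch_size = batch_size
        # Assume the first batch is already rendered when the page loads.
        self.loaded = min(batch_size, total_items)

    async def scroll_to_bottom(self) -> None:
        # Each scroll loads one more batch, capped at the total.
        self.loaded = min(self.loaded + self.batch_size, self.total_items)

    def height(self) -> int:
        # Pretend every loaded item adds one unit of page height.
        return self.loaded


async def scroll_until_stable(
    page: FakePage, scroll_delay: float = 0.0, max_scrolls: int = 100
) -> int:
    """Scroll until the page height stops growing; return the number of scrolls."""
    scrolls = 0
    last_height = page.height()
    while scrolls < max_scrolls:
        await page.scroll_to_bottom()
        await asyncio.sleep(scroll_delay)  # give lazy content time to render
        scrolls += 1
        new_height = page.height()
        if new_height == last_height:
            break  # nothing new appeared, so assume the end was reached
        last_height = new_height
    return scrolls


def run_demo(total_items: int = 25, batch_size: int = 10) -> tuple[int, int]:
    """Run the scroll loop on a simulated page; return (items loaded, scrolls)."""
    page = FakePage(total_items, batch_size)
    scrolls = asyncio.run(scroll_until_stable(page))
    return page.loaded, scrolls
```

On a real page the delay matters: if it is too short, the height check can fire before new content has rendered and the loop stops early, which is presumably why a `scroll_delay` of 1.5 seconds worked for the ECB page.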

Answer selected by mllife