extract rules for using LLM, and use it for non-ai #448

@AzizNadirov To the best of my knowledge, we don't currently have a feature like this. We will certainly keep this use case in mind while planning our roadmap.

However, you can get the raw HTML from the crawl result via result.html, then have a model (such as ChatGPT or Claude) work out the mapping between the class/id/name attributes of the divs and your desired data fields. After that, you can extract the data without an LLM using JsonCssExtractionStrategy or JsonXPathExtractionStrategy.
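A rough sketch of that workflow, in case it helps. The helper names (build_schema_prompt, parse_schema, extract_with_schema) are made up for illustration and are not part of the library; the model call itself is left to whichever client you use. The schema keys (name, baseSelector, fields) follow the shape JsonCssExtractionStrategy expects, and the arun/CrawlerRunConfig call is an assumption about recent versions, so check it against the release you have installed.

```python
import json


def build_schema_prompt(html: str, fields: list[str]) -> str:
    """Build a prompt asking a model to map the wanted fields to CSS selectors.

    Hypothetical helper: you only need to run this once per site layout.
    """
    return (
        "Given the HTML below, return only a JSON object with the keys "
        '"name", "baseSelector" and "fields". Each entry in "fields" needs '
        '"name", "selector" and "type" (for example "text").\n'
        f"Fields to extract: {', '.join(fields)}\n\nHTML:\n{html}"
    )


def parse_schema(llm_reply: str) -> dict:
    """Parse the model's reply and check it has the expected schema keys."""
    schema = json.loads(llm_reply)
    missing = {"name", "baseSelector", "fields"} - schema.keys()
    if missing:
        raise ValueError(f"schema is missing keys: {sorted(missing)}")
    for field in schema["fields"]:
        if not {"name", "selector", "type"} <= field.keys():
            raise ValueError(f"incomplete field definition: {field}")
    return schema


async def extract_with_schema(url: str, schema: dict) -> list:
    """Reuse the saved schema for LLM-free extraction on later crawls."""
    # Imported lazily so the helpers above work without crawl4ai installed.
    from crawl4ai import AsyncWebCrawler, CrawlerRunConfig
    from crawl4ai.extraction_strategy import JsonCssExtractionStrategy

    # Assumption: config-based call style; older releases may expect
    # extraction_strategy passed to arun() directly.
    config = CrawlerRunConfig(
        extraction_strategy=JsonCssExtractionStrategy(schema)
    )
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url, config=config)
    # extracted_content is a JSON string holding the extracted items.
    return json.loads(result.extracted_content)
```

Once parse_schema accepts the reply, you can save the schema to a file and call extract_with_schema(url, schema) via asyncio.run on every later crawl, so the model is only involved when the page layout changes.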

You can find some useful examples here

Cc: @unclecode Interesting use case ☝🏼

Answer selected by AzizNadirov