html_element() cannot select itself #382
Comments
When a CSS selector is passed to
html_elements(div_children, xpath = selectr::css_to_xpath(".my-class"))
#> {xml_nodeset (1)}
#> [1] <h1 class="my-class">Hello</h1>
Created on 2023-12-22 with reprex v2.0.2
I believe this issue could be fixed in
@rossellhayes well, that beats my suggestion:
x <- xml_new_root("tmp")
for (child in div_children) {
  xml_add_child(x, child)
}
html_elements(x, ".my-class")
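The loop above can be wrapped in a small helper. This is only a sketch: `wrap_in_root()` and its `root_name` argument are hypothetical names for illustration, not part of xml2 or rvest. As far as I know, `xml_add_child()` copies a node by default, so the wrapped nodes are detached from the original document, and removing one of them would not change `html`.

```r
library(xml2)
library(rvest)

# Hypothetical helper (not an xml2/rvest function): copy each node into a
# fresh temporary root so that CSS selectors can reach the nodes themselves.
wrap_in_root <- function(nodes, root_name = "tmp") {
  root <- xml_new_root(root_name)
  for (node in nodes) {
    xml_add_child(root, node)
  }
  root
}

html <- minimal_html('
  <div class="div-class">
    <h1 class="my-class">Hello</h1>
    <h2 class="subclass">World</h2>
  </div>
')

div_children <- html_children(html_elements(html, ".div-class"))

# The <h1> is now a descendant of the temporary root, so the selector finds it.
wrapped <- wrap_in_root(div_children)
found <- html_elements(wrapped, ".my-class")
html_text(found)
```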
Thanks @rossellhayes. I'm wondering whether there's something more going on here that I'm not able to grasp, or whether this is actually a bug where the previous case was not, per your findings in the test. Using the
library(xml2)
library(rvest)
html <- minimal_html(r"{
<div class="div-class">
<h1 class="my-class">Hello</h1>
<h2 class="subclass">World</h2>
</div>
}")
html_elements(html, ".div-class .my-class")
#> {xml_nodeset (1)}
#> [1] <h1 class="my-class">Hello</h1>
div_children <- html_elements(html, ".div-class") |>
  html_children()
# select using selectr
html_elements(div_children, xpath = selectr::css_to_xpath(".my-class"))
#> {xml_nodeset (1)}
#> [1] <h1 class="my-class">Hello</h1>
# remove the node using xml_remove
xml2::xml_remove(
  html_elements(div_children, xpath = selectr::css_to_xpath(".my-class"))
)
# see if it's still there
html_elements(div_children, xpath = selectr::css_to_xpath(".my-class"))
#> {xml_nodeset (1)}
#> [1] <h1 class="my-class">Hello</h1>
# repeat at the top level html
xml2::xml_remove(html_elements(html, ".div-class .my-class"))
# see if it is still there
html_elements(html, ".div-class .my-class")
#> {xml_nodeset (0)}
Created on 2023-12-22 with reprex v2.0.2
EDIT: ignore me. It seems
# remove the node using xml_remove
xml2::xml_remove(
  html_elements(div_children, xpath = selectr::css_to_xpath(".my-class")),
  free = TRUE
)
After using html_children(), the contents cannot be accessed using html_element() or html_elements().
I would not be surprised if this is user error; I'm just not sure where.
Created on 2023-12-22 with reprex v2.0.2
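A minimal sketch of the behaviour described above, using xml2 only. It assumes the difference comes down to whether the XPath query includes the context node itself; the thread does not confirm that this is how rvest translates CSS selectors, and the `has_class` predicate is hand-written for illustration.

```r
library(xml2)

doc <- read_html('
  <div class="div-class">
    <h1 class="my-class">Hello</h1>
    <h2 class="subclass">World</h2>
  </div>
')

# Children of the div: the <h1> and <h2> elements themselves.
div_children <- xml_children(xml_find_first(doc, "//div[@class='div-class']"))

# Hand-written XPath predicate for "has the class my-class" (illustrative).
has_class <- "[contains(concat(' ', normalize-space(@class), ' '), ' my-class ')]"

# Descendants only: the <h1> is the context node, not a descendant of itself.
below_only <- xml_find_all(div_children, paste0(".//*", has_class))

# Context node included: the <h1> itself now matches.
with_self <- xml_find_all(div_children, paste0("descendant-or-self::*", has_class))

length(below_only)
#> [1] 0
length(with_self)
#> [1] 1
```

If that assumption holds, it would explain why the same class lookup succeeds from the parent document but not from the child nodeset.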