Selector h4+ p is incorrectly translated to xpath #28

Open
jstray opened this issue May 25, 2019 · 1 comment

Comments

jstray commented May 25, 2019

On the page https://www.kpu.ca/calendar/2018-19/courses/jrnl/index.html, I'm trying to select the paragraph of course description that follows each course title. For example, "Students will explore how journalism fits in a media landscape..."

I can successfully highlight the appropriate elements in SelectorGadget by clicking on this paragraph, then clicking on one of the "Prerequisites" elements to exclude them. This results in the correct CSS selector, h4+ p.

However, when I translate this to XPath, I get //h4+//p, which is not correct. I would expect it to translate to something like //h4/following::p[1], which gives the correct result.
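
To make the expected translation concrete, here is a minimal sketch of how the adjacent-sibling combinator could be mapped to XPath. It is not SelectorGadget's code: the function name `adjacent_sibling_css_to_xpath` is hypothetical, and it only handles plain tag names joined by `+`. It emits the stricter form //h4/following-sibling::*[1][self::p], which requires the p to be the very next sibling element. On a page where each h4 is directly followed by its description paragraph, that should select the same nodes as //h4/following::p[1].

```python
import re

# Hypothetical helper for illustration; handles only plain tag names joined by '+'.
_TAG = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def adjacent_sibling_css_to_xpath(selector: str) -> str:
    """Translate a chain such as 'h4+ p' into an XPath expression."""
    parts = [part.strip() for part in selector.split("+")]
    if not all(_TAG.fullmatch(part) for part in parts):
        raise ValueError(f"unsupported selector: {selector!r}")

    xpath = "//" + parts[0]
    for tag in parts[1:]:
        # CSS 'A + B': B must be the element immediately after A among its
        # siblings, so take the first following sibling and test its name.
        xpath += f"/following-sibling::*[1][self::{tag}]"
    return xpath


print(adjacent_sibling_css_to_xpath("h4+ p"))
```

Anything beyond bare tag names (classes, ids, other combinators) raises a ValueError here rather than producing a wrong expression.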

We have been advising people to use SelectorGadget to write the expressions for the Xpath Extractor in Workbench (http://help.workbenchdata.com/steps/scrape/xpath-extractor) as a way to avoid learning XPath syntax, so it's unfortunate that this case is mistranslated.

Owner

cantino commented May 26, 2019

Hey @jstray, thanks for the bug report! I'm not very actively maintaining Selector Gadget these days. I'll mark this as help wanted, and hopefully someone will send in a PR. You're also more than welcome to submit a fix. I don't think it'd be too hard.
