OBO Purl System: Add HTTPS support #705
Comments
I've been thinking about this since I was told about the original issue last week. Doing this properly would require significant changes to the system, and every change to the PURL system must be made very carefully. If someone wants to volunteer to do that work, that would be great. Otherwise this will have to wait. |
It seems that Chrome is now blocking our PURLs, at least when using the right-click "save file as" method. Try here: |
How many person hours would we need to enable https @jamesaoverton ? |
On a personal website I just set up https in about 5min. Here is some info: https://letsencrypt.org/. My server is on AWS and I use a bitnami wordpress image, here is the documentation for bitnami https. https://docs.bitnami.com/aws/how-to/understand-bncert/ Sofia |
Oh! and it is free!! |
The problem is the PURLs themselves. We can't just change 2000 (or, if we need to change terms, 2 million) PURLs from an http scheme to an https scheme. Anyway, we will sleep on this a bit. |
Yes, I've used Let's Encrypt for a large number of websites, and set them up to automatically redirect HTTP to HTTPS. It's great and we'll probably use Let's Encrypt for this too. But the PURL system is not a website. It's a system to manage millions of permanent identifiers for hundreds of projects used by who-knows-how-many downstream users, tools, and databases. If we make a mistake we have to live with it forever. So every change has to be made very carefully. So I'll set aside time for this, but it can't be rushed. |
Ah! Sounds complicated :/ |
I believe that this Chromium Blog posting explains the issue, which is specifically about mixing HTTP content (images, downloads) in a page that is served by HTTPS: https://blog.chromium.org/2020/02/protecting-users-from-insecure.html |
I think at least for the ontology purls (not terms), we will have to come up with some kind of plan for addressing this issue. For example, https://hpo.jax.org/app/download/annotation you cannot click on any of the OBO purls in there! |
They work for me... (Safari on iPadOS 15.3.1) The best option is probably to allow HTTP and HTTPS in parallel for all PURLs, but I'm still worried that people will cause themselves all sorts of problems once we allow this. |
I tried only Chrome on Mac, I guess Chrome is generally now blocking all http referrals from https sites! |
Chrome, "Version 100.0.4896.88 (Official Build) (64-bit)" on Linux works for me. |
@matentzn Clicking on that works for me. It sounds like behavior I ran into with an HTTPS everywhere plugin I used to run. I would suggest trying with a temporary clean profile. As well, if you could report your version number, we could see if something recent has changed in Chrome. |
Alas. We're on the same version and I can guarantee I'm on a completely unaltered installation, so I'd still put the weight on a local issue. Possibly work through https://support.google.com/chrome/thread/16888999/links-won-t-open-in-chrome?hl=en |
Why would this be the case? Because the ontologies will then have links to both or only one in some case? @jamesaoverton |
Many of our current PURL configuration entries point to HTTP URLs, but redirecting HTTPS to HTTP raises security warnings in many clients (e.g. browsers). If one resource has both HTTP and HTTPS URLs, clients may not recognize that the two refer to the same thing. This is a recognized problem for the RDF stack, and I haven't seen a solution with broad support. |
I notice that linking to an HTTP ontology IRI in an HTTPS page is blocked by Chrome, and that seems to be primarily because the target is a file download. If you link to an HTTP term IRI, which generally redirects to a viewable page, Chrome seems to be fine with that. Maybe we could pursue an initial solution of just migrating ontology IRIs to HTTPS (of which there are far fewer than term IRIs). |
After tons of discussion about this, I would like to propose the same thing. I think despite the obvious problems with having http URIs for terms, we should not change that, as the URI is primarily an identifier and secondarily a URL, regardless of what people may think when they look at it. Without making any statement about who will deal with this problem, I would like to propose this:
To get 1 out of the way: Proposal: Add https support for the OBO purl server and migrate ontology PURLs to https
|
http://obofoundry.org is handled by GitHub Pages, not the PURL server. That's a completely separate issue. Please don't make this PURL system issue more complicated than it already is. The only solution that I see is for the PURL server to support both HTTP and HTTPS in parallel. We should not automatically redirect HTTP to HTTPS, or it will break redirects specified here as HTTP. Users will have to specify HTTP or HTTPS as appropriate, which will lead to all sorts of confusion. Technically, this should just be a matter of getting a certificate and duplicating the Apache VirtualHost config for SSL in port 443: https://github.com/OBOFoundry/purl.obolibrary.org/blob/master/tools/etc_apache2_sites-available_site.j2 LBL is running the PURL infrastructure now, so changes will have to be coordinated with them: @kltm and @cmungall. |
Comment from OBO ops meeting: we should not turn on HTTPS for obofoundry.org until we support downloading ontology PURLs via HTTPS. The reason is that Chrome will not allow clicking an HTTP download from an HTTPS page, so any direct ontology downloads would be broken from an HTTPS obofoundry.org. |
Is this the action for @kltm?
|
Yes, I'd appreciate @kltm's input on that. |
To clarify, what we're talking about here is:
If so, we can confer with @abessiari about changing the image. We could also try and just toss Cloudflare in front of it (with the bonus of maybe speeding things up and saving a wee bit of money). Edit: After some poking around I believe this could be done with Cloudflare only, but would likely require a little fiddling which might result in a few small outages. Using Cloudflare would marginally decrease costs, but increase the number of control planes. Unless it becomes complicated for some reason, I think an addition to the current system would probably be better for now, with an eye on cert renewal or longer spans. |
A set of concrete test URLs would also be useful. |
Talking to @cmungall, it might be nice to try the Cloudflare version of the solution first, then do it with the community infrastructure. |
Fine by me. Nico suggested these HTTP examples above:
Currently the HTTPS versions of those do not resolve, but after this change they should resolve to the same targets as their HTTP counterparts: |
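The expectation above (each HTTPS PURL resolves to the same target as its HTTP counterpart) could be checked with something like the following. This is only a rough sketch, not the test suite used by the PURL project: `to_https`, `resolve_live`, and `compare_schemes` are hypothetical helper names, and the resolver is passed in as a parameter so the comparison logic can be exercised without network access.

```python
import urllib.request
from typing import Callable, Dict, Iterable, Optional, Tuple
from urllib.parse import urlsplit


def to_https(url: str) -> str:
    """Return the same URL with its scheme switched to https."""
    return urlsplit(url)._replace(scheme="https").geturl()


def resolve_live(url: str, timeout: float = 10.0) -> Optional[str]:
    """Follow redirects and return the final URL, or None on any failure.

    Assumption: a plain GET via urllib is good enough for a smoke test.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.geturl()
    except OSError:  # URLError and HTTPError are OSError subclasses
        return None


def compare_schemes(
    http_urls: Iterable[str],
    resolve: Callable[[str], Optional[str]] = resolve_live,
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each HTTP URL whose HTTPS twin resolves differently (or not at
    all) to the pair (http_target, https_target)."""
    mismatches = {}
    for http_url in http_urls:
        http_target = resolve(http_url)
        https_target = resolve(to_https(http_url))
        if http_target is None or http_target != https_target:
            mismatches[http_url] = (http_target, https_target)
    return mismatches
```

Calling `compare_schemes(["http://purl.obolibrary.org/obo/obi.owl"])` would use the live resolver; an empty result means the two schemes agree for every URL in the list.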
We had a test earlier where we had trouble getting a Cloudflare cert; I've tried again using the subdomain http://purssl.obolibrary.org/obo/hp.obo Please try your favorite variant of http(s)://purssl.obolibrary.org (making note of upgrades); if all goes well, we can try the |
Thanks @kltm! These work for me. I'll think a bit more about what other tests to run and get back to you tomorrow.
Let's call this solution "HTTP(S) in parallel". I've given it some more thought and I want to discuss it in depth before we commit to it:
HTTP(S) in parallel solves an immediate problem: people have an HTTPS webpage and they want to use PURLs to link to downloads or images that are served via HTTPS. If they use HTTP PURLs (which is the status quo) they get security warnings: HTTPS downgrade to HTTP PURL. With this solution they can use HTTPS PURLs and the whole chain of redirects uses HTTPS. Great! People can still run into problems if their resources are served via HTTP. A partial solution to that would be to update the PURL configs to redirect to HTTPS resources where possible. Most of the PURLs redirect to GitHub or a few sites, so we could automate much of that update, and we could add automated testing.
HTTP(S) in parallel does not solve the problem of IRIs as identifiers in the RDF stack. RDF considers http://purl.obolibrary.org/obo/obi.owl and https://purl.obolibrary.org/obo/obi.owl to be two distinct identifiers. Even if we know that they are "the same thing", our tools won't know that. So if (when) people start mixing HTTP and HTTPS in their ontology IRIs, version IRIs, and term IRIs, they will get into all sorts of trouble, and it will be hard to see the extra "s" that's the cause.
So the question is: Should we address the identifier problem by modifying the HTTP(S) in parallel approach to serve some IRI patterns over HTTP but not HTTPS? In other words, should we carve out patterns that we usually treat as identifiers, and refuse to support HTTPS for them? My answer is no, I don't think that's viable. A key problem we're trying to solve is downloading ontology files, so excluding ontology IRIs and version IRIs from HTTPS support leaves that problem wide open.
Term IRIs are a slightly different case, but I still don't think we want to make an exception for them. So I still think HTTP(S) in parallel is the best option, without trying to get fancier by carving out certain patterns. Instead I think we need to address the identifier problem at another level: add ROBOT report checks and other tests that will scream bloody murder when they see HTTPS PURLs used for ontology IRIs like https://purl.obolibrary.org/obo/obi.owl or for term IRIs like https://purl.obolibrary.org/obo/OBI_0000070. Sorry for the long post, but I think it's important to get this right. |
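To illustrate the kind of check described above, here is a small sketch that scans text for HTTPS OBO PURLs. It is not a ROBOT report rule and makes no claim about how ROBOT would implement one; the function name `find_https_purls` and the regular expression are assumptions made for illustration, and a real check would probably work on parsed IRIs and not raw text.

```python
import re
from typing import List, Tuple

# Assumption: any https:// IRI under purl.obolibrary.org/obo/ is suspicious
# when it appears as an identifier inside an ontology file.
HTTPS_PURL = re.compile(r"https://purl\.obolibrary\.org/obo/[^\s<>\"']*")


def find_https_purls(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, iri) for every HTTPS OBO PURL found in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in HTTPS_PURL.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = (
        "<http://purl.obolibrary.org/obo/OBI_0000070> a owl:Class .\n"
        "<https://purl.obolibrary.org/obo/OBI_0000070> a owl:Class .\n"
    )
    for line_number, iri in find_https_purls(sample):
        print(f"line {line_number}: HTTPS PURL used as identifier: {iri}")
```

The point of reporting line numbers is the one made above: the extra "s" is easy to miss by eye, so the check should say exactly where it occurs.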
@jamesaoverton Okay, I want to clarify my action here: I'm going to wait until further notice before trying the "parallel" option again. Until then, the test URL set above will remain in place. |
@kltm My previous comment got the support I was looking for. I can't think of any other tests to run for There's no particular rush. We'll do some more tests before we advertise the change. Thank you! |
I agree with @jamesaoverton's points and chain of reasoning. As additional context for people coming to this thread, the w3c have this post: Linked from this discussion: Which mentions schema.org. I think they are in a bit of a mess, with some groups using https for identifiers, some using http. Let's not end up where they are. We need to keep hammering home the point that the identifiers for OBO are http, regardless of redirects and what the browser bar says. But this is a social and documentation issue, not a technical one, so once @kltm adds HTTPS support to the purl.obolibrary.org subdomain, we can close this issue and continue any further discussion on the main OBO tracker. |
@jamesaoverton Now ready for testing. |
@jamesaoverton Is this closed? |
Yes, I'll close this issue. The next step is to update our configs to point to HTTPS targets when possible: #925 |
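As a rough sketch of what "point to HTTPS targets when possible" might look like when automated: the function below upgrades a redirect target only if its host is on a known list. The host list and the name `upgrade_target` are assumptions for illustration; this does not read or write the actual PURL config files, and it is not taken from #925.

```python
from typing import Iterable
from urllib.parse import urlsplit

# Assumption: hosts known to serve the same content over HTTPS.
# GitHub is listed because the thread notes most PURLs redirect there.
HTTPS_CAPABLE_HOSTS = frozenset({"github.com", "raw.githubusercontent.com"})


def upgrade_target(url: str, https_hosts: Iterable[str] = HTTPS_CAPABLE_HOSTS) -> str:
    """Switch an http:// target to https:// if its host is on the list.

    Targets on other hosts, and targets that are already HTTPS, are
    returned unchanged so that nothing is rewritten on a guess.
    """
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in set(https_hosts):
        return parts._replace(scheme="https").geturl()
    return url
```

Leaving unknown hosts untouched is deliberate: an HTTP-only target that is rewritten to HTTPS would simply break, so each new host should be verified before it is added to the list.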
Hello.
I currently have images that I serve on OLS by using the purl.obolibrary.org redirect to my ontology's GitHub repository. Chrome is requiring HTTPS, and purl.obolibrary.org does not support HTTPS.
Are there any plans to support https in the future?
Thank you,
Sofia