Horizontal Scaling: Read-Only #71

Open
chris-allan opened this issue Jan 25, 2017 · 5 comments

@chris-allan
Member

chris-allan commented Jan 25, 2017

History, state of play, and rationale

Read-only versions of OMERO are applicable to many use cases. A few are outlined below:

  1. Adding more READ capacity and load balancing an OMERO server cluster; either on the same machine or on separate machines.

  2. Allow for operations upon a read-only PostgreSQL database snapshot or replica. Examples of this include, but are not limited to: full-text indexing, expensive query isolation, and immutability.

Overview

Phase I

The first major step is to enable a read-only version of the OMERO server. This requires the server to operate with read-only access to the filesystem and database. A crucial component of this work is to decouple the session store from the OMERO instance. The default session storage is currently the database, which imposes a read-write requirement. To avoid that requirement, purely in-memory sessions will initially be used when the server is started in read-only mode. As a consequence, sessions will not be portable between servers when started in read-only mode.

The result of this work would be the ability to specify an alternate hostname:port combination and get a connection to an OMERO server instance that allows only read-only operations. Attempted write operations should throw an exception so that mistaken writes can be handled. Additionally, to avoid such exceptions, clients would be able to ask the server on login, or a service on retrieval, whether the read-only flag has been set.

This would facilitate load balancing and utilization of the read-only instance "by configuration".
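
As a rough illustration of using the read-only instance "by configuration", a client-side helper could pick an endpoint from a configured list. This is only a sketch and not OMERO API: `Endpoint`, `choose_endpoint`, the host names, and port 14064 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of one configured OMERO server endpoint."""
    host: str
    port: int
    read_only: bool


def choose_endpoint(endpoints: Sequence[Endpoint], needs_write: bool) -> Endpoint:
    """Return the first endpoint suitable for the requested kind of work.

    Reads prefer a read-only endpoint and fall back to a read-write one;
    writes are only ever sent to a read-write endpoint.
    """
    if needs_write:
        candidates = [e for e in endpoints if not e.read_only]
    else:
        # Stable sort: read-only endpoints first, original order otherwise.
        candidates = sorted(endpoints, key=lambda e: not e.read_only)
    if not candidates:
        raise LookupError("no configured endpoint can serve this request")
    return candidates[0]


# Assumed example configuration; hosts and ports are placeholders.
ENDPOINTS = [
    Endpoint("omero-rw.example.org", 4064, read_only=False),
    Endpoint("omero-ro.example.org", 14064, read_only=True),
]
```

Routing on the client side like this would complement, not replace, the server-side exception on attempted writes.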

Phase II

Building on phase I, this phase would be the pursuit of cluster-wide session storage, allowing for the portability of sessions between instances regardless of the running mode, read-only or read-write, of the server.
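
The difference between the two phases can be shown with a toy model. The `Server` class below is a hypothetical stand-in rather than OMERO code, and a plain dict plays the role of whatever cluster-wide storage is eventually chosen.

```python
import uuid
from typing import Dict, Optional


class Server:
    """Toy stand-in for a server instance; not an OMERO class."""

    def __init__(self, name: str, sessions: Optional[Dict[str, str]] = None):
        self.name = name
        # Phase I: each server gets a private dict (in-memory sessions).
        # Phase II: servers are handed one shared mapping instead.
        self.sessions = {} if sessions is None else sessions

    def create_session(self, user: str) -> str:
        session_id = str(uuid.uuid4())
        self.sessions[session_id] = user
        return session_id

    def lookup(self, session_id: str) -> Optional[str]:
        return self.sessions.get(session_id)


# Phase I: private stores, so a session is unknown to the other server.
a, b = Server("a"), Server("b")
sid_private = a.create_session("root")
assert b.lookup(sid_private) is None

# Phase II: one shared store, so the session is visible from both servers.
shared: Dict[str, str] = {}
c, d = Server("c", shared), Server("d", shared)
sid_shared = c.create_session("root")
assert d.lookup(sid_shared) == "root"
```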

Current work

Creating a read-only database user for testing

Assuming that you have created an OMERO database, set its owner to "omero", and protected it with password authentication, you can create a new user "omeroro" with a password of your choice:

$ createuser -P 'omeroro'
Enter password for new role:
Enter it again:

If you then connect to the OMERO database as this user, as allowed by pg_hba.conf, you will be able to list the tables (interrogate the schema) but not query them:

$ psql -h localhost -U omeroro omero
psql (9.3.2)
Type "help" for help.

omero=> \dt
                             List of relations
 Schema |                       Name                       | Type  | Owner
--------+--------------------------------------------------+-------+-------
 public | _fs_deletelog                                    | table | omero
 public | _lock_ids                                        | table | omero
 public | acquisitionmode                                  | table | omero
 public | annotation                                       | table | omero
 public | annotationannotationlink                         | table | omero
 public | arc                                              | table | omero
 public | arctype                                          | table | omero
 public | binning                                          | table | omero
…
omero=> SELECT * FROM dbpatch;
ERROR:  permission denied for relation dbpatch

You can then, as a PostgreSQL superuser, GRANT the "omeroro" user the ability to run SELECT statements on the database:

$ psql omero
psql (9.3.2)
Type "help" for help.

omero=# \dn+
                       List of schemas
  Name  | Owner  | Access privileges |      Description
--------+--------+-------------------+------------------------
 public | callan | callan=UC/callan +| standard public schema
        |        | =UC/callan        |
(1 row)

omero=# GRANT SELECT ON ALL TABLES IN SCHEMA public TO omeroro;
GRANT
omero=# GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO omeroro;
GRANT

NOTE: This assumes no other tables are in the schema "public".

You are then able to execute SELECT but not UPDATE or INSERT statements against the database:

$ psql -h localhost -U omeroro omero
psql (9.3.2)
Type "help" for help.

omero=> SELECT * FROM dbpatch ;
 id | currentpatch | currentversion | permissions |          finished          |     message     | previouspatch | previousversion | external_id
----+--------------+----------------+-------------+----------------------------+-----------------+---------------+-----------------+-------------
  1 |            0 | OMERO5.0       |         -52 | 2015-04-03 08:58:18.790712 | Database ready. |             0 | OMERO5.0        |
(1 row)

omero=> BEGIN;
BEGIN
omero=> UPDATE dbpatch SET currentversion = 'foo' WHERE id = 1 ;
ERROR:  permission denied for relation dbpatch
omero=> INSERT INTO dbpatch VALUES (1);
ERROR:  permission denied for relation dbpatch
omero=> ROLLBACK;
ROLLBACK

This should also prevent any adverse execution of PostgreSQL functions.
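
One way to double-check the grants from a script is PostgreSQL's `has_table_privilege` function, which reports whether a role holds any of a comma-separated list of privileges on a table. The helper below is a sketch under assumptions: `writable_tables` is a made-up name, and the cursor is assumed to come from psycopg2 (or another DB-API driver using the `%s` parameter style).

```python
from typing import Iterable, List

# Privileges that would allow the role to modify table contents.
WRITE_PRIVILEGES = "INSERT, UPDATE, DELETE, TRUNCATE"


def writable_tables(cursor, role: str, tables: Iterable[str]) -> List[str]:
    """Return the tables on which `role` holds any write privilege.

    `cursor` is any DB-API cursor using the `%s` parameter style
    (psycopg2 is assumed here). An empty result means the role is
    read-only with respect to the listed tables.
    """
    found = []
    for table in tables:
        cursor.execute(
            "SELECT has_table_privilege(%s, %s, %s)",
            (role, table, WRITE_PRIVILEGES),
        )
        (has_write,) = cursor.fetchone()
        if has_write:
            found.append(table)
    return found
```

With the grants above in place, calling `writable_tables(cursor, "omeroro", ["dbpatch"])` on a superuser connection would be expected to return an empty list.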

Enabling in-memory Node, Session, Event support (read-only)

Coupled with the aforementioned database user privileges and a build of OMERO 5.2.x that includes the development branch, a "read-only" server can be achieved with the following OMERO configuration:

bin/omero config set omero.cluster.node_provider ome.security.basic.BasicInMemoryNodeProvider
bin/omero config set omero.security.event_provider ome.security.basic.BasicInMemoryEventProvider
bin/omero config set omero.sessions.session_manager ome.services.sessions.InMemorySessionManagerImpl
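
A small check can confirm that all three properties are in place before starting the server. The sketch below assumes configuration text in the `key=value` form printed by `omero config get` later in this thread; `parse_config` and `missing_read_only_settings` are hypothetical helper names.

```python
from typing import Dict

# Property values taken from the configuration commands above.
READ_ONLY_SETTINGS = {
    "omero.cluster.node_provider":
        "ome.security.basic.BasicInMemoryNodeProvider",
    "omero.security.event_provider":
        "ome.security.basic.BasicInMemoryEventProvider",
    "omero.sessions.session_manager":
        "ome.services.sessions.InMemorySessionManagerImpl",
}


def parse_config(text: str) -> Dict[str, str]:
    """Parse `key=value` lines; lines without `=` are skipped."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


def missing_read_only_settings(text: str) -> Dict[str, str]:
    """Return the expected settings that are absent or set differently."""
    config = parse_config(text)
    return {k: v for k, v in READ_ONLY_SETTINGS.items() if config.get(k) != v}
```

An empty result means the three in-memory providers are configured as shown; any returned entries are the properties still to be set.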

OMERO 5.2.x

After several discussions with the greater OME consortium it was decided that development work would take place atop the current stable dev_5_2 branch of OMERO. This will allow for easy integration and testing against the current stable release of OMERO by third parties and especially by the IDR subteam. It will also allow easy integration into the currently available version of OMERO Plus. All of these parties use 5.2.x as the basis for their work.

Development branch:

  • glencoesoftware/openmicroscopy@read-only-phase1

Implemented features:

  • Implementation of InMemorySessionManagerImpl as a shim for PostgresSqlAction

  • Implementation of InMemoryNodeProvider as a shim for Node creation and OMERO.grid cluster membership

  • Implementation of InMemoryEventProvider as a shim for Event creation

  • BasicSecuritySystem and CurrentDetails extensions to better work with in memory sessions

  • Allowing for server startup

    • A read-write server must have been started first, so that dbpatch carries the latest Bio-Formats enumeration update and overall database session cleanup has run

  • Full handling of the "internal" administrative session

  • Allowing for "user" session creation

  • Allowing for the use of IQuery at a basic level

Ongoing work:

  • Addressing potential issues with the OMERO.scripts Processor and corresponding Job database interaction

OMERO 5.3.x

Decisions on targeting OMERO 5.3.x will be made at a later date.

Related reading

History

Fri 31 Mar 2017 17:13:35 BST: Updated documentation for read-only mode
Thu Feb 16 08:33:04 PST 2017: First version of a running server with IQuery possibility at a basic level
Wed Jan 25 06:02:10 PST 2017: Initial version
Thu Jan 26 05:01:30 PST 2017: Links to service routing and 5.2.x development plan

@chris-allan
Member Author

First push of glencoesoftware/openmicroscopy@read-only-phase1 which contains the first set of functional implementations.

callan@ubuntu:~/code/ome.git$ dist/bin/omero config get
omero.cluster.node_provider=ome.security.basic.BasicInMemoryNodeProvider
omero.db.name=omero52
omero.db.pass=omeroro
omero.db.user=omeroro
omero.security.event_provider=ome.security.basic.BasicInMemoryEventProvider
omero.sessions.session_manager=ome.services.sessions.InMemorySessionManagerImpl
callan@ubuntu:~/code/ome.git$ dist/bin/omero shell --login
Previous session expired for root on localhost:4064
Server: [localhost:4064]
Username: [root]
Password:
Created session 19076866-5ba3-43b8-8bfb-e6a5f260e3a4 (root@localhost:4064). Idle timeout: 10 min. Current group: system
Python 2.7.9 (default, Apr  2 2015, 15:33:21) 
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: s = client.getSession()

In [2]: s.getQueryService().get('Experimenter', 0L).getOmeName().getValue()
Out[2]: 'root'

In [3]: 

/cc @dpwrussell, @joshmoore, @dsudar

@dpwrussell
Member

@chris-allan Great. I guess there isn't really anything for us to test in our domain at this point?

@chris-allan
Member Author

@dpwrussell: Not at the moment, no. Aside from the obvious brain dead simple IQuery usage, what do you think would be a useful set of read-only operations to test with from your perspective?

@dpwrussell
Member

@chris-allan Initially being able to do basic listings (e.g. images in dataset), then get the metadata for those objects, and then being able to get pixels for those objects. Basically exactly what you would expect if someone was running an analysis job.

@chris-allan
Member Author

Pull request from @joshmoore attempting to get the glencoesoftware/read-only-phase1 branch building against the IDR metadata52 integration branch in ome/openmicroscopy#5213.

/cc @dpwrussell
