forked from Ensembl/ensj-healthcheck
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
194 lines (109 loc) · 10.9 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
EnsEMBL HealthCheck
===================
REQUIREMENTS
============
1. Java 6 JDK (v1.6 or later - see http://java.com/en/download/); this is already
installed on the Farm in /software/jdk1.6.0_14. Make sure that your
JAVA_HOME environment variable is pointing to the correct directory and that the
*correct* Java executables are in your path; put something like the following in
your .cshrc:
setenv JAVA_HOME /software/jdk1.6.0_14
setenv PATH ${JAVA_HOME}/bin:${PATH}
Note that if you get errors indicating that the java executable can't be found,
check that $JAVA_HOME is set correctly by doing
which java
and setting $JAVA_HOME to the directory in which bin/java resides.
INSTALLATION
============
1. Obtain the source files by checking out the ensj-healthcheck module from Git.
git clone https://github.com/Ensembl/ensj-healthcheck.git
Use the -r option to check out a specific tag if required.
2. cd ensj-healthcheck
3. Edit database.defaults.properties to contain values that correspond to the database
server which you want to connect to. You can also overwrite any values in the config file
on the command line.
RUNNING
=======
A number of shell scripts (with a .sh extension) are provided to aid in running
healthchecks. These are summarised below; the main one you will use is called
run-configurable-testrunner.sh; note that this script actually passes all of its
command-line options through to the ConfigurableTestRunner class.
Usage: ./run-configurable-testrunner.sh -d my_db -h my_host -g healthcheck_group
Options:
[--conf -c value...] : Name of one or many configuration files. Parameters in configuration files override each other. If a parameter is provided in more than one file, the first occurrence is used.
[--databaseURL value] : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase
[--dbtype value] : If set, this will be used as the type for all databases.
[--driver value] : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase
[--driver1 value] : Driver for server 1 (support for multiple staging servers in Ensembl)
[--driver2 value] : Driver for server 2 (support for multiple staging servers in Ensembl)
[--endSession value] : Flag to run an empty testrunnerUsed to mark the end of a parallel run
[--exclude_groups -G value...] : Specify which groups of tests should not be run. Fully qualified class names can be used as well as their short names.
[--exclude_tests -T value...] : Specify which tests should not be run. Fully qualified class names can be used as well as their short names.
[--file.separator value] : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase
[--funcgen_schema.file value] : Parameter used only in and org.ensembl.healthcheck.testcase.funcgen.CompareFuncgenSchema
[--help -h] : display help
[--host -h value] : The host for the database server you wish to connect to.
[--host1 value] : Host for server 1 (support for multiple staging servers in Ensembl)
[--host2 value] : Host for server 2 (support for multiple staging servers in Ensembl)
[--ignore.previous.checks value] : Parameter used only in org.ensembl.healthcheck.testcase.generic.ComparePreviousVersionExonCoords, org.ensembl.healthcheck.testcase.generic.ComparePreviousVersionBase and org.ensembl.healthcheck.testcase.generic.GeneStatus
[--include_groups -g value...] : Specify which groups of tests should be run. Fully qualified class names can be used as well as their short names.
[--include_tests -t value...] : Specify which tests should be run. Fully qualified class names can be used as well as their short names.
[--master.funcgen_schema value] : Parameter used in org.ensembl.healthcheck.testcase.funcgen.CompareFuncgenSchema
[--master.schema value] : Parameter used only in org.ensembl.healthcheck.testcase.generic.CompareSchema, and org.ensembl.healthcheck.testcase.funcgen.CompareFuncgenSchema
[--master.variation_schema value] : Parameter used only in master.variation_schema
[--output -o value] : Specify the level of output that will be used. The allowed options are "All", "None", "Problem", "Current", "Warning" and "Info", .
[--output.database value] : The name of the database where the results of the healthchecks are written to, if the database reporter is used.
[--output.driver value] : The driver for the database where the results of the healthchecks are written to, if the database reporter is used.
[--output.host value] : The name of the database where the results of the healthchecks are written to, if the database reporter is used.
[--output.password value] : The password for the database where the results of the healthchecks are written to, if the database reporter is used.
[--output.port value] : The port of the database where the results of the healthchecks are written to, if the database reporter is used.
[--output.release value] : Gets written into the session table for describing the test session, if the database reporter is used.
[--output.schemafile value] : If output.database does not exist, it will be created automatically. This file should have the SQL commands to create the schema. Please remember that hashes (#) are not allowed to start comments in SQL. Use two dashes "--" at the beginning of a line instead. If the configuratble testrunner can't find this file from the current working directory, it will search for it in the classpath.
[--output.user value] : The user name for the database where the results of the healthchecks are written to, if the database reporter is used.
[--password value] : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase
[--password1 value] : Password for server 1 (support for multiple staging servers in Ensembl)
[--password2 value] : Password for server 2 (support for multiple staging servers in Ensembl)
[--perl value] : Parameter used only in org.ensembl.healthcheck.testcase.AbstractPerlBasedTestCase
[--port -P value] : The port for the database server you wish to connect to.
[--port1 value] : Port for server 1 (support for multiple staging servers in Ensembl)
[--port2 value] : Port for server 2 (support for multiple staging servers in Ensembl)
[--production.database value] : The name of the Ensembl production database to use to retrieve division information. Assumed to be on the same server as the output databases.
[--compara_master.database value] : The name of the Ensembl Compara master database to use to control the content of the tested Compara database. Assumed to be on one of the configured servers.
[--repair value] : Allow the tests to try to repair the database (if they can)
[--reporterType -R value] : Specify the reporter type that will be used. The allowed options are "Database" and "Text".
[--schema.file value] : Parameter used only in org.ensembl.healthcheck.testcase.generic.CompareSchema,
[--secondary.database value] : Some tests require a second database containing the previous release. This configures the database name for the second database server.
[--secondary.driver value] : Some tests require a second database containing the previous release. This configures the driver for the second database server.
[--secondary.host value] : Some tests require a second database containing the previous release. This configures the hostname of the second database server.
[--secondary.password value] : Some tests require a second database containing the previous release. This configures the password for the second database server.
[--secondary.port value] : Some tests require a second database containing the previous release. This configures the port of the second database server.
[--secondary.user value] : Some tests require a second database containing the previous release. This configures the user name for the second database server.
[--sessionID value] : The session to add these results forUsed in parallel run
[--species value] : If set, this will be used as the species for all databases, overriding anything thename or meta table of the database may indicate.
[--testRegistryType -r value] : Specify the type of test registry that will be used. The allowed options are "Discoverybased" and "ConfigurationBased"
[--test_databases -d value...] : Name of databases that should be tested (e.g.: ensembl_compara_bacteria_5_58). If there is more than one database, separate with spaces. Any configured tests will be run on these databases. Does not support same format as output.databases!
[--test_divisions -D value...] : Names of division to which databases to test should belong e.g. EPl or EnsemblPlants. This option requires the production database to be set up.
[--user value] : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase
[--user.dir value] : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase
[--user1 value] : User for server 1 (support for multiple staging servers in Ensembl)
[--user2 value] : User for server 2 (support for multiple staging servers in Ensembl)
[--variation_schema.file value] : Parameter used only in org.ensembl.healthcheck.testcase.variation.CompareVariationSchema
Here is an example commandline :
./run-configurable-testrunner.sh -d homo_sapiens_core_80_38 --dbtype core --species homo_sapiens --output problem -g PostGenebuild
Test Groups
-----------
It is possible to run a single test or a group of tests.
Groups of tests are defined in src/org/ensembl/healthcheck/testgroup
and represent sets of healthchecks that are usually run together
Other Utilities
---------------
Run each of these with the -h option to show usage.
database-name-matcher.sh
Shows which database names match a particular regular expression.
compile-healthcheck.sh
Only used if you've made changes to the source, e.g. when writing your own
tests.
WRITING YOUR OWN TESTS
======================
If you want to write your own healthchecks, rather than running the pre-defined
ones, see the file README-writing-tests.txt.