Remove non-determinism from test TestPartitionManagement #5553

Open
wants to merge 1 commit into
base: master

mohitbadve

What changes were proposed in this pull request?

This PR fixes the flaky test org.apache.hadoop.hive.metastore.TestPartitionManagement.testPartitionDiscoveryTransactionalTable.
The test is non-deterministic: it intermittently throws an error.

Why are the changes needed?

The test fails because the tearDown() method attempts to delete a database record (a DBS entry) that is still referenced by a table record (in TBLS), which violates the foreign key constraint 'TBLS_FK1'.

[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hive.metastore.TestPartitionManagement
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 13.15 s <<< FAILURE! -- in org.apache.hadoop.hive.metastore.TestPartitionManagement
[ERROR] org.apache.hadoop.hive.metastore.TestPartitionManagement.testPartitionDiscoveryTransactionalTable -- Time elapsed: 13.15 s <<< ERROR!
MetaException(message:JDODataStoreException: Exception thrown flushing changes to datastore
Root cause: ERROR 23503: DELETE on table 'DBS' caused a violation of foreign key constraint 'TBLS_FK1' for key (2). The statement has been rolled back.)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$drop_database_req_result$drop_database_req_resultStandardScheme.read(ThriftHiveMetastore.java:59587)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$drop_database_req_result$drop_database_req_resultStandardScheme.read(ThriftHiveMetastore.java:59555)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$drop_database_req_result.read(ThriftHiveMetastore.java:59489)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_drop_database_req(ThriftHiveMetastore.java:1545)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.drop_database_req(ThriftHiveMetastore.java:1532)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabaseCascadePerDb(HiveMetaStoreClient.java:1870)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1794)
at org.apache.hadoop.hive.metastore.IMetaStoreClient.dropDatabase(IMetaStoreClient.java:1870)
at org.apache.hadoop.hive.metastore.TestPartitionManagement.tearDown(TestPartitionManagement.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)....

Reproduce the test failure

Run the test with the NonDex Maven plugin, which detects and helps debug wrong assumptions about under-determined Java APIs. The command to reproduce the flaky failure is:

mvn -pl standalone-metastore/metastore-server edu.illinois:nondex-maven-plugin:2.1.7:nondex -Dtest=org.apache.hadoop.hive.metastore.TestPartitionManagement#testPartitionDiscoveryTransactionalTable -DnondexRuns=10
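
For context, here is a generic sketch of the kind of assumption NonDex is designed to expose. It is not taken from the Hive code base and is not claimed to be the cause of this failure; the class and method names (`OrderAssumption`, `firstFragile`, `firstSorted`) are hypothetical. `HashSet` makes no guarantee about iteration order, so code that depends on "the first element" can behave differently from run to run under NonDex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Hive code.
class OrderAssumption {
    // Fragile: returns whichever element the set happens to iterate first.
    // HashSet does not specify this order, so the result may vary.
    static String firstFragile(Set<String> names) {
        return names.iterator().next();
    }

    // Robust: makes the ordering explicit before picking an element.
    static String firstSorted(Set<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        return sorted.get(0);
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(Arrays.asList("tbl_b", "tbl_a", "tbl_c"));
        System.out.println("fragile pick: " + firstFragile(names));
        System.out.println("sorted pick:  " + firstSorted(names));
    }
}
```

Only the sorted variant gives a result that is independent of the set's internal ordering.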

Does this PR introduce any user-facing change?

No, it just fixes the test.

Is the change a dependency upgrade?

No

Fix

Drop all tables in each database before dropping the database itself.
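
The ordering can be illustrated with a minimal in-memory sketch. This is not the metastore implementation: `TinyCatalog` and its methods are hypothetical stand-ins for the DBS/TBLS relationship, and the exception merely mimics what the foreign key check does. It shows that dropping a database is rejected while a table still references it, and succeeds once the tables are dropped first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the DBS/TBLS relationship; not Hive code.
class TinyCatalog {
    // Database name -> names of the tables that still reference it.
    private final Map<String, List<String>> tablesByDb = new HashMap<>();

    void createDatabase(String db) {
        tablesByDb.putIfAbsent(db, new ArrayList<>());
    }

    void createTable(String db, String table) {
        List<String> tables = tablesByDb.get(db);
        if (tables == null) {
            throw new IllegalArgumentException("unknown database: " + db);
        }
        tables.add(table);
    }

    void dropTable(String db, String table) {
        List<String> tables = tablesByDb.get(db);
        if (tables != null) {
            tables.remove(table);
        }
    }

    // Mimics the foreign key check: refuse while any table still points at the database.
    void dropDatabase(String db) {
        List<String> tables = tablesByDb.get(db);
        if (tables == null) {
            return;
        }
        if (!tables.isEmpty()) {
            throw new IllegalStateException("database " + db + " is still referenced by " + tables);
        }
        tablesByDb.remove(db);
    }

    // Both getters return copies, so callers can drop entries while iterating.
    List<String> getAllDatabases() {
        return new ArrayList<>(tablesByDb.keySet());
    }

    List<String> getAllTables(String db) {
        List<String> tables = tablesByDb.get(db);
        return tables == null ? Collections.<String>emptyList() : new ArrayList<>(tables);
    }

    // Tear-down in the order proposed by the fix: tables first, then the database.
    static void tearDown(TinyCatalog catalog) {
        for (String db : catalog.getAllDatabases()) {
            for (String table : catalog.getAllTables(db)) {
                catalog.dropTable(db, table);
            }
            catalog.dropDatabase(db);
        }
    }

    public static void main(String[] args) {
        TinyCatalog catalog = new TinyCatalog();
        catalog.createDatabase("db1");
        catalog.createTable("db1", "t1");
        try {
            catalog.dropDatabase("db1"); // rejected: t1 still references db1
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        tearDown(catalog); // succeeds: tables are dropped before the database
        System.out.println("remaining databases: " + catalog.getAllDatabases());
    }
}
```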

How was this patch tested?

After adding the code that drops the tables, NonDex was run again to confirm that the test passes on every run.

@@ -91,13 +91,19 @@ public void tearDown() throws Exception {
    // First drop any databases in catalog
    List<String> databases = client.getAllDatabases(catName);
    for (String db : databases) {
      for (String table : client.getAllTables(catName, db)) {
        client.dropTable(catName, db, table, true, true);
      }
      client.dropDatabase(catName, db, true, false, true);

Author

Extremely sorry for the late reply, and thanks for your patience. I checked and found no issue with cascade, at least in the Java code. As you pointed out, it is quite possible that the non-determinism comes from the C scripts related to cascade. I am still checking on that.
