Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unset eprint_status causes stack dump #335

Open
Pfiffikus opened this issue Jun 23, 2023 · 7 comments
Open

unset eprint_status causes stack dump #335

Pfiffikus opened this issue Jun 23, 2023 · 7 comments
Assignees
Labels
bug Something isn't working
Milestone

Comments

@Pfiffikus
Copy link
Contributor

If I try to export an (maybe erroneous) eprint obviously without a valid eprint_status I end up for 3.4.4 with

`------------------------------------------------------------------
----------- EPrints System Error 2023-06-23T16:28:48Z ------------

eprint_status not set

EPrints System Error inducing stack dump
at /usr/share/eprints3/bin/../perl_lib/EPrints.pm line 163.
EPrints::abort() called at /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/EPrint.pm line 662
EPrints::DataObj::EPrint::get_dataset(EPrints::DataObj::EPrint=HASH(0x5633c570a788)) called at /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj.pm line 1912
EPrints::DataObj::uri(EPrints::DataObj::EPrint=HASH(0x5633c570a788)) called at /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj.pm line 2108
EPrints::DataObj::to_sax(EPrints::DataObj::EPrint=HASH(0x5633c570a788), "Handler", EPrints::XML::SAX::PrettyPrint=HASH(0x5633c05ee6f0), "fh", "*main::STDOUT", "list", EPrints::List=HASH(0x5633bf8f56c8), "hide_volatile", ...) called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Export/XML.pm line 99
EPrints::Plugin::Export::XML::output_dataobj(EPrints::Plugin::Export::XML=HASH(0x5633c5730ae0), EPrints::DataObj::EPrint=HASH(0x5633c570a788), "hide_volatile", 1, "list", EPrints::List=HASH(0x5633bf8f56c8), "fh", "*main::STDOUT", ...) called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Export/XML.pm line 72
EPrints::Plugin::Export::XML::ANON(EPrints::Repository=HASH(0x5633c0ad7c30), EPrints::DataSet=HASH(0x5633c1336458), EPrints::DataObj::EPrint=HASH(0x5633c570a788), undef) called at /usr/share/eprints3/bin/../perl_lib/EPrints/List.pm line 667
EPrints::List::map(EPrints::List=HASH(0x5633bf8f56c8), CODE(0x5633c56f0c80)) called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Export/XML.pm line 73
EPrints::Plugin::Export::XML::output_list(EPrints::Plugin::Export::XML=HASH(0x5633c5730ae0), "list", EPrints::List=HASH(0x5633bf8f56c8), "fh", "*main::STDOUT", "hide_volatile", 1) called at /home/eprints/Eprints/bin/export line 286`

Imho it would be helpful to get nevertheless the requested export.

@drn05r
Copy link
Contributor

drn05r commented Jul 13, 2023

I am not sure how you can get an eprint without an eprint_status set. Unless you have observed otherwise, I don't think this is a recently created issue, as eprint_status is critical to eprint records as it is required to determine which virtual dataset (e.g. inbox, buffer, archive or deletion) at eprint record also belongs.

That said, it is not unreasonable to ensure safeguards where unexpected things like this happen. I have noted in the past that a blank eprint record is added to the database initially even before default values have been added. I think how this works needs to be reviewed so this is better managed as a transaction, to eliminate the possibility of a record being initially created without an eprint_status. Beyond this, if someone writes some bespoke code (for their own repository) that deliberately unsets an eprint_status, that is on them.

@drn05r drn05r self-assigned this Jul 13, 2023
@drn05r drn05r added the bug Something isn't working label Jul 13, 2023
@drn05r drn05r added this to the 3.4.6 milestone Jul 13, 2023
@drn05r drn05r modified the milestones: 3.4.6, 3.4.x Jul 5, 2024
@drn05r
Copy link
Contributor

drn05r commented Jul 5, 2024

It has been difficult to reproduce circumstances where this happens. I think a more wider consideration is required in how to ensure records do not end of in what should be impossible states like this. That, I think, is not realistically possible for the next release (3.4.6)

@drn05r drn05r modified the milestones: 3.4.x, 3.4.7 Aug 24, 2024
@Ainmhidh
Copy link

Is it something that could be handled in eprint_fields_automatic? [untested code follows]
If( !defined( $eprint->get_value( "eprint_status" ) || $eprint->get_value( "eprint_status" ) eq "" ){ $eprint->set_value("eprint_status", "buffer" ); # or inbox or wherever makes sense? }

Not sure that feels entirely comfortable though.

@drn05r
Copy link
Contributor

drn05r commented Dec 17, 2024

As I understand the problem, EPrints has a rather antiquated way of adding new records. It initially creates a record with no values set except the eprintid and anything that has a default value according to the database. It they applies any values in eprint_fields_default. The issue seems to occur either because something causes the the process to fail before applying eprint_fields_default or during it. Therefore the most basic things like an eprint_status do not get set.

The empty records are probably not an issue in themselves, as they never get used but they cause an issue when they are accessed as part of a list. Also, this issue makes it near on if not impossible to delete these empty records.

Probably the best solution is to have something that detect an eprint record does not have a status at the point this error would be thrown and set it to a sensible default. However, I suspect there are other fields that also need to be set where a sensible default are less apparent, such as userid. However, simulating the issue is difficult and this is essential in determining whether the fix is a viable solution to the problem at hand.

@drn05r
Copy link
Contributor

drn05r commented Dec 17, 2024

@jesusbagpuss @Ainmhidh This is not a pretty fix but I think it should do the job. I have added this block to EPrint::DataObj::EPrint->get_dataset:

        # Tries to sufficiently fix empty records that were not correctly created.
        if ( ! $self->get_value( 'eprint_status' ) || ! $self->get_value( 'dir' ) )
        {
                my $data = {};
                foreach my $field ( get_system_field_info() )
                {
                        $data->{$field->{name}} = $field->{default_value} if defined $field->{default_value};
                }
                $data->{eprint_status} = 'inbox' unless $self->{data}->{eprint_status} || $data->{eprint_status};
                $data->{dir} =  $self->{session}->get_store_dir . "/" . eprintid_to_path( $self->id ) unless $self->{data}->{dir};
                foreach my $field ( keys %$data )
                {
                        $self->set_value( $field, $data->{$field}  );
                }
                if ( $self->{session}->get_database->update( $self->{dataset}, $self->{data}, $data ) )
                {
                        $status = $self->get_value( 'eprint_status' );
                }
        }

I don't think a trigger could ever work as the abort happens to early in the process.

@Ainmhidh
Copy link

I see - I misunderstood the underlying basis for the issue. Your solution is definitely more all-encompassing.

@drn05r
Copy link
Contributor

drn05r commented Dec 19, 2024

For information. What I did to replicate this problem was temporarily add a die between the insert and the update of EPrints::Database->add_record:

# atomically grab the slot in the table (key must be PRIMARY KEY!)
{
local $self->{dbh}->{PrintError};
local $self->{dbh}->{RaiseError};
if( !$self->insert( $table, [$keyname], [$id] ) )
{
Carp::carp( $DBI::errstr ) if !$self->duplicate_error;
return 0;
}
}
if( $dataset->ordered )
{
EPrints::Index::insert_ordervalues( $self->{session}, $dataset, {
$keyname => $id,
});
}
# Now add the ACTUAL data:
return $self->update( $dataset, $data, $data );

My guess would be that some part of the record update after the initial insert failed. This might have been due to the repository being under heavy load and the process being reaped before it had complicated or probably more likely something erroneous in the update that caused a terminal error from Perl or EPrints itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants