Marc-XML-encoder: record-type written as controlfield not as attribute of record-field #402

Closed
TobiasNx opened this issue Oct 4, 2021 · 16 comments · Fixed by #405

@TobiasNx
Contributor

TobiasNx commented Oct 4, 2021

While reviewing #336, I saw that the optional type attribute of the record element is also wrongly output as a controlfield:

in:

<?xml version="1.0" encoding="UTF-8"?>
  <record xmlns="http://www.loc.gov/MARC21/slim" type="Bibliographic">
    <leader>00000pam a2200000 c 4500</leader>
    <controlfield tag="001">1196308691</controlfield>
    <controlfield tag="003">DE-101</controlfield>
    <controlfield tag="005">20210312081326.0</controlfield>
    <controlfield tag="007">tu</controlfield>
    <controlfield tag="008">191003s2019    gw ||||| |||| 00||||ger  </controlfield>
    ...

out:

<marc:record>
		<marc:controlfield tag="type">Bibliographic</marc:controlfield>
		<marc:leader>00000pam a2200000 c 4500</marc:leader>
		<marc:controlfield tag="001">1196308691</marc:controlfield>
		<marc:controlfield tag="003">DE-101</marc:controlfield>
		<marc:controlfield tag="005">20210312081326.0</marc:controlfield>
		<marc:controlfield tag="007">tu</marc:controlfield>
		<marc:controlfield tag="008">191003s2019    gw ||||| |||| 00||||ger</marc:controlfield>

Never mind the namespaces in this ticket.

@blackwinter
Member

blackwinter commented Oct 4, 2021

This was addressed by #394: xmlHandler.setAttributeMarker("~") or handle-*-xml(attributeMarker="~") should do the trick (see also #379).

@blackwinter
Member

Oh, I see, the MARC XML encoder doesn't know how to handle (marked) attributes. Is that it?

@dr0i
Member

dr0i commented Oct 4, 2021

Yep, good catch, that's it!

@TobiasNx
Contributor Author

TobiasNx commented Oct 4, 2021

> Oh, I see, the MARC XML encoder doesn't know how to handle (marked) attributes. Is that it?

type= is an optional attribute of record (see: https://www.loc.gov/standards/marcxml/xml/spy/spy.html#element_record_Link044CBAB8). The encoder emits it as a controlfield:

<record xmlns="http://www.loc.gov/MARC21/slim" type="Bibliographic">

=>

<marc:record>
<marc:controlfield tag="type">Bibliographic</marc:controlfield>

The attribute name type ends up as the tag of a controlfield, and the attribute value becomes the element content.

Using the attribute marker the output is:

	<marc:record>
		<marc:controlfield tag="~type">Bibliographic</marc:controlfield>

I think it should be:

<marc:record type="Bibliographic">

It works neither with nor without the attribute marker.
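
To make the symptom concrete, here is a toy model in Python. It is not Metafacture's code: it assumes the decoder hands the encoder a flat list of (name, value) pairs per record, and that the encoder turns every top-level pair other than the leader into a controlfield. The function `naive_encode` and the `events` list are made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def naive_encode(events):
    """Toy encoder (hypothetical): every top-level literal except the
    leader becomes a controlfield, including the record's type attribute."""
    lines = ["<marc:record>"]
    for name, value in events:
        if name == "leader":
            lines.append(f"<marc:leader>{escape(value)}</marc:leader>")
        else:
            lines.append(
                f"<marc:controlfield tag={quoteattr(name)}>"
                f"{escape(value)}</marc:controlfield>"
            )
    lines.append("</marc:record>")
    return "\n".join(lines)


# Assumed event order: the record attribute arrives first, as a plain literal.
events = [
    ("type", "Bibliographic"),
    ("leader", "00000pam a2200000 c 4500"),
    ("001", "1196308691"),
]

print(naive_encode(events))
# With an attribute marker the name merely changes; it is still a controlfield.
print(naive_encode([("~type", "Bibliographic")]))
```

In this model the marker only renames the literal, which matches the tag="~type" output shown above: the encoder has to interpret the name, otherwise nothing changes.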

@TobiasNx
Contributor Author

TobiasNx commented Oct 4, 2021

I have a testcase here:

https://github.com/TobiasNx/notWorkingFlux/tree/main/leader
It was originally intended for the leader problem, but since that is fixed now, it shows this problem as well:

Testflux: https://github.com/TobiasNx/notWorkingFlux/blob/main/leader/misplacedLeader.flux

Testfile: https://github.com/TobiasNx/notWorkingFlux/blob/main/leader/1196308691_marcxml.mrcx

@TobiasNx TobiasNx changed the title from "Marc-XML: record-type written as controlfield not as attribute of record-field" to "Marc-XML-encoder: record-type written as controlfield not as attribute of record-field" on Oct 4, 2021
@blackwinter
Member

Should be fixed by #404. Could you check?

@TobiasNx
Contributor Author

TobiasNx commented Oct 5, 2021

If the attributeMarker is set explicitly like this:

handle-marcxml(attributeMarker="~")|
morph(FLUX_DIR + "allNested.xml")|
encode-marcxml(attributeMarker="~")|

the result is:

<marc:record type="Bibliographic">

So it works when the option is set explicitly, but not otherwise.

Is there any way to avoid this explicit marking? All other MARCXML attributes, e.g. code= or ind1=:

		<marc:datafield tag="016" ind1="7" ind2=" ">
			<marc:subfield code="2">DE-101</marc:subfield>
			<marc:subfield code="a">1196308691</marc:subfield>
		</marc:datafield>

are recognized directly and encoded properly. In this respect the MARCXML decoder/encoder differs from a generic XML decoder/encoder, since the set of possible attributes is limited.
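
A minimal sketch of that idea in Python, under the same toy assumptions as a flat (name, value) event list: because the set of record-level attributes is small and known, the encoder can recognize them by name without any marker. `RECORD_ATTRIBUTES` and `encode_record` are hypothetical names, and this is not how #404 or #405 implement it.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumption: only the "type" attribute discussed in this issue is handled.
RECORD_ATTRIBUTES = {"type"}


def encode_record(events):
    """Toy encoder (hypothetical): known record attributes go onto the
    record start tag, everything else is written as a child element."""
    attributes = []
    children = []
    for name, value in events:
        if name in RECORD_ATTRIBUTES:
            attributes.append(f" {name}={quoteattr(value)}")
        elif name == "leader":
            children.append(f"<marc:leader>{escape(value)}</marc:leader>")
        else:
            children.append(
                f"<marc:controlfield tag={quoteattr(name)}>"
                f"{escape(value)}</marc:controlfield>"
            )
    start_tag = f"<marc:record{''.join(attributes)}>"
    return "\n".join([start_tag, *children, "</marc:record>"])


events = [
    ("type", "Bibliographic"),
    ("leader", "00000pam a2200000 c 4500"),
    ("001", "1196308691"),
]

print(encode_record(events))
```

The trade-off is that a literal that happens to be named type can no longer be emitted as a controlfield, which seems acceptable here since controlfield tags are normally numeric.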

Is this what you mean by dirty vs. clean solution, @blackwinter?

@blackwinter
Member

> Is this what you mean by dirty vs. clean solution, @blackwinter?

No, this is just for backward compatibility. Although, if we consider it a bugfix, we can justify changing the default behaviour. Thoughts?

@TobiasNx
Contributor Author

TobiasNx commented Oct 5, 2021

But then the new leader behaviour is not backwards compatible either, is it?

@blackwinter
Member

> But then the new leader behaviour is not backwards compatible either, is it?

Exactly. This was a bugfix.

@blackwinter
Member

blackwinter commented Oct 5, 2021

And to be clear: I'm sure we can make a case for this to be a bug, too. I was probably still too much in the mindset of the previous attribute work ;)

@blackwinter
Member

Proposing #405 as bugfix. WDYT?

@TobiasNx
Contributor Author

TobiasNx commented Oct 7, 2021

Nice. Bugfix seems fine. +1.


input:

  <record xmlns="http://www.loc.gov/MARC21/slim" type="Bibliographic">
    <leader>00000pam a2200000 c 4500</leader>
    <controlfield tag="001">1196308691</controlfield>

was:

	<marc:record>
		<marc:controlfield tag="type">Bibliographic</marc:controlfield>
		<marc:leader>00000pam a2200000 c 4500</marc:leader>

now:

	<marc:record type="Bibliographic">
		<marc:leader>00000pam a2200000 c 4500</marc:leader>

Also, don't be confused by the namespace references on the record element in the input. encode-marcxml always outputs collections, and the namespace declarations sit on the collection tag, e.g. <marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">. If choosing between record-level and collection-level output should become a feature, we should open another ticket.

Also, the namespace prefixes in the output are okay, since encode-marcxml always does it like this. See #403.
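
For anyone who wants to convince themselves that the two namespace styles are equivalent, here is a small check using Python's standard xml.etree.ElementTree. The two XML strings are shortened versions of the input and output above; the variable names are arbitrary, and the snippet does not involve Metafacture at all.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Input style: default namespace declared on the record itself.
record_level = (
    '<record xmlns="http://www.loc.gov/MARC21/slim" type="Bibliographic">'
    "<leader>00000pam a2200000 c 4500</leader>"
    "</record>"
)

# Output style: prefixed namespace declared once on the collection.
collection_level = (
    '<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">'
    '<marc:record type="Bibliographic">'
    "<marc:leader>00000pam a2200000 c 4500</marc:leader>"
    "</marc:record>"
    "</marc:collection>"
)

input_record = ET.fromstring(record_level)
output_record = ET.fromstring(collection_level).find(f"{{{MARC_NS}}}record")

# A namespace-aware parser sees the same qualified element name and the
# same unqualified type attribute in both serializations.
print(input_record.tag == output_record.tag)
print(input_record.get("type"), output_record.get("type"))
```

So a namespace-aware consumer should not care whether the declaration sits on the record or on the collection, nor which prefix is used.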

@TobiasNx TobiasNx assigned dr0i and unassigned TobiasNx Oct 7, 2021
@dr0i
Member

dr0i commented Oct 7, 2021

So I think we agree on having this fixed :)
Closing.

@dr0i dr0i closed this as completed Oct 7, 2021
@blackwinter
Member

> So I think we agree on having this fixed :)

But the pull request isn't merged, yet?

@dr0i
Member

dr0i commented Oct 7, 2021

Sorry, thought I had, but forgot to push. Done now.
