Not all of the non-English text is German #11

Open
Conal-Tuohy opened this issue Dec 2, 2016 · 9 comments

@Conal-Tuohy
Owner

e.g. http://vmcp.conaltuohy.com/xtf/view?docId=tei/1890-6/1891/91-00-00t.xml;chunk.id=main;toc.depth=1;toc.id=;brand=default is in French

@Conal-Tuohy Conal-Tuohy changed the title Not all of the non-German text is English Not all of the non-English text is German Dec 10, 2016
@Conal-Tuohy
Owner Author

What other languages are present in the corpus? Just French, German, and English?

@LucasHorseshoeBend
Collaborator

There is some Russian and some Spanish.
I have not looked at the corpus to try to find examples. I think there might also be some Italian, and possibly Portuguese, in certificates of election to learned societies in those countries.

Let me know if you need me to identify some examples.

@Conal-Tuohy
Owner Author

Thanks! An example of each would be handy

@Conal-Tuohy Conal-Tuohy self-assigned this Dec 15, 2016
@Conal-Tuohy
Owner Author

Great! These examples will be very helpful for automating the language classification.

By the way, if you come across any more, please add an additional comment. If this issue is closed in the meantime, you can re-open it or create a new one.
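As an illustration of what automating the language classification could look like, here is a minimal sketch in Python that guesses a language by counting common function words. It is not the approach used in this project. The `guess_language` function and the `STOPWORDS` table are hypothetical names, the word lists are tiny placeholders, and only four of the languages mentioned in this thread are covered (Russian, Hungarian, Italian and Portuguese are omitted).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword lists (an assumption for this
# sketch); a usable classifier would need much larger lists or a trained model.
STOPWORDS = {
    "English": {"the", "and", "of", "to", "in", "is", "that", "with"},
    "German": {"der", "die", "und", "das", "ist", "nicht", "mit", "von"},
    "French": {"le", "la", "les", "et", "est", "des", "une", "dans"},
    "Spanish": {"el", "los", "y", "es", "una", "del", "con", "por"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often, or None."""
    # Split into lowercase runs of letters (Unicode-aware, digits excluded).
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = Counter()
    for language, stopwords in STOPWORDS.items():
        scores[language] = sum(1 for word in words if word in stopwords)
    language, hits = scores.most_common(1)[0]
    # With no stopword hits at all there is no basis for a guess.
    return language if hits > 0 else None
```

A letter's transcribed text could be passed to `guess_language` and the result compared with the examples collected in this issue. Short texts and mixed-language letters would likely be misclassified by something this simple.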

@LucasHorseshoeBend
Collaborator

Another language:
I came across a letter in Hungarian from Mueller today. (It was probably originally in English, but we only have a translation in a Hungarian botany journal.) It has not yet been transcribed, but when it is, it will need to be detectable.

@LucasHorseshoeBend
Collaborator

Sorry, hit the wrong button!

@LucasHorseshoeBend
Collaborator

The letter in Hungarian is at
http://vmcp.conaltuohy.com/xtf/view?docId=tei/Mueller letters/1870-9/1879/79-03-25-final.xml

@Conal-Tuohy Conal-Tuohy pinned this issue Aug 23, 2022
@LucasHorseshoeBend
Collaborator

This is not urgent and would be a luxury refinement.
I suggested a way around this in yesterday's e-mail about things picked up in the design of the page layout; just removing the label "German" under Languages on the letter view page will do for now.
