Proposal: Markup for non-vernacular words #49
Comments
Commenting on my own suggestion, I realise that changing the font or hyphenation based on something that comes after the text is very hard in at least PTXprint. I don't know about other typesetting engines. Example:
Also, a ranged milestone would allow the entirety of a majority language introduction to be marked up.
|
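Assuming we allow adding category markup to paragraph and character markers, this could be implemented simply by putting a category \cat ro\cat* on a Paragraph or a Character span. It would not be pretty in Paratext but could be useful in typesetting and other publishing processes.
Could we add category information to the Paratext Style sheets? For example in custom.sty:
\marker p
\cat ro
\TextProperties paragraph publishable nonvernacular
\Italic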
|
The problem with this approach in a stylesheet is that you have, in effect,
multiple records with the same key. That is a significant change for the
tooling. It makes specifying the structure of stylesheets way more
complicated. PTXprint gets around this using a structured Marker that is
not valid USFM. See the technical manual for details.
I agree that a category value should be constrained to the normal id
characters of lowercase, digits, hyphen or underscore. And yes I can buy
into the value being a space separated list of category values.
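
The constraint described above could be checked with something like the following. This is only a sketch under the assumptions stated in this comment (ids made of lowercase letters, digits, hyphen or underscore; a value is a space-separated list of ids). The names `CATEGORY_ID` and `parse_categories` are made up for illustration and do not come from Paratext or PTXprint.

```python
import re

# Assumption: one category id is one or more lowercase ASCII letters,
# digits, hyphens or underscores, as suggested in the comment above.
CATEGORY_ID = re.compile(r"[a-z0-9_-]+")


def parse_categories(value: str) -> list[str]:
    """Split a space-separated category value and validate each id."""
    ids = value.split()
    if not ids:
        raise ValueError("category value is empty")
    for cat in ids:
        if not CATEGORY_ID.fullmatch(cat):
            raise ValueError(f"invalid category id: {cat!r}")
    return ids
```

For example, `parse_categories("ro intro_2")` would return the two ids as a list, while a value such as `"Ro"` would be rejected because of the uppercase letter.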
…On Mon, 19 Feb 2024, 19:16 Kent Spielmann, ***@***.***> wrote:
Assuming we allow adding category markup to paragraph and character
markers, this could be implemented simply by putting a category \cat
ro\cat* on a Paragraph or a Character span. It would not be pretty in
Paratext but could be useful in typesetting and other publishing processes.
Could we add category information to the Paratext Style sheets?
\marker p
\cat ro
\TextProperties paragraph publishable nonvernacular
\Italic
|
How about PTXprint (or whatever) processes a |
I agree that using My feeling, however, is that it would be a mistake to conflate |
Good point. Perhaps we should use The problem is that \wh and \wg are defined in terms of individual words in a wordlist rather than simply text in another language. Could we reappropriate \wh and \wg to simply be text in another language, marked as such, to stop the word analyser trying to allocate the text into the wrong place? Given that the word analyser can break strings into words, there is no reason that \wh and \wg (and so \wl) need to mark individual words separately. Either that or I am misinterpreting the standard. It would help to have some examples in the docs. |
A quick look through some projects (what's in the DBL) suggests that \wg and \wh are rarely if ever used. Might we then deprecate them in favour of \wl, or at least make them synonyms, so that \wg = \wl text|grc\wl* and \wh = \wl text|hbo\wl*? |
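
If \wg and \wh were treated as synonyms in this way, a migration could be little more than a text rewrite. The sketch below assumes the `\wl text|lang\wl*` form suggested in this comment, which is a proposal in this thread rather than settled syntax. The function name `to_wl` is hypothetical, and spans that already carry attributes (a `|` inside the span) are deliberately left untouched.

```python
import re

# Assumption: language codes taken from the comment above
# (grc for \wg, hbo for \wh).
LANG_FOR_MARKER = {"wg": "grc", "wh": "hbo"}

# Matches \wg text\wg* or \wh text\wh*, where text contains
# neither a backslash (nested marker) nor a pipe (attributes).
SPAN = re.compile(r"\\(wg|wh) ([^\\|]*?)\\\1\*")


def to_wl(usfm: str) -> str:
    """Rewrite simple \\wg and \\wh spans as \\wl spans with a language code."""
    def repl(match: re.Match) -> str:
        marker, text = match.group(1), match.group(2)
        return f"\\wl {text}|{LANG_FOR_MARKER[marker]}\\wl*"

    return SPAN.sub(repl, usfm)
```

Other markers, such as \w, do not match the pattern and pass through unchanged.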
[Moved here from old site]
While there is \tl, that is for transliterated words intended to be pronounceable in the vernacular orthography.
I would like to propose that there also be a \ol for "other language", not written in the vernacular orthography. I briefly considered calling it \wf (word foreign), but my use-case assumption is that at least some readers know the language, and may not consider it as foreign, but it's not the vernacular language of the publication.
It might be in the majority language of the region, a trade language, an international language, or that of a neighbouring area or group.
Summary
Description
Other language (non-vernacular) text, written in unaltered form, often one known and understood by at least a fraction of the target audience.
Notes
lang, which specifies the source language according to ISO 639-1 (2-letter codes) or ISO 639-3 (3-letter codes, historically known as Ethnologue codes).
\tl), it is instead given in a form that readers of the language find easiest to understand.
Syntax
\ol content\ol*
\ol content|lang="code"\ol*
<char style="ol" lang="code">content</char>
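
To show how the two USFM forms above might map onto the USX form, here is a minimal sketch. It assumes the proposed \ol marker exactly as written in this issue (it is not part of released USFM) and handles only these two forms, with no nested markers. The function name `ol_to_usx` is hypothetical, and text outside the \ol span is passed through without XML escaping.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Matches \ol content\ol*  or  \ol content|lang="code"\ol*
# Group 1 is the content, group 2 the optional language code.
OL_SPAN = re.compile(r'\\ol ([^\\|]*)(?:\|lang="([^"]*)"\s*)?\\ol\*')


def ol_to_usx(usfm: str) -> str:
    """Replace each proposed \\ol span with a USX-style <char> element."""
    def repl(match: re.Match) -> str:
        content, lang = match.group(1), match.group(2)
        # Emit the lang attribute only when a code was given.
        lang_attr = f" lang={quoteattr(lang)}" if lang else ""
        return f'<char style="ol"{lang_attr}>{escape(content)}</char>'

    return OL_SPAN.sub(repl, usfm)
```

A span without a `lang` attribute simply produces a `<char style="ol">` element with no language attribute.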
Style type
Character
Valid in
[Section] [Para] [Table] [List] [Footnotes]
Example