Improve Gromacs input parameter parsing further #129

Open
JFRudzinski opened this issue Sep 13, 2024 · 0 comments

In #122 we already made some serious improvements to str_to_input_parameters(), but some issues remain:

  1. The overall approach can probably be improved:

@ladinesa suggested:

import re

re_section = re.compile(r'(?P<indent> +?)(?P<key>\S+?) +\(*(?P<index>\d*)\)*\:')
re_value = re.compile(r'(?P<indent> +?)(?P<key>\S+?) *[:=]+ *(?P<value>[^\n]+)')

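# a single number: optional sign, digits with optional fraction, optional exponent
_num = r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[Ee][-+]?\d+)?'

# tried in order; each pattern must match the whole (stripped) value
converters = [
    # integers, e.g. "-5"
    (re.compile(r'([-+]?\d+)'), lambda x: int(x.group(1))),
    # booleans, case-insensitive (the group is needed for x.group(1))
    (re.compile(r'(?i)(true|false)'), lambda x: x.group(1).lower() == 'true'),
    # floats, e.g. "0.002" or "1e-05"
    (re.compile(rf'({_num})'), lambda x: float(x.group(1))),
    # ranges written as "{0,...,5}"; note that range() excludes the upper bound
    (re.compile(r'\{(\d+),\.\.\.,(\d+)\}'), lambda x: list(range(int(x.group(1)), int(x.group(2))))),
    # comma-separated numbers, optionally wrapped in braces
    (re.compile(rf'\{{?\s*({_num}(?:\s*,\s*{_num})+)\s*\}}?'), lambda x: [float(v) for v in x.group(1).split(',')]),
]

def convert_value(value):
    value = value.strip()
    for pattern, converter in converters:
        # fullmatch, so that e.g. "1.5" is not truncated to the integer 1
        match = pattern.fullmatch(value)
        if match:
            return converter(match)
    if value.lower() == 'not available':
        return None
    return value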

def parse_indented(block):
    root = {}
    stack = [(root, -1)]

    for line in block.splitlines():
        match = re_section.match(line)
        if not match:
            match = re_value.match(line)
        if not match:
            continue

        match_dct = match.groupdict()
        indent = len(match_dct.get('indent'))
        key = match_dct.get('key')
        value = match_dct.get('value')

        while stack and stack[-1][1] >= indent:
            stack.pop()

        parent = stack[-1][0]

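        child = convert_value(value) if value else {}
        if key in parent:
            # repeated keys are collected into a list
            if isinstance(parent[key], list):
                parent[key].append(child)
            else:
                parent[key] = [parent[key], child]
        else:
            parent[key] = child

        # only sections (dicts) can hold nested entries, so only they go on the stack
        if isinstance(child, dict):
            stack.append((child, indent))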
    return root

However, I have not yet been able to get this working completely.
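
A small sketch of why anchoring matters when converter regexes are tried in order. `re.match` anchors only at the start of the string, so an integer pattern accepts the leading digit of a float, whereas `re.fullmatch` requires the whole string to match. The helper names `as_int_prefix` and `as_int_full` are illustrative and not part of the parser; this is a possible contributing factor, not a confirmed diagnosis.

```python
import re

INT = re.compile(r'[-+]?\d+')

def as_int_prefix(text):
    """Convert using match(): anchored only at the start of the string."""
    m = INT.match(text)
    return int(m.group(0)) if m else None

def as_int_full(text):
    """Convert using fullmatch(): the whole string must be an integer."""
    m = INT.fullmatch(text)
    return int(m.group(0)) if m else None

print(as_int_prefix('0.002'))  # 0: the leading digit alone satisfies match()
print(as_int_full('0.002'))    # None: the value can fall through to a float converter
print(as_int_full('-5'))       # -5
```

With `fullmatch`, the order of the converters matters less, because a value is only claimed by a pattern that describes it completely.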

  2. There are some special cases, e.g., all-lambdas, where the indentation-based method fails because of the length of the keyword. These exceptions need special handling.
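
One way such exceptions could be handled, sketched under assumptions: keep a small table of keys whose measured indentation is unreliable and substitute a nominal indentation for them before the nesting logic runs. The table `INDENT_OVERRIDES`, the helper `effective_indent`, the key regex, and the override value 3 are all hypothetical and are not taken from the parser or from actual Gromacs output.

```python
import re

# Hypothetical table: key -> indentation to assume instead of the measured one.
INDENT_OVERRIDES = {'all-lambdas': 3}

# Leading spaces, then a key made of anything except whitespace, ':' or '='.
LINE = re.compile(r'(?P<indent> *)(?P<key>[^\s:=]+)')

def effective_indent(line, overrides=INDENT_OVERRIDES):
    """Return the indentation to use for nesting, or None for lines without a key."""
    m = LINE.match(line)
    if m is None:
        return None
    # Fall back to the measured indentation for keys without an override.
    return overrides.get(m.group('key'), len(m.group('indent')))

print(effective_indent('      all-lambdas = 1'))  # 3 (override applied)
print(effective_indent('      nsteps = 5'))       # 6 (measured)
```

Keeping the exceptions in a table rather than in the regexes means new special cases can be added without touching the general indentation logic.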

@JFRudzinski added and then removed the backlog label ("Issue not currently being worked on, and no plans for efforts in the near future") on Jan 3, 2025.