-
Hi, I discovered this project a few days ago, and I've been playing with it in my spare time, but I encountered what I think might be a bug. I write "think" because I'm pretty new to this field, so I might be doing something obviously wrong (in which case I apologize in advance for wasting your time :p)

Here is a really simple test:

#include <iostream>
#include <string>
#include <tao/pegtl.hpp>

const std::string tests [] = {
    "1.2", ".2", "1.", "1",
    "1.2e-10", ".2e-10", "1.e-10", "1e-10",
    "1.2e+10", ".2e+10", "1.e+10", "1e+10",
    "1.2e10", ".2e10", "1.e10", "1e10",
    "[1.2]", "[1.]", "[.2]", "[1]"
};

namespace grammar
{
    using namespace tao::pegtl;
    using namespace tao::pegtl::ascii;

    // exponent : [Ee][+-]?\d+
    struct _expo : seq< one< 'e', 'E' >, opt< one< '+', '-' > >, plus< digit > > {};

    // number : \d+{_expo}? | \d*\.\d+{_expo}? | \d+\.\d*{_expo}?
    struct _real : sor<
        seq< plus< digit >, opt< _expo > >,
        seq< star< digit >, one< '.' >, plus< digit >, opt< _expo > >,
        seq< plus< digit >, one< '.' >, star< digit >, opt< _expo > >
    > {};

    // test : {_real} | [{_real}]
    struct _expression : sor<
        _real,
        seq< one< '[' >, _real, one< ']' > >
    > {};
}

int main(int argc, char ** argv)
{
    using tao::pegtl::parse;
    using tao::pegtl::string_input;

    // Each test string is passed both as the data to parse and as the source name.
    for (const auto & test : tests)
    {
        bool result = parse< grammar::_expression >(string_input< >(test, test));
        std::cout << test.c_str() << " : " << (result ? "ok" : "not matched") << std::endl;
    }
    return 0;
}

This is simplified, but basically it should match either a number or a number enclosed in square brackets. All the first tests are matched correctly (the numbers without brackets), but it fails to match numbers inside square brackets in some cases. Here is the output:
I tried using the |
Replies: 11 comments
-
There are two problems: you don't have a grammar that matches the whole input, and you have the wrong order of choices in _real. To solve it, use
to make sure you always match the complete input and use:
to always get the longer matches first. Note that this is one of the key differences between a PEG and a CFG. Feel free to re-open this issue if this doesn't solve your problem.
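A hand-written sketch of why the order matters. It does not use PEGTL: the functions below are hypothetical stand-ins that mimic PEG ordered choice (the first alternative that matches wins, and the remaining ones are never tried) on the three forms of _real. Order::AsPosted uses the order from the question; Order::LongestFirst is one possible reordering that puts the short integer form last, and may not be the exact order given in this reply. match_all adds the end-of-input check that an eof rule would provide.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Every matcher advances `pos` on success and leaves it untouched on failure.

// \d+ when at_least_one is true, \d* otherwise.
bool digits(const std::string& s, std::size_t& pos, bool at_least_one) {
    std::size_t p = pos;
    while (p < s.size() && std::isdigit(static_cast<unsigned char>(s[p]))) ++p;
    if (at_least_one && p == pos) return false;
    pos = p;
    return true;
}

bool literal(const std::string& s, std::size_t& pos, char c) {
    if (pos < s.size() && s[pos] == c) { ++pos; return true; }
    return false;
}

// Optional exponent: ([eE][+-]?\d+)?  Consumes nothing if it is incomplete.
void opt_expo(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    if (p < s.size() && (s[p] == 'e' || s[p] == 'E')) {
        ++p;
        if (p < s.size() && (s[p] == '+' || s[p] == '-')) ++p;
        if (digits(s, p, true)) pos = p;
    }
}

// \d+{expo}?
bool integer_form(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    if (!digits(s, p, true)) return false;
    opt_expo(s, p);
    pos = p;
    return true;
}

// \d*\.\d+{expo}?
bool leading_dot_form(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    digits(s, p, false);
    if (!literal(s, p, '.') || !digits(s, p, true)) return false;
    opt_expo(s, p);
    pos = p;
    return true;
}

// \d+\.\d*{expo}?
bool trailing_dot_form(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    if (!digits(s, p, true) || !literal(s, p, '.')) return false;
    digits(s, p, false);
    opt_expo(s, p);
    pos = p;
    return true;
}

enum class Order { AsPosted, LongestFirst };

// Ordered choice: || stops at the first alternative that succeeds.
bool real(const std::string& s, std::size_t& pos, Order order) {
    if (order == Order::AsPosted)
        return integer_form(s, pos) || leading_dot_form(s, pos) || trailing_dot_form(s, pos);
    return leading_dot_form(s, pos) || trailing_dot_form(s, pos) || integer_form(s, pos);
}

// real | '[' real ']'
bool expression(const std::string& s, std::size_t& pos, Order order) {
    if (real(s, pos, order)) return true;
    std::size_t p = pos;
    if (literal(s, p, '[') && real(s, p, order) && literal(s, p, ']')) {
        pos = p;
        return true;
    }
    return false;
}

// Succeeds if some prefix of the input matches (no end-of-input check).
bool match_prefix(const std::string& s, Order order) {
    std::size_t pos = 0;
    return expression(s, pos, order);
}

// Succeeds only if the whole input is consumed (the role of an eof rule).
bool match_all(const std::string& s, Order order) {
    std::size_t pos = 0;
    const bool ok = expression(s, pos, order);
    assert(pos <= s.size());
    return ok && pos == s.size();
}
```

With Order::AsPosted, match_prefix accepts "1.2" after consuming only the 1, match_all rejects it, and "[1.2]" fails either way because ] is expected where the . is. With Order::LongestFirst, match_all accepts all four bracketed inputs from the question.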
-
Hi @d-frey, thanks a lot for taking the time to answer, that was fast! It solves my problem (the order of the […]). While I'm at it, is there any place in the documentation where it's made clear what exactly the roles of, and differences between, the 2 constructor parameters of string_input are?
-
Indeed there is. Inputs-And-Parsing in the doc folder explains everything. The first paragraph of that page explains "source" for all of the input methods. To paraphrase: "data" is the input you are parsing, "source" is where that input comes from. For a file parser, for example, "source" would be the name of the file.
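To make the distinction concrete, here is a small self-contained sketch. It does not use PEGTL: labeled_input and describe_position are hypothetical names, and the source:line:column format is an assumption chosen for illustration. The point is only that "data" is what gets parsed, while "source" is a label that shows up when a position in that data is reported.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a parser input (not a PEGTL type).
struct labeled_input {
    std::string data;    // the text that is actually parsed
    std::string source;  // where that text came from, used only as a label
};

// Builds "source:line:column" (both 1-based) for a byte offset into the data.
std::string describe_position(const labeled_input& in, std::size_t offset) {
    assert(offset <= in.data.size());
    std::size_t line = 1;
    std::size_t column = 1;
    for (std::size_t i = 0; i < offset; ++i) {
        if (in.data[i] == '\n') {
            ++line;
            column = 1;
        } else {
            ++column;
        }
    }
    return in.source + ":" + std::to_string(line) + ":" + std::to_string(column);
}
```

In the test program from the question, string_input< >(test, test) passes the same string for both roles, so the text being parsed doubles as its own label.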
-
Yes, |
-
Haha ok, we're just not aligned on the definition of 'clear' :) I had read this page (multiple times even) but the construction of […]. Anyway, thanks for the explanations!
-
Hi, |
-
Hello, I don't want to create a new issue, since I had exactly the same question as dcourtois and you solved it. I was not able to understand what I had to put for data in the constructor of string_input. With your explanation, it is clear. In the documentation, I think you should rephrase
to something like
I just have one additional remark. When you said
What do you mean by that? If I try to create a
-
Thanks, I'll reopen this issue as a reminder to look at the documentation again.
-
The defaulted part is not correct (anymore).
-
The role of the |
-
In my opinion, it is still really unclear (based on the documentation, not this discussion) what you should pass if your input comes from memory or a string. You may consider adding some examples.
There are two problems: you don't have a grammar that matches the whole input, and you have the wrong order of choices in _real. The second problem comes from ordered choice: sor<A,B> does not even try to match B when it is able to match A. In your case, seq< plus< digit >, opt< _expo > > matches the 1 of the input 1.2, so it doesn't even look at the other choices. Since your grammar does not require eof at the end, this does match, but not the whole input. When you have [1.2] as an input, it matches [, then matches 1 for _real and then expects ], but gets . instead, hence it fails on this input. To solve it, use
to mak…