How to find parsing error with ParseKit framework



I was wondering whether there is a way to find out how far into an assembly a PKParser has parsed before encountering a syntax error.


I'm using a grammar that basically describes a prefix notation expression language.

For example:

Given a standard prefix notation expression grammar and the string "(+ a - b c))", I'd like to find out that [(, +, a] were matched, so I can give the user some idea of where to look to fix their error. However, -completeMatchFor: and -bestMatchFor: don't return anything I can use to find this information.

Ideally I'd like to say that a '(' was expected, but it's not necessary for a grammar as simple as what I'm using.

From the book mentioned as the user manual, it seemed as if I would need to create a custom parser for this, but I was hoping that maybe I'd simply missed something in the framework.


2012-04-03 23:24
by user1311583


Developer of ParseKit here.

There are two features in ParseKit which can be used to help provide user-readable hints describing parse errors encountered in input.

  1. -[PKParser bestMatchFor:]
  2. The PKTrack class

It sounds like you're aware of the -bestMatchFor: method even if it's not doing what you expect in this case.

I think the PKTrack class will be more helpful here. As described in Metsker's book, PKTrack is exactly like PKSequence except that all of its subparsers are required: if any of them fails to match, an exception is thrown with a helpful error message.
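To make the Sequence/Track difference concrete, here is a minimal Foundation-only sketch. It does not use ParseKit at all: MatchInOrder and the exception name SketchTrackException are hypothetical names made up for illustration, and the rule that a mismatch on the very first element fails quietly is an assumption based on Metsker's description of a Track as a sequence that raises only if it begins but does not complete.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, NOT part of ParseKit: matches the literals in
// `expected` against `tokens` in order and returns YES on a full match.
// As a plain sequence, any mismatch simply returns NO.
// As a track, a mismatch after the first element has matched raises
// an NSException describing what was consumed, expected, and found.
static BOOL MatchInOrder(NSArray *expected, NSArray *tokens, BOOL asTrack) {
    for (NSUInteger i = 0; i < [expected count]; i++) {
        NSString *want  = [expected objectAtIndex:i];
        NSString *found = (i < [tokens count]) ? [tokens objectAtIndex:i] : nil;
        if (found != nil && [found isEqualToString:want]) {
            continue;
        }
        if (asTrack && i > 0) {
            // Everything before index i matched, so this range is valid.
            NSArray *consumed = [tokens subarrayWithRange:NSMakeRange(0, i)];
            [NSException raise:@"SketchTrackException"
                        format:@"After : %@\nExpected : %@\nFound : %@",
                               [consumed componentsJoinedByString:@" "],
                               want,
                               found ? found : @"-nothing-"];
        }
        return NO;
    }
    return YES;
}

int main(void) {
    @autoreleasepool {
        NSArray *expected = [NSArray arrayWithObjects:@"(", @"+", @")", nil];
        NSArray *tokens   = [NSArray arrayWithObjects:@"(", @"+", @"-", nil];

        // Sequence behaviour: a quiet failure with no detail.
        NSLog(@"sequence matched: %d", MatchInOrder(expected, tokens, NO));

        // Track behaviour: the failure carries a description.
        @try {
            MatchInOrder(expected, tokens, YES);
        } @catch (NSException *e) {
            NSLog(@"track error:\n%@", [e reason]);
        }
    }
    return 0;
}
```

The trade-off is the same one that applies to the real classes: the track variant gives a useful message, but every caller has to be prepared for an exception.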

So here's a grammar for your example input:

@start         = '(' expr ')' | expr;
expr           = ('+' | '-') term term;
term           = '(' expr ')' | Word;

Any elements listed contiguously within a production (for example, '(' expr ')') form a Sequence, but they could instead form a Track.

The benefit of changing these Sequences to be Tracks is that an NSException will be thrown with a human-readable parse error message if the input doesn't match. The downside is that you must now wrap all usages of your factory-generated parser in a try/catch block to catch these Track exceptions.

The problem currently (or before now, at least) is that the PKParserFactory never produced a parser using Tracks. Instead, it would always use Sequences.

So I've just added a new option in head of trunk at Google Code (you'll need to update).

#define USE_TRACK 0



It's 0 by default. If you change this define to 1, Tracks will be used instead of Sequences. So given the grammar above and invalid input like this:

(+ a - b c))

and this client code:

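// The grammar shown above, as a single string literal.
NSString *g = @"@start = '(' expr ')' | expr;"
              @"expr = ('+' | '-') term term;"
              @"term = '(' expr ')' | Word;";
PKParser *p = [[PKParserFactory factory] parserFromGrammar:g assembler:self];
NSString *s = @"(+ a - b c))";

// With USE_TRACK set to 1, a Track that fails to match raises an NSException.
@try {
    PKAssembly *res = [p parse:s];
    NSLog(@"res %@", res);
} @catch (NSException *exception) {
    NSLog(@"Parse Error:%@", exception);
}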

you will get a nice-ish human-readable error:

Parse Error:

After : ( + a
Expected : Alternation (term)
Found : -

Hope that helps.

2012-04-04 05:51
by Todd Ditchendorf


I'm wrestling with this issue too. In order for -bestMatchFor: to be useful in identifying error conditions, there should be methods in PKAssembly's public interface indicating if there are more tokens/characters to be parsed. -completeMatchFor: is able to determine error state because it has access to the private -hasMore method. Perhaps PKAssembly's -hasMore method should be public.

I looked at PKTrack but since I want to handle errors programmatically, it wasn't useful to me.

My conclusion is I either write my own custom Track parser or I alter the framework and expose -hasMore. Are there other ways to handle errors?

Until I figure out a better way to detect errors, I've added the following to the file containing the implementation of my custom parser:

// Redeclare PKAssembly's private methods so this file can call them.
@interface PKAssembly ()
- (BOOL)hasMore;
- (id)peek;
@end

@implementation PMParser

In my parse method:

PKAssembly*     a     = [PKTokenAssembly assemblyWithString:s];
PKAssembly*     best  = [self bestMatchFor:a];
PMParseNode*    node  = nil;
BOOL            error = NO;
NSUInteger      errorOffset = 0;

if (best == nil) {  // Nothing recognized at all
    error = YES;
} else {
    if ([best hasMore]) {  // Partial recognition: unconsumed tokens remain
        PKToken*    t = [best peek];

        error       = YES;
        errorOffset = t.offset;  // Offset of the first unrecognized token
    }
    node = [best pop];
}

If an error occurred, errorOffset will contain the location of the unrecognized token.
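To show the user where to look, an offset like this can be turned into a line and column. The following is only a sketch using Foundation: LineAndColumnForOffset is a hypothetical helper, and it assumes the token offset is a zero-based character index into the original input string, which I have not verified against ParseKit.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, NOT part of ParseKit: converts a zero-based
// character offset into a 1-based line and column. Offsets past the
// end of the string are clamped to the string's length.
static void LineAndColumnForOffset(NSString *s, NSUInteger offset,
                                   NSUInteger *outLine, NSUInteger *outColumn) {
    NSUInteger line = 1;
    NSUInteger lineStart = 0;
    NSUInteger limit = MIN(offset, [s length]);
    for (NSUInteger i = 0; i < limit; i++) {
        if ([s characterAtIndex:i] == '\n') {
            line++;
            lineStart = i + 1;  // The next line starts after the newline
        }
    }
    *outLine = line;
    *outColumn = limit - lineStart + 1;
}

int main(void) {
    @autoreleasepool {
        NSString *input = @"(+ a\n- b c))";
        NSUInteger line = 0, column = 0;

        // Offset 5 is the '-' that starts the second line.
        LineAndColumnForOffset(input, 5, &line, &column);
        NSLog(@"error at line %lu, column %lu",
              (unsigned long)line, (unsigned long)column);  // line 2, column 1
    }
    return 0;
}
```

This only counts "\n" as a line break; input with other line endings would need extra handling.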

2012-07-07 13:23
by Edward