Enhancement - use ANTLR to build out the parser from grammar / lexer files #11

Open
8secz-johndpope opened this issue Jun 12, 2019 · 3 comments

Comments

@8secz-johndpope

To stabilise the parser, I suggest rebuilding some of the code to leverage the ANTLR grammar (.g4) files here:
https://github.com/psygate/smali-antlr4-grammar

If you download the ANTLR tool jar:
wget https://www.antlr.org/download/antlr-4.7.2-complete.jar

you can then generate the Go lexer and parser with

java -cp antlr-4.7.2-complete.jar org.antlr.v4.Tool -Dlanguage=Go -visitor -o gen SmaliLexer.g4
java -cp antlr-4.7.2-complete.jar org.antlr.v4.Tool -Dlanguage=Go -visitor -o gen SmaliParser.g4

This will generate the lexer / parser code in gen/ (screenshot of the generated files, 2019-06-12):

https://gist.github.com/8secz-johndpope/30868ccd59f211f0000b90e6176dead7

You should then be able to walk through the smali file with the generated parser, which may reduce the out-of-bounds crashes people (including myself) have been experiencing.

For illustration: I have successfully used grammar files to build out parsers / lexers for hundreds of languages with Swift
https://github.com/johndpope/ANTLR-Swift-Target
https://github.com/johndpope/Antlr-Swift-runtime

I forget the entry rule for a class; it changes for each grammar.

Here is the Swift code to read a Java file, which you can find in the above repo.

    let textFileName = "Test.java"

    if let textFilePath = Bundle.main.path(forResource: textFileName, ofType: nil) {
        // Lex the file into a token stream
        let lexer = Java8Lexer(ANTLRFileStream(textFilePath))
        print("lexer:", lexer)
        let tokens = CommonTokenStream(lexer)
        let parser = try Java8Parser(tokens)

        // compilationUnit is the entry rule of the Java8 grammar
        let tree = try parser.compilationUnit()
        print("tree:", tree)

        // Walk the parse tree, firing the listener callbacks
        let walker = ParseTreeWalker()
        let java8walker = Java8Walker()
        try walker.walk(java8walker, tree)
    } else {
        print("error occurred: cannot open \(textFileName)")
    }

The pseudocode for smali would be

    let textFilePath = "/path/Test.smali"

    let lexer = NewSmaliLexer(ANTLRFileStream(textFilePath)) // NewSmaliLexer exists in the generated code
    print("lexer:", lexer)
    let tokens = CommonTokenStream(lexer) // ?? there should be a method to do this
    let parser = try NewSmaliParser(tokens)

    let tree = try parser.compilationUnit() // placeholder: use the entry rule of the smali grammar; maybe ToStringTree?
    print("tree:", tree)

    let walker = ParseTreeWalker() // as the walker visits the tree, you can hook in to translate stuff
    let smaliListener = SmaliListener() // placeholder name for a smali-specific listener
    try walker.walk(smaliListener, tree)
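
To make the "hook in to translate stuff" idea a bit more concrete, here is a minimal, self-contained Go sketch of the enter/exit walker pattern. It deliberately does not use the ANTLR runtime or the generated smali parser: `Node`, `Listener`, `Walk`, `translator` and the rule names "class" and "method" are all hypothetical stand-ins for whatever the generated code actually provides, so treat it as an illustration of the shape rather than the real API.

```go
package main

import "fmt"

// Node is a toy stand-in for a parse-tree node (hypothetical, not an ANTLR type).
type Node struct {
	Rule     string // assumed rule name, e.g. "class" or "method"
	Text     string // name or source text attached to the node
	Children []*Node
}

// Listener gets a callback when the walker enters and leaves a node.
type Listener interface {
	Enter(n *Node)
	Exit(n *Node)
}

// Walk visits the tree depth-first: Enter, then the children, then Exit.
func Walk(l Listener, n *Node) {
	if n == nil {
		return
	}
	l.Enter(n)
	for _, c := range n.Children {
		Walk(l, c)
	}
	l.Exit(n)
}

// translator is a hypothetical listener that emits one output line
// for each class and method node and ignores everything else.
type translator struct {
	lines []string
}

func (t *translator) Enter(n *Node) {
	switch n.Rule {
	case "class":
		t.lines = append(t.lines, "type "+n.Text)
	case "method":
		t.lines = append(t.lines, "func "+n.Text)
	}
}

// Exit is unused here; a real translator might close blocks in it.
func (t *translator) Exit(n *Node) {}

// collect walks the tree with a fresh translator and returns its output.
func collect(root *Node) []string {
	t := &translator{}
	Walk(t, root)
	return t.lines
}

func main() {
	root := &Node{Rule: "class", Text: "Test", Children: []*Node{
		{Rule: "method", Text: "foo"},
		{Rule: "instruction", Text: "return-void"},
	}}
	for _, line := range collect(root) {
		fmt.Println(line)
	}
}
```

The point of the pattern is that the translation logic lives only in the listener callbacks, so unsupported instructions are skipped instead of being indexed into by hand.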
                

Other people have built translators with ANTLR this way:
https://github.com/8secz-johndpope/ObjcGrammar
You may need some help; when I have more time I will circle back.

@AlexeySoshin
Owner

You're right, that approach would be much better, as currently I support only a very limited number of instructions.
Will look into it.

@8secz-johndpope
Author

VS Code has smali syntax highlighting:
https://github.com/ViRb3/vscode-smali/tree/master/smali
Could this help?

If you surface any work in a new feature branch, I'm happy to take a look.

@AlexeySoshin
Owner

@8secz-johndpope Thanks for getting back with this issue :)
Took a look at it, but it's actually more confusing, since it's based on regexes.
Planning to make another branch for antlr this week, per your suggestions.
