Publications

2015

  • Yannis Smaragdakis, George Balatsouras, George Kastrinis, Martin Bravenboer. More Sound Static Handling of Java Reflection. APLAS 2015.
  • Dung Nguyen, Molham Aref, Martin Bravenboer, George Kollias, Hung Q. Ngo, Christopher Re, Atri Rudra. Join Processing for Graph Patterns: An Old Dog with New Tricks. GRADES 2015.

2011

  • Yannis Smaragdakis, Martin Bravenboer, and Ondrej Lhotak. Pick Your Contexts Well: Understanding Object-sensitivity. POPL 2011.

2010

  • William Marczak, Shan Shan Huang, Martin Bravenboer, Micah Sherr, Boon Thau Loo, and Molham Aref. SecureBlox: Customizable Secure Distributed Data Processing. In Proceedings of the 35th SIGMOD International Conference on Management of Data (SIGMOD 2010), June 2010.

2009

  • Martin Bravenboer, Yannis Smaragdakis. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proceedings of the 24th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2009), October 2009. To appear.

    a.k.a. Doop

    We present the Doop framework for points-to analysis of Java programs. Doop builds on the idea of specifying pointer analysis algorithms declaratively, using Datalog: a logic-based language for defining (recursive) relations. We carry the declarative approach further than past work by describing the full end-to-end analysis in Datalog and optimizing aggressively using a novel technique that takes into account Datalog incremental evaluation.

    As a result, Doop achieves several benefits, including stunning (full order-of-magnitude) improvements in runtime. We compare Doop with Lhotak and Hendren's Paddle, which defines the state of the art for context-sensitive analyses. For the exact same logical points-to definitions (and, consequently, identical precision) Doop is more than 15x faster than Paddle for a 1-call-site sensitive analysis of the DaCapo benchmarks, with lower but still substantial speedups for other important analyses. Additionally, Doop scales to very precise analyses that are impossible with Paddle and Whaley et al.'s bddbddb, directly addressing open problems in past literature. Finally, our implementation is modular and can be easily configured to analyses with a wide range of characteristics, largely due to its declarativeness.
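
    To make the idea of declaratively specified points-to analysis concrete, the following is a minimal sketch, not Doop's actual rule set. It assumes four hypothetical input relations (alloc, move, store, load), is context-insensitive, and uses a naive fixpoint loop in place of the optimized, incremental Datalog evaluation the abstract describes. Each Datalog-style rule appears as a comment above the loop that evaluates it.

```python
def points_to(alloc, move, store, load):
    """Naive fixpoint over four Datalog-style rules (context-insensitive).

    alloc: {(var, heap)}          var = new ...   (heap names the allocation site)
    move:  {(to, frm)}            to = frm
    store: {(base, field, frm)}   base.field = frm
    load:  {(to, base, field)}    to = base.field
    """
    # VarPointsTo(v, h) :- Alloc(v, h).
    var_pts = set(alloc)
    # FieldPointsTo(base_heap, field, heap)
    fld_pts = set()

    changed = True
    while changed:
        new_var, new_fld = set(), set()

        # VarPointsTo(to, h) :- Move(to, frm), VarPointsTo(frm, h).
        for to, frm in move:
            for v, h in var_pts:
                if v == frm:
                    new_var.add((to, h))

        # FieldPointsTo(bh, f, h) :-
        #     Store(base, f, frm), VarPointsTo(base, bh), VarPointsTo(frm, h).
        for base, f, frm in store:
            for v1, bh in var_pts:
                if v1 != base:
                    continue
                for v2, h in var_pts:
                    if v2 == frm:
                        new_fld.add((bh, f, h))

        # VarPointsTo(to, h) :-
        #     Load(to, base, f), VarPointsTo(base, bh), FieldPointsTo(bh, f, h).
        for to, base, f in load:
            for v, bh in var_pts:
                if v != base:
                    continue
                for bh2, f2, h in fld_pts:
                    if bh2 == bh and f2 == f:
                        new_var.add((to, h))

        # Stop once an iteration derives no new facts.
        changed = not (new_var <= var_pts and new_fld <= fld_pts)
        var_pts |= new_var
        fld_pts |= new_fld

    return var_pts, fld_pts


# Hypothetical program: a = new A; b = new B; c = a; a.f = b; d = c.f
var_pts, fld_pts = points_to(
    alloc={("a", "h1"), ("b", "h2")},
    move={("c", "a")},
    store={("a", "f", "b")},
    load={("d", "c", "f")},
)
```

    In this example c aliases a, so the load through c.f finds the object stored through a.f, and d ends up pointing to h2.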

  • Martin Bravenboer, Yannis Smaragdakis. Exception Analysis and Points-to Analysis: Better Together. In International Symposium on Software Testing and Analysis (ISSTA 2009), July 2009. To appear.

    Exception analysis and points-to analysis are typically done in complete separation. Past algorithms for precise exception analysis (e.g., pairing throw clauses with catch statements) use pre-computed points-to information. Past points-to analyses either unsoundly ignore exceptions, or conservatively compute a crude approximation of exception throwing (e.g., considering an exception throw as an assignment to a global variable, accessible from any catch clause).

    We show that this separation results in significant slowdowns or vast imprecision. The two kinds of analyses are interdependent: neither can be performed accurately without the other. The interdependency leads us to propose a joint handling for performance and precision. We show that our exception analysis is expressible highly elegantly in a declarative form, and can apply to points-to analyses of varying precision. In fact, our specification of exception analysis is "fully precise", as it models closely the Java exception handling semantics. The necessary approximation is provided only through whichever abstractions are used for contexts and objects in the base points-to analysis.

    Our combined approach achieves similar precision relative to exceptions (exception-catch links) as the best past precise exception analysis, with a runtime of seconds instead of tens of minutes. At the same time, our analysis achieves much higher precision of points-to information (an average of half as many values for each reachable variable for most of the DaCapo benchmarks) than points-to analyses that treat exceptions conservatively, all at a fraction of the execution time.

  • Martin Bravenboer, Eelco Dolstra, and Eelco Visser. Preventing Injection Attacks with Syntax Embeddings. In Science of Computer Programming, 2009. To appear.

    a.k.a. StringBorg

    Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation, the concatenation of constants and client-supplied strings. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack.

    We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages (e.g. SQL) into that of the host language (e.g. Java) and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate. This approach is generic, meaning that it can be applied with relative ease to any combination of host and guest languages.
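
    The contrast between unhygienic string manipulation and sentences that are safe by construction can be illustrated with a much-simplified sketch. It does not embed a guest grammar as the paper does; it only assumes a hypothetical Sql wrapper type and a hypothetical sql_string helper that escapes client text using the standard SQL rule of doubling single quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sql:
    """A SQL sentence built only from trusted skeletons and escaped values."""
    text: str


def sql_string(value: str) -> Sql:
    """Render client-supplied text as a SQL string literal (quotes doubled)."""
    return Sql("'" + value.replace("'", "''") + "'")


def select_user(name: str) -> Sql:
    # The query skeleton is fixed; the client value can only fill a literal slot.
    return Sql("SELECT * FROM users WHERE name = " + sql_string(name).text)


def select_user_unhygienic(name: str) -> str:
    # Plain concatenation: the client value can change the sentence structure.
    return "SELECT * FROM users WHERE name = '" + name + "'"


attack = "x' OR '1'='1"
unsafe_query = select_user_unhygienic(attack)
safe_query = select_user(attack).text
```

    With the crafted input, the concatenated query gains an extra OR condition, while the escaped version keeps the whole input inside a single string literal.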

2008

  • Lennart Kats, Martin Bravenboer, and Eelco Visser. Mixing Source and Bytecode - A Case for Compilation by Normalization. In Proceedings of the 23rd ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2008), October 2008.

    Language extensions increase programmer productivity by providing concise, often domain-specific syntax, and support for static verification of correctness, security, and style constraints. Language extensions can often be realized through translation to the base language, supported by preprocessors and extensible compilers. However, various kinds of extensions require further adaptation of a base compiler's internal stages and components, for example to support separate compilation or to make use of low-level primitives of the platform (e.g., jump instructions or unbalanced synchronization). To allow for a more loosely coupled approach, we propose an open compiler model based on normalization steps from a high-level language to a subset of it, the core language. We developed such a compiler for a mixed Java and (core) bytecode language, and evaluate its effectiveness for composition mechanisms such as traits, as well as statement-level and expression-level language extensions.

  • Martin Bravenboer and Eelco Visser. Parse Table Composition - Separate Compilation and Binary Extensibility of Grammars. In Proceedings of the 1st International Conference on Software Language Engineering (SLE 2008), September 2008.

    Module systems, separate compilation, deployment of binary components, and dynamic linking have enjoyed wide acceptance in programming languages and systems. In contrast, the syntax of languages is usually defined in a non-modular way, cannot be compiled separately, cannot easily be combined with the syntax of other languages, and cannot be deployed as a component for later composition. Grammar formalisms that do support modules use whole program compilation.

    Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. A compound parser needs to be generated for every combination of extensions. The generation of parse tables is expensive, which is a particular problem when the composition configuration is not fixed to enable users to choose language extensions.

    In this paper we introduce an algorithm for parse table composition to support separate compilation of grammars to parse table components. Parse table components can be composed (linked) efficiently at runtime, i.e. just before parsing. While the worst-case time complexity of parse table composition is exponential (like the complexity of parse table generation itself), for realistic language combination scenarios involving grammars for real languages, our parse table composition algorithm is an order of magnitude faster than computation of the parse table for the combined grammars.

  • Martin Bravenboer and Eelco Visser. Designing Syntax Embeddings and Assimilations for Language Libraries. In Models in Software Engineering: Workshops and Symposia at MoDELS 2007, volume 5002 of LNCS, 2008.

    Language libraries extend regular libraries with domain-specific notation. More precisely, a language library is a combination of a domain-specific language embedded in the general-purpose host language, a regular library implementing the underlying functionality, and an assimilation transformation that maps embedded DSL fragments to host language code. While the basic architecture for realizing language libraries is the same for all applications, there are many design choices to be made in the design of a particular combination of library, guest language syntax, host language, and assimilation. In this paper, we present a systematic analysis of the design space for syntax embeddings and assimilations for the realization of language libraries. The contribution of this paper is an overview of the state-of-the-art providing insight in the design space and research questions in language library realization, in particular, the identification of research issues for realizing an independently extensible language library framework.

  • Martin Bravenboer, Karl Trygve Kalleberg, Rob Vermaas, and Eelco Visser. Stratego/XT 0.17. A Language and Toolset for Program Transformation. In Science of Computer Programming, June 2008

    Stratego/XT is a language and toolset for program transformation. The Stratego language provides rewrite rules for expressing basic transformations, programmable rewriting strategies for controlling the application of rules, concrete syntax for expressing the patterns of rules in the syntax of the object language, and dynamic rewrite rules for expressing context-sensitive transformations, thus supporting the development of transformation components at a high level of abstraction. The XT toolset offers a collection of flexible, reusable transformation components, and tools for generating such components from declarative specifications. Complete program transformation systems are composed from these components.

    This paper gives an overview of Stratego/XT 0.17, including a description of the Stratego language and XT transformation tools; a discussion of the implementation techniques and software engineering process; and a description of applications built with Stratego/XT.
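
    The separation between rewrite rules and the strategies that control their application can be sketched outside Stratego. The following is an illustration in Python, not Stratego syntax or its library: terms are modeled as tuples, a rule is a function that returns the rewritten term or None on failure, and choice and innermost are hypothetical names for simple strategy combinators.

```python
def plus_zero(t):
    # Rule: Plus(e, Int(0)) -> e
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "Plus" and t[2] == ("Int", 0):
        return t[1]
    return None  # rule does not apply


def times_one(t):
    # Rule: Times(e, Int(1)) -> e
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "Times" and t[2] == ("Int", 1):
        return t[1]
    return None


def choice(*rules):
    """Strategy: apply the first rule that succeeds, fail if none does."""
    def apply(t):
        for rule in rules:
            out = rule(t)
            if out is not None:
                return out
        return None
    return apply


def innermost(rule):
    """Strategy: normalize subterms first, then rewrite at the root until failure."""
    def apply(t):
        if isinstance(t, tuple):
            t = (t[0],) + tuple(apply(child) for child in t[1:])
        out = rule(t)
        return apply(out) if out is not None else t
    return apply


simplify = innermost(choice(plus_zero, times_one))

# Plus(Times(Var("x"), Int(1)), Int(0))
term = ("Plus", ("Times", ("Var", "x"), ("Int", 1)), ("Int", 0))
result = simplify(term)
```

    The rules say what to rewrite and the strategy says where and in what order; the same two rules could be reused under a different traversal without modification.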

  • Martin Bravenboer. Exercises in Free Syntax. Syntax Definition, Parsing, and Assimilation of Language Conglomerates. PhD thesis, Utrecht University, January 2008

    In modern software development the use of multiple software languages to constitute a single application is ubiquitous. Despite the omnipresent use of combinations of languages, the principles and techniques for using languages together are ad-hoc, unfriendly to programmers, and result in a poor level of integration. We work towards a principled and generic solution to language extension by studying the applicability of modular syntax definition, scannerless parsing, generalized parsing algorithms, and program transformations.

    We describe MetaBorg, a method for providing concrete syntax for domain abstractions to application programmers. Since object-oriented languages are designed for extensibility and reuse, the language constructs are often sufficient for expressing domain abstractions at the semantic level. However, they do not provide the right abstractions at the syntactic level. The MetaBorg method consists of embedding domain-specific languages in a general purpose host language and assimilating the embedded domain code into the surrounding host code. Instead of extending the implementation of the host language, the assimilation phase implements domain abstractions in terms of existing APIs leaving the host language undisturbed.

    We present a solution to injection vulnerabilities. Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack. We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages into that of the host language and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate.

    We study AspectJ as a typical example of a language conglomerate, i.e. a language composed of a number of separate languages with different syntactic styles. We show that the combination of the lexical syntax leads to considerable complexity in the lexical states to be processed. We show how scannerless parsing elegantly addresses this. We present the design of a modular, extensible, and formal definition of the lexical and context-free aspects of the AspectJ syntax. We introduce grammar mixins, which allows the declarative definition of keyword policies and combination of extensions.

    We introduce separate compilation of grammars to enable deployment of languages as plugins to a compiler. Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. A compound parser needs to be generated for every combination. We introduce an algorithm for parse table composition to support separate compilation of grammars to parse table components. Parse table components can be composed (linked) efficiently at runtime, i.e. just before parsing. For realistic language combination scenarios involving grammars for real languages, our parse table composition algorithm is an order of magnitude faster than computation of the parse table for the combined grammars, making online language composition feasible.

2007

  • Martin Bravenboer and Eelco Visser. Designing Syntax Embeddings and Assimilations for Language Libraries. In Proceedings of the 4th International Workshop on Software Language Engineering (ATEM 2007), October 2007. Selected as best paper.
  • Martin Bravenboer, Eelco Dolstra, and Eelco Visser. Preventing Injection Attacks with Syntax Embeddings. A Host and Guest Language Independent Approach. In Proceedings of the Sixth International Conference on Generative Programming and Component Engineering (GPCE 2007), October 2007

    Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation, the concatenation of constants and client-supplied strings. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack. We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages (e.g., SQL) into that of the host language (e.g., Java) and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate. This approach is generic, meaning that it can be applied with relative ease to any combination of host and guest languages.

  • Eric Bouwers, Martin Bravenboer, and Eelco Visser. Grammar Engineering Support for Precedence Rule Recovery and Compatibility Checking. In Proceedings of LDTA'07, Seventh Workshop on Language Descriptions, Tools and Applications at ETAPS'07, March 2007

    A wide range of parser generators are used to generate parsers for programming languages. The grammar formalisms that come with parser generators provide different approaches for defining operator precedence. Some generators (e.g. YACC) support precedence declarations, others require the grammar to be unambiguous, thus encoding the precedence rules. Even if the grammar formalism provides precedence rules, a particular grammar might not use it.

    The result is grammar variants implementing the same language. For the C language, the GNU Compiler uses YACC with precedence rules, the C-Transformers uses SDF without priorities, while the SDF library does use priorities. For PHP, Zend uses YACC with precedence rules, whereas PHP-front uses SDF with priority and associativity declarations.

    The variance between grammars raises the question if the precedence rules of one grammar are compatible with those of another. This is usually not obvious, since some languages have complex precedence rules. Also, for some parser generators the semantics of precedence rules is defined operationally, which makes it hard to reason about their effect on the defined language.

    We present a method and tool for comparing the precedence rules of different grammars and parser generators. Although it is undecidable whether two grammars define the same language, this tool provides support for comparing and recovering precedence rules, which is especially useful for reliable migration of a grammar from one grammar formalism to another. We evaluate our method by the application to non-trivial mainstream programming languages, such as PHP and C.

  • Martin Bravenboer and Eelco Visser. Program Transformation with Stratego/XT. Tutorial at the European Joint Conferences on Theory and Practice of Software (ETAPS 2007), March 2007.

2006

  • Martin Bravenboer. Impact of Software Transformation Systems on Language Workbenches and Domain-Specific Language Tools. In Proceedings of STS'06, Software Transformation Systems Workshop at GPCE'06, October 2006
  • Martin Bravenboer, Eric Tanter, and Eelco Visser. Declarative, Formal, and Extensible Syntax Definition for AspectJ. In Proceedings of the 21st ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2006), October 2006

    Aspect-Oriented Programming (AOP) is attracting attention from both research and industry, as illustrated by the ever-growing popularity of AspectJ, the de facto standard AOP extension of Java. From a compiler construction perspective, AspectJ is interesting as it is a typical example of a compositional language, i.e. a language composed of a number of separate languages with different syntactical styles: in addition to plain Java, AspectJ includes a language for defining pointcuts and one for defining advices. Language composition represents a non-trivial challenge for conventional parsing techniques. First, combining several languages with different lexical syntax leads to considerable complexity in the lexical states to be processed. Second, as new language features for AOP are being explored, many research proposals are concerned with further extending the AspectJ language, resulting in a need for an extensible syntax definition.

    This paper shows how scannerless parsing elegantly addresses the issues encountered by conventional techniques when parsing AspectJ. We present the design of a modular, extensible, and formal definition of the lexical and context-free aspects of the AspectJ syntax in the Syntax Definition Formalism SDF, which is implemented by a scannerless, generalized-LR parser (SGLR). We introduce grammar mixins as a novel application of SDF's modularity features, which allows the declarative definition of different keyword policies and combination of extensions. We illustrate the modular extensibility of our definition with syntax extensions taken from current research on aspect languages. Finally, benchmarks show the reasonable performance of scannerless generalized-LR parsing for this grammar.

  • Martin Bravenboer, Karl Trygve Kalleberg, Rob Vermaas, and Eelco Visser. Building Java Transformations with Stratego/XT. Tutorial at the Fifth International Conference on Generative Programming and Component Engineering (GPCE 2006), October 2006.
  • Martin Bravenboer, Rene de Groot, and Eelco Visser. MetaBorg in Action: Examples of Domain-specific Language Embedding and Assimilation using Stratego/XT. In Generative and Transformational Techniques in Software Engineering (GTTSE 2005), volume 4143 of LNCS, November, 2006.
  • Martin Bravenboer, Karl Trygve Kalleberg, Rob Vermaas, and Eelco Visser. Stratego/XT 0.16: Components for Transformation Systems. In Proceedings of the ACM SIGPLAN 2006 Workshop on Partial Evaluation and Program Manipulation (PEPM '06), January 2006.

2005

  • Eelco Dolstra, Martin Bravenboer, and Eelco Visser. Service Configuration Management. In Proceedings of the 12th International Workshop on Software Configuration Management (SCM 2005), September 2005.
  • Martin Bravenboer, Rob Vermaas, Jurgen Vinju and Eelco Visser. Generalized Type-Based Disambiguation of Meta Programs with Concrete Object Syntax. In Generative Programming and Component Engineering 4th International Conference (GPCE 2005), volume 3676 of LNCS, October 2005

    In meta programming with concrete object syntax, object-level programs are composed from fragments written in concrete syntax. The use of small program fragments in such quotations and the use of meta-level expressions within these fragments (anti-quotation) often leads to ambiguities. This problem is usually solved through explicit disambiguation, resulting in considerable syntactic overhead. A few systems manage to reduce this overhead by using type information during parsing. Since this is hard to achieve with traditional parsing technology, these systems provide specific combinations of meta and object languages, and their implementations are difficult to reuse.

    In this paper, we generalize these approaches and present a language independent method for introducing concrete object syntax without explicit disambiguation. The method uses scannerless generalized-LR parsing to parse meta programs with embedded object-level fragments, which produces a forest of all possible parses. This forest is reduced to a tree by a disambiguating type checker for the meta language. To validate our method we have developed embeddings of several object languages in Java, including AspectJ and Java itself.
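
    The step of reducing a parse forest to a single tree with a type checker can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the forest is modeled as a plain list of candidate trees, and the node kinds Quote and MetaVar, the sorts Expr and Stm, and the functions type_of and disambiguate are all hypothetical.

```python
def type_of(tree, env):
    """Return the sort of a candidate parse, or None if it is ill-typed."""
    kind = tree[0]
    if kind == "MetaVar":
        # Anti-quoted meta-level variable: its sort comes from the meta program.
        return env.get(tree[1])
    if kind == "Quote":
        # Quote(sort, body): well-typed only if the body really has that sort.
        _, sort, body = tree
        return sort if type_of(body, env) == sort else None
    return None


def disambiguate(forest, env):
    """Keep the well-typed alternatives; succeed only if exactly one remains."""
    well_typed = [t for t in forest if type_of(t, env) is not None]
    if len(well_typed) != 1:
        raise ValueError(f"expected one well-typed parse, found {len(well_typed)}")
    return well_typed[0]


# A quotation containing only the anti-quoted variable x parses two ways:
# as an expression fragment or as a statement fragment.
forest = [
    ("Quote", "Expr", ("MetaVar", "x")),
    ("Quote", "Stm", ("MetaVar", "x")),
]

# The meta program declares x as a statement, which selects the second parse.
chosen = disambiguate(forest, {"x": "Stm"})
```

    If the type environment leaves more than one alternative well-typed, or none, the sketch raises an error rather than guessing.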

  • Anya Helene Bagge, Martin Bravenboer, Karl Trygve Kalleberg, Koen Muilwijk, and Eelco Visser. Adaptive Code Reuse by Aspects, Cloning and Renaming. Technical Report UU-CS-2005-031, Department of Information and Computing Sciences, Universiteit Utrecht, Utrecht, The Netherlands, August 2005.
  • Martin Bravenboer, Arthur van Dam, Karina Olmos and Eelco Visser. Program Transformation with Scoped Dynamic Rewrite Rules. Fundamenta Informaticae, Volume 69, 2005.

2004

  • Martin Bravenboer and Eelco Visser. Reusable and Adaptable Strategies for Generative Programming. In Proceedings of STS'04, Software Transformation Systems Workshop at GPCE'04, Vancouver, Canada. October 2004
  • Martin Bravenboer and Eelco Visser. Concrete Syntax for Objects. Domain-Specific Language Embedding and Assimilation without Restrictions. In Proceedings of the 19th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA'04), October 2004

    a.k.a. MetaBorg

    Application programmer's interfaces give access to domain knowledge encapsulated in class libraries without providing the appropriate notation for expressing domain composition. Since object-oriented languages are designed for extensibility and reuse, the language constructs are often sufficient for expressing domain abstractions at the semantic level. However, they do not provide the right abstractions at the syntactic level.

    In this paper we describe MetaBorg, a method for providing concrete syntax for domain abstractions to application programmers. The method consists of embedding domain-specific languages in a general purpose host language and assimilating the embedded domain code into the surrounding host code. Instead of extending the implementation of the host language, the assimilation phase implements domain abstractions in terms of existing APIs leaving the host language undisturbed. Indeed, MetaBorg can be considered a method for promoting APIs to the language level. The method is supported by proven and available technology, i.e. the syntax definition formalism SDF and the program transformation language and toolset Stratego/XT. We illustrate the method with applications in three domains: code generation, XML generation, and user-interface construction.

2003

  • Martin Bravenboer. Connecting XML Processing and Term Rewriting with Tree Grammars. Master's thesis INF/SCR-04-08, Institute of Information and Computing Sciences, Utrecht University, November 2003
  • Martin Bravenboer. Being Declarative - Searching for the Essence of Declarativeness. Report for the course Philosophical aspects of Computer Science, Utrecht University, 2003

2002

  • Martin Bravenboer and Eelco Visser. Rewriting Strategies for Instruction Selection. In Rewriting Techniques and Applications (RTA 2002), volume 2378 of LNCS, July 2002

2001

  • Martin Bravenboer and Eelco Visser. Guiding Visitors: Separating Navigation from Computation. Technical Report UU-CS-2001-42, Institute of Information and Computing Sciences, Utrecht University, 2001