Publications
-
Yannis Smaragdakis, George Balatsouras, George Kastrinis, Martin
Bravenboer. More Sound Static Handling of Java Reflection. APLAS 2015.
-
Dung Nguyen, Molham Aref, Martin Bravenboer, George Kollias, Hung
Q. Ngo, Christopher Re, Atri Rudra. Join Processing for Graph
Patterns: An Old Dog with New Tricks. GRADES 2015.
-
Yannis Smaragdakis, Martin Bravenboer, and Ondrek Lhotak. Pick
Your Contexts Well: Understanding Object-sensitivity. POPL
2011.
-
William Marczak, Shan Shan
Huang, Martin Bravenboer, Micah Sherr, Boon Thau Loo, and Molham
Aref. SecureBlox: Customizable Secure Distributed Data
Processing. In Proceedings of the 35th SIGMOD International
Conference on Management of Data (SIGMOD 2010), June
2010.
-
Martin Bravenboer,
Yannis
Smaragdakis. Strictly Declarative Specification of
Sophisticated Points-to Analyses. In Proceedings of the 24th
ACM SIGPLAN Conference on Object-Oriented Programming, Systems,
Languages, and Applications
(OOPSLA 2009),
October 2009. To appear.
a.k.a. Doop
draft
implementation
show abstract
We present the Doop framework for points-to analysis of Java
programs. Doop builds on the idea of specifying pointer analysis
algorithms declaratively, using Datalog: a logic-based language for
defining (recursive) relations. We carry the declarative approach
further than past work by describing the full end-to-end analysis in
Datalog and optimizing aggressively using a novel technique that
takes into account Datalog incremental evaluation.
As a result, Doop achieves several benefits, including stunning
(full order-of-magnitude) improvements in runtime. We compare Doop
with Lhotak and Hendren's Paddle, which defines the state of the art
for context-sensitive analyses. For the exact same logical points-to
definitions (and, consequently, identical precision) Doop is more
than 15x faster than Paddle for a 1-call-site sensitive analysis of
the DaCapo benchmarks, with lower but still substantial speedups for
other important analyses. Additionally, Doop scales to very precise
analyses that are impossible with Paddle and Whaley et al.'s
bddbddb, directly addressing open problems in past
literature. Finally, our implementation is modular and can be easily
configured to analyses with a wide range of characteristics, largely
due to its declarativeness.
-
Martin Bravenboer,
Yannis
Smaragdakis. Exception Analysis and Points-to
Analysis: Better Together. In International
Symposium on Software Testing and Analysis (ISSTA 2009),
July 2009. To appear.
pdf
implementation
show abstract
Exception analysis and points-to analysis are typically
done in complete separation. Past algorithms for precise
exception analysis (e.g., pairing throw clauses with catch
statements) use pre-computed points-to information. Past
points-to analyses either unsoundly ignore exceptions, or
conservatively compute a crude approximation of exception
throwing (e.g., considering an exception throw as an
assignment to a global variable, accessible from any catch
clause).
We show that this separation results in significant
slowdowns or vast imprecision. The two kinds of analyses
are interdependent: neither can be performed accurately
without the other. The interdependency leads us to
propose a joint handling for performance and precision.
We show that our exception analysis is expressible highly
elegantly in a declarative form, and can apply to
points-to analyses of varying precision. In fact, our
specification of exception analysis is ``fully precise'',
as it models closely the Java exception handling
semantics. The necessary approximation is provided only
through whichever abstractions are used for contexts and
objects in the base points-to analysis.
Our combined approach achieves similar precision relative
to exceptions (exception-catch links) as the best past
precise exception analysis, with a runtime of seconds
instead of tens of minutes. At the same time, our analysis
achieves much higher precision of points-to information
(an average of half as many values for each reachable
variable for most of the DaCapo benchmarks) than points-to
analyses that treat exceptions conservatively, all at a
fraction of the execution time.
-
Martin Bravenboer,
Eelco Dolstra, and Eelco
Visser. Preventing Injection Attacks with Syntax
Embeddings. In Science of Computer
Programming, 2009. To appear.
a.k.a. StringBorg
pdf
implementation
show abstract
Software written in one language often needs to construct
sentences in another language, such as SQL queries, XML
output, or shell command invocations. This is almost
always done using unhygienic string manipulation,
the concatenation of constants and client-supplied
strings. A client can then supply specially crafted input
that causes the constructed sentence to be interpreted in
an unintended way, leading to an injection
attack.
We describe a more natural style of programming that
yields code that is impervious to injections by
construction. Our approach embeds the grammars of
the guest languages (e.g. SQL) into that of the
host language (e.g. Java) and automatically
generates code that maps the embedded language to
constructs in the host language that reconstruct the
embedded sentences, adding escaping functions where
appropriate. This approach is generic, meaning that it
can be applied with relative ease to any combination of
host and guest languages.
-
Lennart Kats, Martin Bravenboer,
and Eelco
Visser. Mixing Source and Bytecode - A Case for
Compilation by Normalization. In Proceedings of the
23st ACM SIGPLAN Conference on Object-Oriented
Programming, Systems, Languages, and Applications (OOPSLA
2008), October 2008.
pdf
doi
show abstract
Language extensions increase programmer productivity by
providing concise, often domain-specific syntax, and
support for static verification of correctness, security,
and style constraints. Language extensions can often be
realized through translation to the base language,
supported by preprocessors and extensible
compilers. However, various kinds of extensions require
further adaptation of a base compiler's internal stages
and components, for example to support separate
compilation or to make use of low-level primitives of the
platform (e.g., jump instructions or unbalanced
synchronization). To allow for a more loosely coupled
approach, we propose an open compiler model based on
normalization steps from a high-level language to a subset
of it, the core language. We developed such a compiler for
a mixed Java and (core) bytecode language, and evaluate
its effectiveness for composition mechanisms such as
traits, as well as statement-level and expression-level
language extensions.
-
Martin Bravenboer
and Eelco
Visser. Parse Table Composition - Separate
Compilation and Binary Extensibility of Grammars. In
Proceedings of 1st International Conference on Software
Language Engineering (SLE 2008)
September 2008
pdf (pre-proceedings edition)
show abstract
Module systems, separate compilation, deployment of binary
components, and dynamic linking have enjoyed wide
acceptance in programming languages and systems. In
contrast, the syntax of languages is usually defined in a
non-modular way, cannot be compiled separately, cannot
easily be combined with the syntax of other languages, and
cannot be deployed as a component for later
composition. Grammar formalisms that do support modules
use whole program compilation.
Current extensible compilers focus on source-level
extensibility, which requires users to compile the
compiler with a specific configuration of extensions. A
compound parser needs to be generated for every
combination of extensions. The generation of parse tables
is expensive, which is a particular problem when the
composition configuration is not fixed to enable users to
choose language extensions.
In this paper we introduce an algorithm for parse
table composition to support separate compilation of
grammars to parse table components. Parse table
components can be composed (linked) efficiently at
runtime, i.e. just before parsing. While the worst-case
time complexity of parse table composition is exponential
(like the complexity of parse table generation itself),
for realistic language combination scenarios involving
grammars for real languages, our parse table composition
algorithm is an order of magnitude faster than computation
of the parse table for the combined grammars.
-
Martin Bravenboer
and Eelco
Visser. Designing Syntax Embeddings and
Assimilations for Language Libraries. In Models
in Software Engineering: Workshops and Symposia at MoDELS
2007, volume 5002 of LNCS, 2008.
pdf
pdf (extended edition)
doi
show abstract
Language libraries extend regular libraries with
domain-specific notation. More precisely, a language
library is a combination of a domain-specific language
embedded in the general-purpose host language, a
regular library implementing the underlying functionality,
and an assimilation transformation that maps
embedded DSL fragments to host language code. While the
basic architecture for realizing language libraries is the
same for all applications, there are many design choices
to be made in the design of a particular combination of
library, guest language syntax, host language, and
assimilation. In this paper, we present a systematic
analysis of the design space for syntax embeddings and
assimilations for the realization of language
libraries. The contribution of this paper is an overview
of the state-of-the-art providing insight in the design
space and research questions in language library
realization, in particular, the identification of research
issues for realizing an independently extensible language
library framework.
-
Martin Bravenboer,
Karl Trygve
Kalleberg, Rob Vermaas, and Eelco
Visser. Stratego/XT 0.17. A Language and Toolset
for Program Transformation. In Science of
Computer Programming, June 2008
pdf
doi
show abstract
Stratego/XT is a language and toolset for program
transformation. The Stratego language provides rewrite
rules for expressing basic transformations, programmable
rewriting strategies for controlling the application of
rules, concrete syntax for expressing the patterns of
rules in the syntax of the object language, and dynamic
rewrite rules for expressing context-sensitive
transformations, thus supporting the development of
transformation components at a high level of
abstraction. The XT toolset offers a collection of
flexible, reusable transformation components, and tools
for generating such components from declarative
specifications. Complete program transformation systems
are composed from these components.
This paper gives an overview of Stratego/XT 0.17,
including a description of the Stratego language and XT
transformation tools; a discussion of the implementation
techniques and software engineering process; and a
description of applications built with Stratego/XT.
-
Martin
Bravenboer. Exercises in Free Syntax. Syntax
Definition, Parsing, and Assimilation of Language
Conglomerates. PhD thesis, Utrecht University,
January 2008
pdf
website
show abstract
In modern software development the use of multiple
software languages to constitute a single application is
ubiquitous. Despite the omnipresent use of combinations of
languages, the principles and techniques for using
languages together are ad-hoc, unfriendly to programmers,
and result in a poor level of integration. We work towards
a principled and generic solution to language extension by
studying the applicability of modular syntax definition,
scannerless parsing, generalized parsing algorithms, and
program transformations.
We describe MetaBorg, a method for providing concrete
syntax for domain abstractions to application
programmers. Since object-oriented languages are designed
for extensibility and reuse, the language constructs are
often sufficient for expressing domain abstractions at the
semantic level. However, they do not provide the right
abstractions at the syntactic level. The MetaBorg method
consists of embedding domain-specific languages in a
general purpose host language and assimilating the
embedded domain code into the surrounding host code.
Instead of extending the implementation of the host
language, the assimilation phase implements domain
abstractions in terms of existing APIs leaving the host
language undisturbed.
We present a solution to injection
vulnerabilities. Software written in one language often
needs to construct sentences in another language, such as
SQL queries, XML output, or shell command invocations.
This is almost always done using unhygienic string
manipulation. A client can then supply specially crafted
input that causes the constructed sentence to be
interpreted in an unintended way, leading to an injection
attack. We describe a more natural style of programming
that yields code that is impervious to injections by
construction. Our approach embeds the grammars of the
guest languages into that of the host language and
automatically generates code that maps the embedded
language to constructs in the host language that
reconstruct the embedded sentences, adding escaping
functions where appropriate.
We study AspectJ as a typical example of a language
conglomerate, i.e. a language composed of a number of
separate languages with different syntactic styles. We
show that the combination of the lexical syntax leads to
considerable complexity in the lexical states to be
processed. We show how scannerless parsing elegantly
addresses this. We present the design of a modular,
extensible, and formal definition of the lexical and
context-free aspects of the AspectJ syntax. We introduce
grammar mixins, which allows the declarative definition of
keyword policies and combination of extensions.
We introduce separate compilation of grammars to enable
deployment of languages as plugins to a compiler. Current
extensible compilers focus on source-level extensibility,
which requires users to compile the compiler with a
specific configuration of extensions. A compound parser
needs to be generated for every combination. We introduce
an algorithm for parse table composition to support
separate compilation of grammars to parse table
components. Parse table components can be composed
(linked) efficiently at runtime, i.e. just before
parsing. For realistic language combination scenarios
involving grammars for real languages, our parse table
composition algorithm is an order of magnitude faster than
computation of the parse table for the combined grammars,
making online language composition feasible.
-
Martin
Bravenboer. Connecting XML Processing and Term
Rewriting with Tree Grammars . Master's thesis
INF/SCR-04-08, Institute of Information and Computing
Sciences, Utrecht University, November 2003
ps
pdf
presentation)
-
Martin
Bravenboer. Being Declarative - Searching for the
Essence of Declarativeness. Report for the course
Philosophical aspects of Computer Science, Utrecht University,
2003
ps
-
Martin Bravenboer
and Eelco
Visser. Rewriting Strategies for Instruction
Selection. In Rewriting Techniques and
Applications (RTA 2002), volume 2378 of LNCS, July 2002
pdf
doi
-
Martin Bravenboer
and Eelco
Visser. Guiding Visitors: Separating Navigation
from Computation. Technical Report UU-CS-2001-42,
Institute of Information and Computing Sciences, Utrecht
University, 2001
pdf
|
|