Exception Analysis and Points-to Analysis: Better
Together
|
Presented at International Symposium on Software Testing and
Analysis (ISSTA 2009), July 2009.
toggle abstract
Exception analysis and points-to analysis are typically
done in complete separation. Past algorithms for precise
exception analysis (e.g., pairing throw clauses with catch
statements) use pre-computed points-to information. Past
points-to analyses either unsoundly ignore exceptions, or
conservatively compute a crude approximation of exception
throwing (e.g., considering an exception throw as an
assignment to a global variable, accessible from any catch
clause).
We show that this separation results in significant
slowdowns or vast imprecision. The two kinds of analyses
are interdependent: neither can be performed accurately
without the other. The interdependency leads us to
propose a joint handling for performance and precision.
We show that our exception analysis is expressible highly
elegantly in a declarative form, and can apply to
points-to analyses of varying precision. In fact, our
specification of exception analysis is ``fully precise'',
as it models closely the Java exception handling
semantics. The necessary approximation is provided only
through whichever abstractions are used for contexts and
objects in the base points-to analysis.
Our combined approach achieves similar precision relative
to exceptions (exception-catch links) as the best past
precise exception analysis, with a runtime of seconds
instead of tens of minutes. At the same time, our analysis
achieves much higher precision of points-to information
(an average of half as many values for each reachable
variable for most of the DaCapo benchmarks) than points-to
analyses that treat exceptions conservatively, all at a
fraction of the execution time.
|
Software Development Challenges: Abstraction and Analysis
|
Presented at University of Waterloo, Canada, April 2, 2009
toggle abstract
In modern software development the use of multiple software
languages to constitute a single application is
ubiquitous. Despite the omnipresent use of combinations of
languages, the methods of using languages together are
ad-hoc, unfriendly to programmers, and result in a poor
level of integration. We present examples of problematic
integration, such as security vulnerabilities, and show how
a principled and generic solution to language extensibility
solves these problems. The parsing techniques we use for
implementing language extensions avoid common pitfalls and
challenges in combining languages. Furthermore, we enable
the deployment of syntax extensions as separately compiled,
binary plugins to an extensible compiler by a new algorithm
for parse table composition. This drastically changes the
way extensible compilers can be used by end-programmers.
To give programmers better automated tools to analyze
programs, we present a strictly declarative approach for the
specification of sophisticated points-to analyses. Pointer
analysis is the most fundamental program analysis and has a
long and rich history of research and application. Our
points-to analysis framework for Java, called Doop, builds
on the idea of specifying pointer analysis algorithms
declaratively, using Datalog: a logic-based language for
defining (recursive) relations. We carry the declarative
approach further than past work by describing the full
end-to-end analysis in Datalog, which was thought to be
impossible. The pointer analysis specifications are
optimized aggressively through a novel technique that takes
into account Datalog incremental evaluation. As a result,
Doop achieves a full order of magnitude improvement in
runtime over the current state of the art in
context-sensitive pointer analysis. Additionally, Doop
scales to very precise analyses that are impossible with
previous pointer analysis algorithms, directly addressing
open problems in past literature.
|
Strictly Declarative Specification of Sophisticated Points-To
Analyses
|
Presented at New York University, March 30, 2009
Earlier version presented at University of Texas at San
Antonio and the University of Massachusetts Amherst
toggle abstract
We present the Doop framework for points-to analysis of Java
programs. Pointer analysis is the most fundamental program
analysis and has a long and rich history of research and
application. Doop builds on the idea of specifying pointer
analysis algorithms declaratively, using Datalog: a
logic-based language for defining (recursive) relations. We
carry the declarative approach further than past work by
describing the full end-to-end analysis in Datalog, which
was thought to be impossible. The pointer analysis
specifications are optimized aggressively through exposition
of the representation of relations (e.g. indexing) to the
Datalog language level and a novel technique that takes into
account Datalog incremental evaluation.
As a result, Doop achieves a full order of magnitude
improvement in runtime over the current state of the art in
context-sensitive pointer analysis. Additionally, Doop
scales to very precise analyses that are impossible with
previous pointer analysis algorithms, directly addressing
open problems in past literature. Finally, our
implementation is modular and can be easily configured to
analyses with a wide range of characteristics, largely due
to its declarativeness.
Our work is a strong data point in support of declarative
languages: we believe that humans cannot implement and
optimize well complex mutually-recursive definitions at an
operational level of abstraction, but can do so using
declarative specifications. The work also demonstrates that
representing relations using binary decision diagrams, as in
past precise pointer analysis work, is not essential for the
most important configurations, and even detrimental to
performance.
|
Screaming Fast Declarative Pointer Analysis
|
Presented at New England Programming Languages and Systems
Symposium Series (NEPLS), March 5, 2009
toggle abstract
We present the Doop framework for points-to analysis of Java
programs. Doop builds on the idea of specifying pointer
analysis algorithms declaratively, using Datalog: a
logic-based language for defining (recursive) relations. We
carry the declarative approach further than past work by
describing the full end-to-end analysis in Datalog and
optimizing aggressively through exposition of the
representation of relations (e.g., indexing) to the Datalog
language level.
As a result, Doop achieves stunning (full
order-of-magnitude) improvements in runtime. We compare Doop
with Lhotak and Hendren's Paddle, which defines the state of
the art for context-sensitive analyses. For the exact same
logical points-to definitions (and, consequently, identical
precision) Doop is 12.9x faster than Paddle for a
1-call-site sensitive analysis of the DaCapo benchmarks, and
10.2x faster than Paddle for a 1-object-sensitive
analysis. At the same time, our implementation is modular
and can be easily configured to analyses with a wide range
of characteristics, largely due to its declarativeness.
|
Pointer Analysis
|
Presented at LogicBlox, Atlanta, May 6, 2008
|
Towards Intelligent Coding Assistance for Code Generation in
Modeling Frameworks
|
Delft University of Technology, The Netherlands, MoDSE
Colloquium, November, 2007
toggle abstract
Development environments for domain-specific language tools
and model-driven development are currently mostly focusing
on supporting the metamodelling and modelling aspects of
model-driven engineering. The actual code generators are
usually implemented in conventional template languages. Such
template languages allow the metaprogrammer to write code
templates using the syntax of the generated program,
i.e. the syntax of the object language. The literal code of
the program to generate can typically be mixed with meta
constructs to generate code conditionally, iteratively, or
based on properties of the input model.
Unfortunately, the development environment nor the template
engine give feedback about syntactic or semantic errors in
the code templates. Hence, the only way to check if the
code generator produces programs that actually compile is to
apply the code generator to an input model. The
disadvantage of this approach is that the metaprogrammer
does not get immediate feedback about the validity of code
generator when he is writing the code templates. During
testing the syntactic or semantic errors need to be traced
back to the code generator. Moreover, test models might not
cover the entire code generator or the possible input
models, so errors might be reported even after deployment of
the code generator. Finally, there is no support for various
forms of code completion and assistance, which are widely
appreciated IDE features.
Despite code generators being a major aspect of model-driven
development, this makes the development of the actual code
generator feel rather unsupported. Basically, the
metaprogrammer leaves the shiny and supportive development
environment for this important part of a model-driven
engineering project. In academic research, a range of
solutions have been proposed to make code generation tools
syntactically or semantically aware of the object
language. This awareness of the object language can be used
to guarantee that the generated programs are always
syntactically correct, or in the case of semantical
awareness might even always compile. Also, this awareness
could be used to support coding assistance.
In this talk, I will first give a brief overview of the
research in this area. I will discuss the various levels of
understanding of the object program that have been
researched. After this overview, I will take a closer look
at a method for adding syntactical awareness of an object
language to code generation tools. In particular, I will
discuss the parsing techniques required for parsing
combinations of a template language and possibly multiple
object languages. These parsing techniques are applied in
several academic tools that can be applied for code
generation, e.g. the ASF+SDF MetaEnvironment, Stratego/XT,
StringBorg, and Repleo. Finally, I will discuss the
usability issues of the current academic tools that support
syntactical awareness and challenges to apply these ideas in
interactive development environments.
|
Grammar Engineering Support for Precedence Rule Recovery and
Compatibility Checking
The State of Stratego/XT
Introducing the Stratego Libraries
MetaBorg: An Approach for Domain-Specific Language Embedding
|
MoDSE
(Model-Driven Software Evolution) kickoff meeting, 2006-11-17,
Delft University of Technology, Delft, The Netherlands
|
Declarative, Formal, and Extensible Syntax Definition for
AspectJ. A Case for Scannerless Generalized-LR
Parsing.
|
OOPSLA 2006, Portland, USA, October 2006
Based on article: Declarative,
Formal, and Extensible Syntax Definition for AspectJ. A Case
for Scannerless Generalized-LR Parsing.
toggle abstract
Aspect-Oriented Programming (AOP) is attracting attention
from both research and industry, as illustrated by the
ever-growing popularity of AspectJ, the de facto standard
AOP extension of Java. From a compiler construction
perspective, AspectJ is interesting as it is a typical
example of a compositional language, i.e. a
language composed of a number of separate languages with
different syntactical styles: in addition to plain Java,
AspectJ includes a language for defining pointcuts and one
for defining advices. Language composition represents a
non-trivial challenge for conventional parsing
techniques. First, combining several languages with
different lexical syntax leads to considerable complexity
in the lexical states to be processed. Second, as new
language features for AOP are being explored, many
research proposals are concerned with further
extending the AspectJ language, resulting in a need
for an extensible syntax definition.
This paper shows how scannerless parsing
elegantly addresses the issues encountered by conventional
techniques when parsing AspectJ. We present the design of
a modular, extensible, and formal definition of the
lexical and context-free aspects of the AspectJ syntax in
the Syntax Definition Formalism SDF, which is implemented
by a scannerless, generalized-LR parser (SGLR). We
introduce grammar mixins as a novel application
of SDF's modularity features, which allows the declarative
definition of different keyword policies and combination
of extensions. We illustrate the modular extensibility of
our definition with syntax extensions taken from current
research on aspect languages. Finally, benchmarks show
the reasonable performance of scannerless generalized-LR
parsing for this grammar.
|
Infrastructure for Java Program Transformation
|
Master Course Program Transformation, Utrecht University,
2006-02-09
|
Infrastructure for Program Transformation Systems
|
Master Course Program Transformation, Utrecht University,
2006-02-07
|
Components: Concepts and Techniques
|
Software Engineering Course, Utrecht University, 2005-10-19
|
Testing Tools and Techniques
|
Software Engineering Course, Utrecht University, 2005-10-12
|
Generalized Type-Based Disambiguation of Meta Programs with
Concrete Object Syntax
|
GPCE 2005, Tallin, Estonia, 2005-09-30
Based on article: Generalized Type-Based
Disambiguation of Meta Programs with Concrete Object
Syntax
toggle abstract
In meta programming with concrete object syntax,
object-level programs are composed from fragments written in
concrete syntax. The use of small program fragments in such
quotations and the use of meta-level expressions within
these fragments (anti-quotation) often leads to
ambiguities. This problem is usually solved through explicit
disambiguation, resulting in considerable syntactic
overhead. A few systems manage to reduce this overhead by
using type information during parsing. Since this is hard to
achieve with traditional parsing technology, these systems
provide specific combinations of meta and object languages,
and their implementations are difficult to reuse.
In this paper, we generalize these approaches and present a
language independent method for introducing
concrete object syntax without explicit disambiguation. The
method uses scannerless generalized-LR parsing to parse meta
programs with embedded objectlevel fragments, which produces
a forest of all possible parses. This forest is reduced to a
tree by a disambiguating type checker for the meta
language. To validate our method we have developed
embeddings of several object languages in Java, including
AspectJ and Java itself.
|
MetaBorg in Action: Examples of Domain-specific Language
Embedding and Assimilation using Stratego/XT.
|
Summer School on Generative and Transformational Techniques in
Software Engineering (GTTSE 2005), Braga, Portugal, July,
2005.
|
Dryad & Java Front. Infrastructure for Java Transformation
Systems.
The State of Stratego/XT
Meta Programming with Concrete Object Syntax
|
Master Course Program Transformation, Utrecht University, 2005-03-08
|
Infrastructure for Program Transformation Systems (Extended version)
| Master Course Program Transformation, Utrecht University, 2005-02-10
|
Infrastructure for Program Transformation Systems
|
Stratego/XT Tutorial at Philips, Eindhoven, The Netherlands, 2005-01-24
|
Concrete Syntax for Objects
|
OOPSLA'04, Vancouver, Canada, 2004-10-28
Based on article: Concrete Syntax
for Objects
toggle abstract
Application programmer's interfaces give access to domain
knowledge encapsulated in class libraries without
providing the appropriate notation for expressing domain
composition. Since object-oriented languages are designed
for extensibility and reuse, the language constructs are
often sufficient for expressing domain abstractions at the
semantic level. However, they do not provide the right
abstractions at the syntactic level.
In this paper we describe MetaBorg, a method for providing
concrete syntax for domain abstractions to application
programmers. The method consists of embedding
domain-specific languages in a general purpose host
language and assimilating the embedded domain code into
the surrounding host code. Instead of extending the
implementation of the host language, the assimilation
phase implements domain abstractions in terms of existing
APIs leaving the host language undisturbed. Indeed, Meta-
Borg can be considered a method for promoting APIs to the
language level. The method is supported by proven and
available technology, i.e. the syntax definition formalism
SDF and the program transformation language and toolset
Stratego/XT. We illustrate the method with applications in
three domains: code generation, XML generation, and
user-interface construction.
|
Components: Concepts and Techniques
|
Software Engineering Course, Utrecht University, 2004-10-12
|
Testing Tools and Techniques
|
Software Engineering Course, Utrecht University, 2004-09-30
|
Concrete Syntax for Objects
|
Software Technology Colloquium, Utrecht University, 2004-06-03
|
Stratego Shell
|
Fifth Stratego User Days, Utrecht University,
2004-03-01
|
Language Engineering Tools
|
Parse-Unit, Stratego-Box, and Abstract Syntax Definitions,
Fifth Stratego User Days, Utrecht University, 2004-03-01
|
XTC Support in the Stratego Shell
|
Fifth Stratego User Days, Utrecht University, 2004-03-03
|
XML Processing and Term Rewriting
|
Program Transformation course, Utrecht University, 2004-02-21
|
Connecting XML Processing and Term Rewriting with Tree Grammars
|
Software Technology Colloquium (Master talk), 2003-11-20
|
XML Transformations and Distributed Services: Stratego & XML
|
Program Transformation course, Utrecht University, 2003-02-17
|
JavaFront: Techniques and Applications
|
Trafo Meeting, Utrecht University, September 17, 2002
|
Making XT XML capable
|
Centrum voor Wiskunde en Informatica (CWI), May 30, 2002
|
Term Annotation in Stratego
| Third Stratego Users Day, May 3, 2002
|
Ant: Another Neat Tool
|
Trafo Meeting, Utrecht University
|
|
|