Mikołaj Bojańczyk

Data trees


July 14, 2015
  1. Mikołaj Bojańczyk, Anca Muscholl, Thomas Schwentick, Luc Segoufin
    Two-variable logic on data trees and XML reasoning. J. ACM, 2009   PDF

  2. Henrik Björklund, Mikołaj Bojańczyk
    Bounded Depth Data Trees. ICALP, 2007   PDF

  3. Mikołaj Bojańczyk, Slawomir Lasota 0001
    An extension of data automata that captures XPath Log. Methods Comput. Sci., 2012   PDF

  4. Vince Bárány, Mikołaj Bojańczyk, Diego Figueira, Pawel Parys
    Decidable classes of documents for XPath. FSTTCS, 2012   PDF

  5. Mikołaj Bojańczyk, Filip Murlak, Adam Witkowski
    Containment of Monadic Datalog Programs via Bounded Clique-Width. ICALP (2), 2015   PDF

These are papers about data trees, which are the tree version of data words, i.e. these are trees where each node has a label from a finite alphabet and a data value from an infinite alphabet, with the provision that data values can only be compared for equality. The motivation behind data trees is that they are supposed to be like XML documents. This is a sequence of papers which tries to tackle the satisfiability problem for various logics over data trees. All of the papers use automata that run over data trees.

Paper [1] is about two-variable logic over data trees. The logic has access to child and next-sibling predicates, as well as the equal data value predicate. It does not have access to the descendant predicate. Using a heroic amount of cutting and pasting, we prove that satisfiability for the logic is decidable. The paper is a journal version of

  • Mikołaj Bojańczyk, Claire David, Anca Muscholl, Thomas Schwentick, Luc Segoufin
    Two-variable logic on data trees and XML reasoning. PODS, 2006   PDF

which had the best paper award at PODS 2006, and the test of time award at PODS 2016. Here is a picture of my lovely co-authors and me in 2016:

IMG_0916

 

Paper [2] is about bounded depth data trees, bounded by the typically small depth of tree-shaped databases. At the cost of bounding the depth, we can extend the logic from [1] with descendant order (not very interesting for bounded depth) and document order (more interesting), and still have decidable satisfiability. The underlying decision problem is priority vector addition systems, which are an extension of vector addition systems that can do tests for zero in a certain nested way, with the nesting essentially corresponding to the tree structure of bounded depth data trees.

Paper [3] is a journal version of

  • Mikołaj Bojańczyk, Slawomir Lasota 0001
    An Extension of Data Automata that Captures XPath. LICS, 2010   PDF

Like for all papers in this group, the motivation behind papers [3,4] is to do static analysis of XPath, e.g. decide if a formula of XPath is true in some data word or tree. The problem is known to be undecidable. One workaround is to consider fragments of XPath, which is the approach of the previous papers [1,2]. Another approach is to find a restriction of data words or trees, such that most real-life documents satisfy this restriction (of course, this requires an understanding of real life that is missing in the analysis), and then prove that XPath satisfiability is decidable over documents that satisfy this restriction. One example of such a restriction is bounded depth from [2], but here we try to go for something more ambitious. In order to meaningfully analyze XPath, it is useful to have an automaton model, and such a model is introduced in paper [3], which is a journal version of

  • Mikołaj Bojańczyk, Slawomir Lasota 0001
    An Extension of Data Automata that Captures XPath. LICS, 2010   PDF

Then, using this automaton model, we prove in [4] that, given a certain structural restriction on data trees, emptiness is decidable for the model from [3]. The restriction is difficult to explain, but the idea is to have something that is not a simple window dressing on established notions of well-behaved structures like tree width or clique width.

Speaking of clique width, paper [5] uses clique width to explain some existing decidability results on Datalog containment.

 

Leave a Reply

Your email address will not be published.