Publications

A Subquadratic Approximation Scheme for Partition
Marcin Mucha, Karol Wegrzycki, Michal Wlodarczyk
SODA 2019
arXiv
Abstract
The subject of this paper is the time complexity of approximating Knapsack, Subset Sum, Partition, and some other related problems. The main result is an O˜(n+1/ε^(5/3)) time randomized FPTAS for Partition, which is derived from a certain relaxed form of a randomized FPTAS for Subset Sum. To the best of our knowledge, this is the first NP-hard problem that has been shown to admit a subquadratic time approximation scheme, i.e., one with time complexity of O((n+1/ε)^(2−δ)) for some δ>0. To put these developments in context, note that a quadratic FPTAS for Partition has been known for 40 years.
Our main contribution lies in designing a mechanism that reduces an instance of Subset Sum to several simpler instances, each with some special structure, and keeps track of interactions between them. This allows us to combine techniques from approximation algorithms, pseudo-polynomial algorithms, and additive combinatorics.
We also prove several related results. Notably, we improve approximation schemes for 3-SUM, (min,+)-convolution, and Tree Sparsity. Finally, we argue why breaking the quadratic barrier for approximate Knapsack is unlikely by giving an Ω((n+1/ε)^(2−o(1))) conditional lower bound.
Applying deep learning to right whale photo identification
Robert Bogucki, Marek Cygan, Christin Brangwynne Khan, Maciej Klimek, Jan Kanty Milczek, Marcin Mucha
Conservation Biology
DOI
Abstract
Photo identification is an important tool for estimating abundance and monitoring population trends over time. However, manually matching photographs to known individuals is time-consuming. Motivated by recent developments in image recognition, we hosted a data science challenge on the crowdsourcing platform Kaggle to automate the identification of endangered North Atlantic right whales (Eubalaena glacialis).The winning solution automatically identified individual whales with 87% accuracy with a series of convolutional neural networks to identify the region of interest on an image, rotate, crop, and create standardized photographs of uniform size and orientation and then identify the correct individual whale from these passport-like photographs. Recent advances in deep learning coupled with this fully automated workflow have yielded impressive results and have the potential to revolutionize traditional methods for the collection of data on the abundance and distribution of wild populations. Presenting these results to a broad audience should further bridge the gap between the data science and conservation science communities.
Online Facility Location with Deletions
Marek Cygan, Artur Czumaj, Marcin Mucha, Piotr Sankowski
ESA 2018
arXiv
Abstract
In this paper we study three previously unstudied variants of the online Facility Location problem, considering an intrinsic scenario when the clients and facilities are not only allowed to arrive to the system, but they can also depart at any moment.
We begin with the study of a natural fully-dynamic online uncapacitated model where clients can be both added and removed. When a client arrives, then it has to be assigned either to an existing facility or to a new facility opened at the client’s location. However, when a client who has been also one of the open facilities is to be removed, then our model has to allow to reconnect all clients that have been connected to that removed facility. In this model, we present an optimal O(log n_act / log log n_act)-competitive algorithm, where n_act is the number of active clients at the end of the input sequence.
Next, we turn our attention to the capacitated Facility Location problem. We first note that if no deletions are allowed, then one can achieve an optimal competitive ratio of O(log n/ log log n), where n is the length of the sequence. However, when deletions are allowed, the capacitated version of the problem is significantly more challenging than the uncapacitated one. We show that still, using a more sophisticated algorithmic approach, one can obtain an online O(log m + log c log n)-competitive algorithm for the capacitated Facility Location problem in the fully dynamic model, where m is number of points in the input metric and c is the capacity of any open facility.
On problems equivalent to (min, +)-convolution
Marek Cygan, Marcin Mucha, Karol Wegrzycki, Michal Wlodarczyk
ICALP 2017
arXiv
Abstract
In the recent years, significant progress has been made in explaining apparent hardness of improving over naive solutions for many fundamental polynomially solvable problems. This came in the form of conditional lower bounds – reductions to one of problems assumed to be hard. These include 3SUM, All-Pairs Shortest Paths, SAT and Orthogonal Vectors, and others.

In this paper we undertake a systematic study of the (min,+)-convolution problem as a hardness assumption.
As the first step, we establish equivalence of this problem to a group of other problems, including variants of the classic knapsack problem and problems related to subadditive sequences. The (min,+)-convolution has been used as a building block in algorithms for many problems, notably problems in stringology. It has also already appeared as an ad hoc hardness assumption. We investigate some of these connections and provide new reductions and other results.

Dynamic beats fixed: on phase-based algorithms for file migration
Marcin Bienkowski, Jaroslaw Byrka, Marcin Mucha
ICALP 2017
arXiv
Abstract
In this paper, we construct a deterministic 4-competitive algorithm for the online file migration problem, beating the currently best 20-year old, 4.086-competitive MTLM algorithm by Bartal et al. (SODA 1997). Like MTLM, our algorithm also operates in phases, but it adapts their lengths dynamically depending on the geometry of requests seen so far. The improvement was obtained by carefully analyzing a linear model (factor-revealing LP) of a single phase of the algorithm. We also show that if an online algorithm operates in phases of fixed length and the adversary is able to modify the graph between phases, no algorithm can beat the competitive ratio of 4.086.
Online Pricing with Impatient Bidders
Marek Cygan, Marcin Mucha, Piotr Sankowski, Qiang Zhang
SODA 2016
pdf
Abstract
In this paper we consider the following online pricing problem. An auctioneer is selling identical items in unlimited supply, whereas each bidder from a given set is interested in purchasing a single copy of the item. Each bidder is characterized by a budget and a time interval, in which he is considering to buy the item. Bidders are willing to buy the item at the earliest time provided it is within their time intervals and the price at that time is within their budgets. We call such bidders {\em impatient bidders}. The problem is considered in the online setting, i.e., each bidder arrives at the start of his time interval, and only then an algorithm learns of his existence and his budget. The goal of the seller is to set the price of the item over time so that the total revenue is maximized.

We study two versions of the impatient bidders problem: the one introduced by Bansal et al. (TALG 2010), and a more restricted setting in which the deadline of each bidder remains unknown until it is hit. We give tight bounds for both settings. Rather surprisingly, in both cases the optimum competitive ratios are the same. In particular we prove that the competitive ratio of an optimum deterministic algorithm is $Θ(log h / log log h), whereas for randomized algorithms it is Θ(log log h).

New Bounds for Online Packing LPs
Matthias Englert, Nicolaos Matsakis, Marcin Mucha
LATIN 2014
pdf
Abstract
Solving linear programs online has been an active area of research in recent years and was used with great success to develop new online algorithms for a variety of problems. We study the setting introduced by Ochel et al. as an abstraction of lifetime optimization of wireless sensor networks.

In this setting, the online algorithm is given a packing LP and has to monotonically increase LP variables in order to maximize the objective function. However, at any point in time, the adversary only provides an α-approximation of the remaining slack for each constraint. This is designed to model scenarios in which only estimates of remaining capacities (e.g. of batteries) are known, and they get more and more accurate as the remaining capacities approach 0.

We give new upper and lower bounds for this problem.

No-Wait Flowshop Scheduling is as Hard as Asymmetric Traveling Salesman Problem
Marcin Mucha, Maxim Sviridenko
ICALP 2013
Mathematics of Operations Research 2016
pdf arXiv DOI
Abstract
In this paper we study the classical no-wait flowshop scheduling problem with makespan objective (F|no-wait|C_{max} in the standard three-field notation). This problem is well-known to be a special case of the asymmetric traveling salesman problem (ATSP) and as such has an approximation algorithm with logarithmic performance guarantee. In this work we show a reverse connection, we show that any polynomial time α-approximation algorithm for the no-wait flowshop scheduling problem with makespan objective implies the existence of a polynomial-time (α+ε)-approximation algorithm for the ATSP, for any ε>0. This in turn implies that all non-approximability results for the ATSP (current or future) will carry over to its special case. In particular, it follows that no-wait flowshop problem is APX-hard, which is the first non-approximability result for this problem.
Catch Them if You Can: Serving Impatient Users
Marek Cygan, Matthias Englert, Anupam Gupta, Marcin Mucha, Piotr Sankowski
ITCS 2013
pdf
Abstract
Consider the following problem of serving impatient users: we are given a set of customers we would like to serve. We can serve at most one customer in each time step (getting value v_i for serving customer i). At the end of each time step, each as-yet-unserved customer i leaves the system independently with probability q_i, never to return. What strategy should we use to serve customers to maximize the expected value collected?
The standard model of competitive analysis can be applied to this problem: picking the customer with maximum value gives us half the value obtained by the optimal algorithm, and using a vertex weighted online matching algorithm gives us 1-1/e ≈ 0.632 fraction of the optimum. As is usual in competitive analysis, these approximations compare to the best value achievable by an clairvoyant adversary that knows all the coin tosses of the customers. Can we do better?
We show an upper bound of ≈ 0.648 if we compare our performance to such an clairvoyant algorithm, suggesting we cannot improve our performance substantially. However, these are pessimistic comparisons to a much stronger adversary: what if we compare ourselves to the optimal strategy for this problem, which does not have an unfair advantage? In this case, we can do much better: in particular, we give an algorithm whose expected value is at least 0.7 of that achievable by the optimal algorithm. This improvement is achieved via a novel rounding algorithm, and a non-local analysis.
Lyndon Words and Short Superstrings
Marcin Mucha
SODA 2013
pdf arXiv
Abstract
In the shortest superstring problem, we are given a set of strings {s_1,…,s_k} and want to find a string that contains all s_i as substrings and has minimum length. This is a classical problem in approximation and the best known approximation factor is 2 1/2, given by Sweedyk in 1999. Since then no improvement has been made, howerever two other approaches yielding a 2 1/2-approximation algorithms have been proposed by Kaplan et al. and recently by Paluch et al. – both based on a reduction to maximum asymmetric TSP path (MAX-ATSPP) and structural results of Breslauer et al.
In this paper we give an algorithm that achieves an approximation ratio of 2 11/23, breaking through the long-standing bound of 2 1/2. We use the standard reduction of the shortest superstring problem to MAX-ATSPP. The new, somewhat surprising, algorithmic idea is to take the better of the two solutions obtained by using: (a) the currently best 2/3-approximation algorithm for MAX-ATSPP, and (b) a naive cycle-cover based 1/2-approximation algorithm. To prove that this indeed results in an improvement, we further develop a theory of string overlaps, extending the results of Breslauer et al. This theory is based on the novel use of Lyndon words, as a substitute for generic unbordered rotations and critical factorizations, as used by Breslauer et al.
A 9k Kernel for Nonseparating Independent Set in Planar Graphs
Łukasz Kowalik, Marcin Mucha
WG 2012
Theoretical Computer Science 2014
arXiv DOI DOI
Abstract
We study kernelization (a kind of efficient preprocessing) for NP-hard problems on planar graphs. Our main result is a kernel of size at most 9k vertices for the Planar Maximum Nonseparating Independent Set problem. A direct consequence of this result is that Planar Connected Vertex Cover has no kernel with at most 9/8k vertices, assuming P ≠ NP. We also show a very simple 5k-vertices kernel for Planar Max Leaf, which results in a lower bound of 5/4k vertices for the kernel of Planar Connected Dominating Set (also under P ≠ NP).
13/9-approximation for Graphic TSP
Marcin Mucha
STACS 2012
Theory of Computing Systems (STACS 2012 issue)
arXiv DOI DOI
Abstract
The Travelling Salesman Problem is one the most fundamental and most studied problems in approximation algorithms. For more than 30 years, the best algorithm known for general metrics has been Christofides’s algorithm with approximation factor of 3/2, even though the so-called Held-Karp LP relaxation of the problem is conjectured to have the integrality gap of only 4/3. Very recently, significant progress has been made for the important special case of graphic metrics, first by Oveis Gharan et al., and then by Momke and Svensson. In this paper, we provide an improved analysis for the approach introduced by Momke and Svensson yielding a bound of 13/9 on the approximation factor, as well as a bound of 19/12+ε for any ε>0 for a more general Travelling Salesman Path Problem in graphic metrics.
Approximation Algorithms for Union and Intersection Covering Problems
Marek Cygan, Fabrizio Grandoni, Stefano Leonardi, Marcin Mucha,
Marcin Pilipczuk, Piotr Sankowski
FSTTCS 2011
arXiv DOI
Abstract
In a classical covering problem, we are given a set of requests that we need to satisfy (fully or partially), by buying a subset of items at minimum cost. For example, in the k-MST problem we want to find the cheapest tree spanning at least k nodes of an edge-weighted graph. Here nodes and edges represent requests and items, respectively.
In this paper, we initiate the study of a new family of multi-layer covering problems. Each such problem consists of a collection of h distinct instances of a standard covering problem (layers), with the constraint that all layers share the same set of requests. We identify two main subfamilies of these problems: – in a union multi-layer problem, a request is satisfied if it is satisfied in at least one layer; – in an intersection multi-layer problem, a request is satisfied if it is satisfied in all layers. To see some natural applications, consider both generalizations of k-MST. Union k-MST can model a problem where we are asked to connect a set of users to at least one of two communication networks, e.g., a wireless and a wired network. On the other hand, intersection k-MST can formalize the problem of connecting a subset of users to both electricity and water.
We present a number of hardness and approximation results for union and intersection versions of several standard optimization problems: MST, Steiner tree, set cover, facility location, TSP, and their partial covering variants.
Fast Dynamic Transitive Closure with Lookahead
Marcin Mucha, Piotr Sankowski
Algorithmica 56(2) 2010
DOI
Abstract
In this paper we consider the problem of dynamic transitive closure with lookahead. We present a randomized one-sided error algorithm with updates and queries in O(nω(1,1,ε)−ε) time given a lookahead of nε operations, where ω(1,1,ε) is the exponent of multiplication of n×n matrix by n×nε matrix. For ε≤0.294 we obtain an algorithm with queries and updates in O(n2−ε) time, whereas for ε=1 the time is O(nω−1). This is essentially optimal as it implies an O(nω) algorithm for boolean matrix multiplication. We also consider the offline transitive closure in planar graphs. For this problem, we show an algorithm that requires O(nω/2) time to process n1/2 operations. We also show a modification of these algorithms that gives faster amortized queries. Finally, we give faster algorithms for restricted type of updates, so called element updates. All of the presented algorithms are randomized with one-sided error.
All our algorithms are based on dynamic algorithms with lookahead for matrix inverse, which are of independent interest.
Fast Approximation in Subspaces by Doubling Metric Decomposition
Marek Cygan, Łukasz Kowalik, Marcin Mucha, Marcin Pilipczuk, Piotr Sankowski
ESA 2010
arXiv DOI
Abstract
In this paper we propose and study a new complexity model for approximation algorithms. The main motivation are practical problems over large data sets that need to be solved many times for different scenarios, e.g., many multicast trees that need to be constructed for different groups of users. In our model we allow a preprocessing phase, when some information of the input graph G = (V,E) is stored in a limited size data structure. Next, the data structure enables processing queries of the form “solve problem A for an input S ⊆ V”. We consider problems like Steiner Forest, Facility Location, k-Median, k-Center and TSP in the case when the graph induces a doubling metric. Our main results are data structures of near-linear size that are able to answer queries in time close to linear in |S|. This improves over typical worst case reuniting time of approximation algorithms in the classical setting which is Ω(|E|) independently of the query size. In most cases, our approximation guarantees are arbitrarily close to those in the classical setting. Additionally, we present the first fully dynamic algorithm for the Steiner Tree problem.
Two Approximation Algorithms for ATSP with Strengthened Triangle Inequality
Łukasz Kowalik, Marcin Mucha
WADS 2009
pdf DOI
Abstract
In this paper, we study the asymmetric traveling salesman problem (ATSP) with strengthened triangle inequality, i.e. for some γ∈[1/2,1) the edge weights satisfy w(u,v) ≤ γ(w(u,x) + w(x,v)) for all distinct vertices u,v,x.
We present two approximation algorithms for this problem. The first one is very simple and has approximation ratio 1/[2(1−γ)], which is better than all previous results for all γ∈(1/2,1). The second algorithm is more involved but it also gives a much better approximation ratio: (2−γ)/[3(1−γ)]+O(1/n) when γ > γ0, and 1/2(1+γ)2 for any ε> 0 when γ ≤ γ0, where γ0 ≈ 0.7003.
A 7/9 – Approximation Algorithm for the Maximum Traveling Salesman Problem
Katarzyna E. Paluch, Marcin Mucha, Aleksander Mądry
APPROX-RANDOM 2009
arXiv DOI
Abstract
We give a deterministic combinatorial 7/9-approximation algorithm for the symmetric maximum traveling salesman problem
Deterministic 7/8-Approximation for the Metric Maximum TSP
Łukasz Kowalik, Marcin Mucha
APPROX-RANDOM 2008
Journal version in Theor. Comput. Sci. 410(47-49) 2009
DOI DOI
Abstract
We present the first 7/8-approximation algorithm for the maximum Traveling Salesman Problem (MAX-TSP) with triangle inequality. Our algorithm is deterministic. This improves over both the randomized algorithm of Hassin and Rubinstein with an expected approximation ratio of 7/8−O(n−1/2) and the deterministic (7/8−O(n−1/3))-approximation algorithm of Chen and Nagoya.
In the new algorithm, we extend the approach of processing local configurations using the so-called loose-ends, which we used earlier for the asymmetric maximum TSP with triangle inequality.
35/44-Approximation for Asymmetric Maximum TSP with Triangle Inequality
Łukasz Kowalik, Marcin Mucha
WADS 2007
Journal version in Algorithmica 59(2) 2011
DOI DOI
Abstract
We describe a new approximation algorithm for the asymmetric maximum traveling salesman problem (ATSP) with triangle inequality. Our algorithm achieves approximation factor 35/44 which improves on the previous 31/40 factor of Bläser, Ram and Sviridenko.
Maximum Matchings via Gaussian Elimination
Marcin Mucha, Piotr Sankowski
FOCS 2004 (Best Student Paper)
DOI
Abstract
We present randomized algorithms for finding maximum matchings in general and bipartite graphs. Both algorithms have running time O(nω), where ω is the exponent of the best known matrix multiplication algorithm. Since ω < 2.38, these algorithms break through the O(n2.5) barrier for the matching problem. They both have a very simple implementation in time O(n³) and the only non-trivial element of the O(nω) bipartite matching algorithm is the fast matrix multiplication algorithm.
Our results resolve a long-standing open question of whether Lovász’s randomized technique of testing graphs for perfect matching in time O(nω) can be extended to an algorithm that actually constructs a perfect matching.
Maximum Matchings in Planar Graphs via Gaussian Elimination
Marcin Mucha, Piotr Sankowski
ESA 2004 (Best Student Paper)
Journal Version in Algorithmica 45(1) 2006
DOI DOI
Abstract
We present a randomized algorithm for finding maximum matchings in planar graphs in time O(nω/2), where ω is the exponent of the best known matrix multiplication algorithm. Since ω<2.38, this algorithm breaks through the O(n1.5) barrier for the matching problem. This is the first result of this kind for general planar graphs. We also present an algorithm for generating perfect matchings in planar graphs uniformly at random using O(nω/2) arithmetic operations. Our algorithms are based on the Gaussian elimination approach to maximum matchings earlier used for general graphs.