Section: New Results
Distributed Computing
Robust Detection in Leak-Prone Population Protocols
In [10], we aim to design population protocols for the problem of detecting a signal in the presence of faults, motivated by scenarios of chemical computation. In contrast to electronic computation, chemical computation is noisy and susceptible to a variety of sources of error, which has prevented the construction of robust complex systems. To be effective, chemical algorithms must be designed with an appropriate error model in mind. Here we consider the model of chemical reaction networks that preserve molecular count (population protocols), and ask whether computation can be made robust to a natural model of unintended “leak” reactions. Our definition of leak is motivated both by the particular spurious behavior seen when implementing chemical reaction networks with DNA strand displacement cascades and by the unavoidable side reactions in any implementation due to the basic laws of chemistry. We develop a new “Robust Detection” algorithm for the problem of fast (logarithmic time) single-molecule detection, and prove that it is robust to this general model of leaks. Besides potential applications in single-molecule detection, the error-correction ideas developed here might enable a new class of robust-by-design chemical algorithms. Our analysis is based on a non-standard hybrid argument, combining ideas from discrete analysis of population protocols with classic Markov chain techniques.
Minimizing Message Size in Stochastic Communication Patterns: Fast Self-Stabilizing Protocols with 3 bits
In [12] we consider the basic PULL model of communication, in which, in each round, each agent extracts information from a few randomly chosen agents. We seek to identify the smallest amount of information revealed in each interaction (message size) that nevertheless allows for efficient and robust computations of fundamental information dissemination tasks. We focus on the Majority Bit Dissemination problem that considers a population of $n$ agents, with a designated subset of source agents. Each source agent holds an input bit and each agent holds an output bit. The goal is to let all agents converge their output bits on the most frequent input bit of the sources (the majority bit). Note that the particular case of a single source agent corresponds to the classical problem of Broadcast (also termed Rumor Spreading). We concentrate on the severe fault-tolerant context of self-stabilization, in which a correct configuration must be reached eventually, despite all agents starting the execution with arbitrary initial states. In particular, the specification of which agents are sources and what their input bits are may be set by an adversary.
We first design a general compiler which can essentially transform any self-stabilizing algorithm with a certain property that uses $\ell$-bit messages into one that uses only $\log \ell$-bit messages, while paying only a small penalty in the running time. By applying this compiler recursively we then obtain a self-stabilizing Clock Synchronization protocol, in which agents synchronize their clocks modulo some given integer $T$, within $\tilde{O}(\log n \log T)$ rounds w.h.p., and using messages that contain 3 bits only.
We then employ the new Clock Synchronization tool to obtain a self-stabilizing Majority Bit Dissemination protocol which converges in $\tilde{O}(\log n)$ time, w.h.p., on every initial configuration, provided that the ratio of sources supporting the minority opinion is bounded away from half. Moreover, this protocol also uses only 3 bits per interaction.
The ANTS Problem
In [6] we introduce the Ants Nearby Treasure Search (ANTS) problem, which models natural cooperative foraging behavior such as that performed by ants around their nest. In this problem, $k$ probabilistic agents, initially placed at a central location, collectively search for a treasure on the two-dimensional grid. The treasure is placed at a target location by an adversary and the agents' goal is to find it as fast as possible as a function of both $k$ and $D$, where $D$ is the (unknown) distance between the central location and the target. We concentrate on the case in which agents cannot communicate while searching. It is straightforward to see that the time until at least one agent finds the target is at least $\Omega(D+D^{2}/k)$, even for very sophisticated agents with unrestricted memory. Our algorithmic analysis aims at establishing connections between the time complexity and the initial knowledge held by agents (e.g., regarding their total number $k$), as they commence the search. We provide a range of both upper and lower bounds for the initial knowledge required for obtaining fast running time. For example, we prove that $\log\log k+\Theta(1)$ bits of initial information are both necessary and sufficient to obtain asymptotically optimal running time, i.e., $O(D+D^{2}/k)$. We also prove that for every $0<\epsilon<1$, running in time $O(\log^{1-\epsilon}k\cdot(D+D^{2}/k))$ requires that agents have the capacity for storing $\Omega(\log^{\epsilon}k)$ different states as they leave the nest to start the search. To the best of our knowledge, the lower bounds presented in this paper provide the first non-trivial lower bounds on the memory complexity of probabilistic agents in the context of search problems.
We view this paper as a “proof of concept” for a new type of interdisciplinary methodology. To fully demonstrate this methodology, the theoretical tradeoff presented here (or a similar one) should be combined with measurements of the time performance of searching ants.
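The $\Omega(D+D^{2}/k)$ lower bound stated above can be recovered by a short counting argument. The following is only a sketch in our own notation: the annulus $A_D$ and the constant $c$ are introduced here for illustration and are not taken from [6]. Let $A_D$ be the set of grid nodes at distance between $D/2$ and $D$ from the central location, so that $|A_D|=\Theta(D^{2})$. Each agent occupies at most $t+1$ nodes during time steps $0,\ldots,t$, so the expected number of nodes of $A_D$ visited by time $t$ is at most $k(t+1)$:

$$\sum_{u\in A_D}\Pr[\,u \text{ is visited by time } t\,]\;\le\;k\,(t+1),\qquad\text{hence}\qquad \min_{u\in A_D}\Pr[\,u \text{ is visited by time } t\,]\;\le\;\frac{k\,(t+1)}{|A_D|}.$$

If $k\le cD^{2}$ and $t=cD^{2}/k$ for a sufficiently small constant $c>0$, the right-hand side is at most $1/2$, so placing the treasure at a minimizing node $u$ forces an expected search time of at least $t/2=\Omega(D^{2}/k)$. Reaching any node of $A_D$ takes $\Omega(D)$ steps, and when $k>cD^{2}$ the term $D^{2}/k$ is $O(1)$, so the two bounds together give $\Omega(D+D^{2}/k)$.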
Breathe before Speaking: Efficient Information Dissemination despite Noisy, Limited and Anonymous Communication
Distributed computing models typically assume reliable communication between processors. While such assumptions often hold for engineered networks, e.g., due to underlying error correction protocols, their relevance to biological systems, wherein messages are often distorted before reaching their destination, is quite limited. In this study we take a first step towards reducing this gap by rigorously analyzing a model of communication in large anonymous populations composed of simple agents which interact through short and highly unreliable messages.
In [9] we focus on the broadcast problem and the majority-consensus problem. Both are fundamental information dissemination problems in distributed computing, in which the goal of agents is to converge to some prescribed desired opinion. We initiate the study of these problems in the presence of communication noise. Our model for communication is extremely weak and follows the push gossip communication paradigm: in each round, each agent that wishes to send information delivers a message to a random anonymous agent. This communication is further restricted to contain only one bit (essentially representing an opinion). Lastly, the system is assumed to be so noisy that the bit in each message sent is flipped independently with probability $1/2-\epsilon$, for some small $\epsilon>0$.
Even in this severely restricted, stochastic and noisy setting we give natural protocols that solve the noisy broadcast and the noisy majority-consensus problems efficiently. Our protocols run in $O(\log n/\epsilon^{2})$ rounds and use $O(n\log n/\epsilon^{2})$ messages/bits in total, where $n$ is the number of agents. These bounds are asymptotically optimal and, in fact, are as fast and message efficient as if each agent would have been simultaneously informed directly by an agent that knows the prescribed desired opinion. Our efficient, robust, and simple algorithms suggest balancing between silence and transmission, synchronization, and majority-based decisions as important ingredients towards understanding collective communication schemes in anonymous and noisy populations.
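To give some intuition for the $\log n/\epsilon^{2}$ factor, the sketch below looks at the idealized benchmark mentioned above: an agent that receives $m$ independent copies of the correct bit, each flipped with probability $1/2-\epsilon$, and outputs the majority. This is not the protocol of [9]. The function names are hypothetical, and the error estimate is the standard Hoeffding bound $\exp(-2\epsilon^{2}m)$, which is our choice of illustration.

```python
import math
import random


def noisy_copy(bit, eps, rng):
    """Return `bit`, flipped with probability 1/2 - eps (the noise model above)."""
    return bit ^ 1 if rng.random() < 0.5 - eps else bit


def majority_of_noisy_copies(bit, eps, m, rng):
    """Majority vote over m independent noisy copies of `bit` (ties go to 0)."""
    ones = sum(noisy_copy(bit, eps, rng) for _ in range(m))
    return 1 if 2 * ones > m else 0


def hoeffding_error_bound(eps, m):
    """Hoeffding upper bound on the probability that the majority is wrong."""
    return math.exp(-2 * eps * eps * m)


def samples_for_error(eps, delta):
    """Smallest m for which the Hoeffding bound is at most delta."""
    return math.ceil(math.log(1 / delta) / (2 * eps * eps))


if __name__ == "__main__":
    rng = random.Random(0)
    eps, n = 0.1, 1000
    # Target error 1/n^2 per agent, so m grows like log(n) / eps^2.
    m = samples_for_error(eps, 1 / n**2)
    wrong = sum(majority_of_noisy_copies(1, eps, m, rng) != 1 for _ in range(n))
    print(m, hoeffding_error_bound(eps, m), wrong)
```

Setting the target error to $1/n^{2}$ gives $m=\lceil \ln(n^{2})/(2\epsilon^{2})\rceil$, which matches the order of the round complexity quoted above.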
Parallel Search with no Coordination
In [23] we consider a parallel version of a classical Bayesian search problem. Here, $k$ agents are looking for a treasure that is placed in one of finitely many boxes according to a known distribution $p$. The aim is to minimize the expected time until the first agent finds it. Searchers run in parallel, and at each time step each searcher can “peek” into a box. A basic family of algorithms which are inherently robust is that of non-coordinating algorithms. Such algorithms act independently at each searcher, differing only by their probabilistic choices. We are interested in the price incurred by employing such algorithms when compared with the case of full coordination.
We first show that there exists a non-coordinating algorithm that, knowing only the relative likelihood of boxes according to $p$, has expected running time of at most $10+4(1+\frac{1}{k})^{2}T$, where $T$ is the expected running time of the best fully coordinated algorithm. This result is obtained by applying a refined version of the main algorithm suggested by Fraigniaud, Korman and Rodeh in STOC'16, which was designed for the context of linear parallel search.
We then describe an optimal non-coordinating algorithm for the case where the distribution $p$ is known. The running time of this algorithm is difficult to analyse in general, but we calculate it for several examples. In the case where $p$ is uniform over a finite set of boxes, the algorithm just checks boxes uniformly at random among all unchecked boxes and is essentially 2 times worse than the coordinating algorithm. We also show simple algorithms for Pareto distributions over $M$ boxes. That is, in the case where $p(x)\sim 1/x^{b}$ for $0<b<1$, we suggest the following algorithm: at step $t$ choose uniformly from the boxes unchecked in $\{1,\ldots,\min(M,\lfloor t/\sigma\rfloor)\}$, where $\sigma=b/(b+k-1)$. It turns out this algorithm is asymptotically optimal, and runs about $2+b$ times worse than the case of full coordination.
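The rule for Pareto distributions described above can be sketched in a few lines. This is a minimal illustration under our own assumptions, not code from [23]: the names `pareto_window`, `run_searcher` and `parallel_search` are hypothetical, boxes are numbered $1,\ldots,M$, “unchecked” is read as “not yet checked by this searcher” (searchers do not coordinate), and $\sigma=b/(b+k-1)$.

```python
import math
import random


def pareto_window(t, M, b, k):
    """Largest box index eligible at step t: min(M, floor(t / sigma))."""
    # t / sigma with sigma = b / (b + k - 1), written without dividing twice.
    w = math.floor(t * (b + k - 1) / b)
    # sigma <= 1, so floor(t / sigma) >= t; max() only guards float rounding.
    return min(M, max(t, w))


def run_searcher(M, b, k, treasure, rng):
    """Simulate one searcher; return the step at which it opens `treasure`."""
    checked = set()
    for t in range(1, M + 1):
        window = pareto_window(t, M, b, k)
        candidates = [x for x in range(1, window + 1) if x not in checked]
        box = rng.choice(candidates)  # uniform over this searcher's unchecked boxes
        checked.add(box)
        if box == treasure:
            return t
    raise AssertionError("unreachable: all M boxes are checked after M steps")


def parallel_search(M, b, k, treasure, rng):
    """Time until the first of k independent searchers finds the treasure."""
    return min(run_searcher(M, b, k, treasure, rng) for _ in range(k))


if __name__ == "__main__":
    rng = random.Random(0)
    print(parallel_search(M=1000, b=0.5, k=8, treasure=37, rng=rng))
```

With $k=1$ we get $\sigma=1$, the window at step $t$ is $\{1,\ldots,t\}$, and the rule reduces to opening the boxes in order, which is the natural sequential search.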
Wait-free local algorithms
When considering distributed computing, reliable message-passing synchronous systems on the one side, and asynchronous failure-prone shared-memory systems on the other side, remain two quite independently studied ends of the reliability/asynchrony spectrum. The concept of locality of a computation is central to the first one, while the concept of wait-freedom is central to the second one. In [2] we propose a new Decoupled model in an attempt to reconcile these two worlds. It consists of a synchronous and reliable communication graph of $n$ nodes, and on top a set of asynchronous crash-prone processes, each attached to a communication node. To illustrate the Decoupled model, the paper presents an asynchronous 3-coloring algorithm for the processes of a ring. From the processes' point of view, the algorithm is wait-free. From a locality point of view, each process uses information only from processes at distance $O(\log^{*}n)$ from it. This local wait-free algorithm is based on an extension of the classical vertex coloring algorithm of Cole and Vishkin in which the processes are not required to start simultaneously.
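For readers unfamiliar with the classical technique mentioned above, the following sketch shows the standard synchronous Cole and Vishkin color-reduction step on an oriented ring. It is not the wait-free extension of [2]: all nodes are assumed to start simultaneously, the function names are hypothetical, and the final reduction from 6 to 3 colors is omitted.

```python
def cv_step(color, succ_color):
    """One Cole-Vishkin step for a node, given its successor's current color.

    The new color encodes the lowest bit position i where the two colors
    differ, together with the node's own bit at that position.
    """
    diff = color ^ succ_color
    if diff == 0:
        raise ValueError("adjacent nodes must have distinct colors")
    i = (diff & -diff).bit_length() - 1  # index of the lowest differing bit
    return 2 * i + ((color >> i) & 1)


def reduce_to_six_colors(ids):
    """Iterate cv_step on a ring until every color lies in {0, ..., 5}.

    ids[v] is the initial identifier of node v and node (v + 1) % n is its
    successor. Identifiers of ring neighbors must be distinct.
    Returns the final colors and the number of synchronous rounds used.
    """
    n = len(ids)
    if n < 3:
        raise ValueError("a ring needs at least 3 nodes")
    colors = list(ids)
    rounds = 0
    while max(colors) > 5:
        # All nodes update simultaneously from the previous round's colors.
        colors = [cv_step(colors[v], colors[(v + 1) % n]) for v in range(n)]
        rounds += 1
    return colors, rounds


if __name__ == "__main__":
    colors, rounds = reduce_to_six_colors([12, 7, 33, 2, 19, 40, 5, 28])
    print(colors, rounds)
```

Each step keeps the coloring proper: if a node and its successor computed the same new color, they would agree on the position $i$ and on their bit at position $i$, contradicting the choice of $i$ as a position where their colors differ. Colors of $L$ bits shrink to values below $2L$, which is the source of the $\log^{*}n$ behavior.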
Immediate $t$-resilient Snapshot
An immediate snapshot object is a high-level communication object, built on top of a read/write distributed system in which all but one of the processes may crash. It allows each process to write a value and obtain a set of pairs (process id, value) such that, despite process crashes and asynchrony, the sets obtained by the processes satisfy noteworthy inclusion properties. Considering an $n$-process model in which up to $t$ processes are allowed to crash, [14] addresses the construction of $t$-resilient immediate snapshot objects.
Decidability classes for mobile agents computing
In [7], we establish a classification of decision problems that are to be solved by mobile agents operating in unlabeled graphs, using a deterministic protocol. The classification is with respect to the ability of a team of agents to solve decision problems, possibly with the aid of additional information. In particular, our focus is on studying differences between the decidability of a decision problem by agents and its verifiability when a certificate for a positive answer is provided to the agents (the latter is to the former what NP is to P in the framework of sequential computing). We show that the class MAV of mobile agents verifiable problems is much wider than the class MAD of mobile agents decidable problems. Our main result shows that there exist natural MAV-complete problems: the most difficult problems in this class, to which all problems in MAV are reducible via a natural mobile computing reduction. Beyond the class MAV we show that, for a single agent, three natural oracles yield a strictly increasing chain of relative decidability classes.
Distributed Detection of Cycles
Distributed property testing in networks has been introduced by Brakerski and Patt-Shamir (2011), with the objective of detecting the presence of large dense sub-networks in a distributed manner. Recently, Censor-Hillel et al. (2016) have shown how to detect 3-cycles in a constant number of rounds by a distributed algorithm. In a follow-up work, Fraigniaud et al. (2016) have shown how to detect 4-cycles in a constant number of rounds as well. However, the techniques in these latter works were shown not to generalize to larger cycles $C_{k}$ with $k\ge 5$. In [19], we completely settle the problem of cycle detection, by establishing the following result. For every $k\ge 3$, there exists a distributed property testing algorithm for $C_{k}$-freeness, performing in a constant number of rounds. All these results hold in the classical CONGEST model for distributed network computing. Our algorithm is 1-sided error. Its round-complexity is $O(1/\epsilon)$ where $\epsilon\in(0,1)$ is the property testing parameter measuring the gap between legal and illegal instances.
What Can Be Verified Locally?
In [18], we consider distributed network computing, in which computing entities are connected by a network modeled as a connected graph. These entities are located at the nodes of the graph, and they exchange information by message-passing along its edges. In this context, we adopt the classical framework for local distributed decision, in which nodes must collectively decide whether their network configuration satisfies some given boolean predicate, by having each node interact with the nodes in its vicinity only. A network configuration is accepted if and only if every node individually accepts. It is folklore that not every Turing-decidable network property (e.g., whether the network is planar) can be decided locally whenever the computing entities are Turing machines (TM). On the other hand, it is known that every Turing-decidable network property can be decided locally if nodes are running non-deterministic Turing machines (NTM). However, this holds only if the nodes have the ability to guess the identities of the nodes currently in the network. That is, for different sets of identities assigned to the nodes, the correct guesses of the nodes might be different. If one asks the nodes to use the same guess in the same network configuration even with different identity assignments, i.e., to perform identity-oblivious guesses, then it is known that not every Turing-decidable network property can be decided locally.
We show that every Turing-decidable network property can be decided locally if nodes are running alternating Turing machines (ATM), and this holds even if nodes are restricted to performing identity-oblivious guesses. More specifically, we show that, for every network property, there is a local algorithm for ATMs, with at most 2 alternations, that decides that property. To this aim, we define a hierarchy of classes of decision tasks where the lowest level contains tasks solvable with TMs, the first level those solvable with NTMs, and level $k$ contains those tasks solvable with ATMs with $k$ alternations. We characterize the entire hierarchy, and show that it collapses at the second level. In addition, we show separation results between the classes of network properties that are locally decidable with TMs, NTMs, and ATMs, and we establish completeness results for each of these classes, using novel notions of local reduction.
Certification of Compact Low-Stretch Routing Schemes
On the one hand, the correctness of routing protocols in networks is an issue of utmost importance for guaranteeing the delivery of messages from any source to any target. On the other hand, a large collection of routing schemes have been proposed during the last two decades, with the objective of transmitting messages along short routes, while keeping the routing tables small. Regrettably, all these schemes share the property that an adversary may modify the content of the routing tables with the objective of, e.g., blocking the delivery of messages between some pairs of nodes, without being detected by any node.
In [17], we present a simple certification mechanism which enables the nodes to locally detect any alteration of their routing tables. In particular, we show how to locally verify the stretch-3 routing scheme by Thorup and Zwick [SPAA 2001] by adding certificates of $\tilde{O}(\sqrt{n})$ bits at each node in $n$-node networks, that is, by keeping the memory size of the same order of magnitude as the original routing tables. We also propose a new name-independent routing scheme using routing tables of size $\tilde{O}(\sqrt{n})$ bits. This new routing scheme can be locally verified using certificates of $\tilde{O}(\sqrt{n})$ bits. Its stretch is 3 if using handshaking, and 5 otherwise.
Error-Sensitive Proof-Labeling Schemes
Proof-labeling schemes are known mechanisms providing nodes of networks with certificates that can be verified locally by distributed algorithms. Given a boolean predicate on network states, such schemes make it possible to check whether the predicate is satisfied by the actual state of the network, by having nodes interact with their neighbors only. Proof-labeling schemes are typically designed for enforcing fault-tolerance, by making sure that if the current state of the network is illegal with respect to some given predicate, then at least one node will detect it. Such a node can raise an alarm, or launch a recovery procedure enabling the system to return to a legal state. We introduce error-sensitive proof-labeling schemes. These are proof-labeling schemes which guarantee that the number of nodes detecting illegal states is linearly proportional to the edit-distance between the current state and the set of legal states. By using error-sensitive proof-labeling schemes, states which are far from satisfying the predicate will be detected by many nodes, enabling fast return to legality. In [20], we provide a structural characterization of the set of boolean predicates on network states for which there exist error-sensitive proof-labeling schemes. This characterization allows us to show that classical predicates such as acyclicity and leader admit error-sensitive proof-labeling schemes, while others like regular subgraphs do not. We also focus on compact error-sensitive proof-labeling schemes. In particular, we show that the known proof-labeling schemes for spanning tree and MST, using certificates of $O(\log n)$ bits and $O(\log^{2}n)$ bits, respectively, are error-sensitive, as long as the trees are locally represented by adjacency lists, and not by a pointer to the parent.
Distributed Property Testing
In [16], we designed distributed testing algorithms of graph properties in the CONGEST model [Censor-Hillel et al. 2016], especially for testing subgraph-freeness. Testing a given property means that we have to distinguish between graphs having the property, and graphs that are $\epsilon$-far from having it, meaning that one must remove an $\epsilon$-fraction of the edges to obtain it. We established a series of results, among which:

Testing $H$-freeness in a constant number of rounds, for any graph $H$ that can be transformed into a tree by removing a single edge. This includes, e.g., cycle-freeness for any constant cycle, and $K_{4}$-freeness. As a by-product, we give a deterministic CONGEST protocol determining whether a graph contains a fixed tree as a subgraph.

For cliques $K_{k}$ with $k\ge 5$, we show that $K_{k}$-freeness can be tested in $O\left(\left(\frac{m}{\epsilon}\right)^{\frac{1}{2}+\frac{1}{k-2}}\right)$ rounds, where $m$ is the number of edges in the network graph.

We describe a general procedure for converting $\epsilon$-testers with $f(D)$ rounds, where $D$ denotes the diameter of the graph, to work in $O((\log n)/\epsilon)+f((\log n)/\epsilon)$ rounds, where $n$ is the number of processors of the network. We then apply this procedure to obtain an $\epsilon$-tester for testing whether a graph is bipartite.
These protocols extend and improve previous results of [Censor-Hillel et al. 2016] and [Fraigniaud et al. 2016].