ABSTRACT
We consider the problem of finding an unknown graph by using two types of queries with an additive property. Given a graph, an additive query asks for the number of edges within a set of vertices, while a cross-additive query asks for the number of edges crossing between two disjoint sets of vertices. For weighted graphs, the queries ask for the sum of the weights of those edges. These types of queries were partially motivated by DNA shotgun sequencing and the linkage discovery problem of artificial intelligence.
For a given unknown weighted graph G with n vertices and m edges, whose weights satisfy a certain mild condition, we prove that there exists a non-adaptive algorithm that finds the edges of G using O(m log n / log m) queries of either type, provided that m ≥ n^ε for any constant ε > 0. For an unweighted graph, we show that the same bound holds for the whole range of m.
This settles a conjecture of Grebinski [23] for finding an unweighted graph using additive queries. We also consider the problem of finding the Fourier coefficients of a certain class of pseudo-Boolean functions. A similar coin weighing problem is also considered.
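The two query types described in the abstract can be illustrated with a minimal sketch. The function names `additive_query` and `cross_additive_query`, the edge-dictionary representation, and the 4-vertex example graph are assumptions made for illustration only; they are not taken from the paper, and the sketch shows only what a query returns, not the reconstruction algorithm.

```python
def additive_query(edges, S):
    """Sum of weights of edges with both endpoints in S.

    `edges` maps an unordered vertex pair (u, v) to its weight;
    for an unweighted graph every weight is 1, so this is an edge count.
    """
    S = set(S)
    return sum(w for (u, v), w in edges.items() if u in S and v in S)


def cross_additive_query(edges, S, T):
    """Sum of weights of edges with one endpoint in S and the other in T."""
    S, T = set(S), set(T)
    if S & T:
        raise ValueError("S and T must be disjoint")
    return sum(
        w
        for (u, v), w in edges.items()
        if (u in S and v in T) or (u in T and v in S)
    )


# Hypothetical unweighted graph on vertices 0..3 (all weights are 1).
edges = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (0, 2): 1}

inside = additive_query(edges, {0, 1, 2})             # edges (0,1), (1,2), (0,2)
crossing = cross_additive_query(edges, {0, 1}, {2, 3})  # edges (1,2), (0,2)
print(inside, crossing)
```

For disjoint S and T, a cross-additive query can be simulated with three additive queries, since the crossing weight equals the weight inside S ∪ T minus the weights inside S and inside T.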
- M. Aigner. Combinatorial Search. Wiley, New York, 1988.
- N. Alon and V. Asodi. Learning a hidden subgraph. In Proceedings of the 31st International Colloquium on Automata, Languages and Programming (ICALP 2004), pages 110--121, 2004.
- N. Alon and V. Asodi. Learning a hidden subgraph. SIAM Journal on Discrete Mathematics, 18(4):697--712, 2005.
- N. Alon, R. Beigel, S. Kasif, S. Rudich, and B. Sudakov. Learning a hidden matching. In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2002), pages 197--206, 2002.
- N. Alon, R. Beigel, S. Kasif, S. Rudich, and B. Sudakov. Learning a hidden matching. SIAM Journal on Computing, 33(2):487--501, 2004.
- D. Angluin and J. Chen. Learning a hidden graph using O(log n) queries per edge. In Proceedings of the 17th Annual Conference on Learning Theory (COLT 2004), pages 210--223, 2004.
- D. Angluin and J. Chen. Learning a hidden hypergraph. Journal of Machine Learning Research, 7:2215--2236, 2006.
- A. L. Barabási and Z. N. Oltvai. Network biology: Understanding the cell's functional organization. Nature Reviews Genetics, 5:101--113, 2004.
- R. Beigel, N. Alon, M. S. Apaydin, L. Fortnow, and S. Kasif. An optimal procedure for gap closing in whole genome shotgun sequencing. In Proceedings of the Fifth Annual International Conference on Computational Molecular Biology (RECOMB 2001), pages 22--30, 2001.
- M. Bouvel, V. Grebinski, and G. Kucherov. Combinatorial search on graphs motivated by bioinformatics applications: A brief survey. In the 31st International Workshop on Graph-Theoretic Concepts in Computer Science (WG 2005), pages 16--27, 2005.
- N. H. Bshouty and C. Tamon. On the Fourier spectrum of monotone functions. Journal of the ACM, 43(4):747--770, 1996.
- H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23:493--509, 1952.
- S. S. Choi, K. Jung, and J. H. Kim. Almost tight upper bound for finding Fourier coefficients of k-bounded pseudo-Boolean functions. Submitted to the 21st Annual Conference on Learning Theory (COLT 2008), 2008.
- S. S. Choi, K. Jung, and B. R. Moon. Lower and upper bounds for linkage discovery. IEEE Trans. on Evolutionary Computation, 2008. In revision.
- S. S. Choi, Y. K. Kwon, and B. R. Moon. Properties of symmetric fitness functions. IEEE Trans. on Evolutionary Computation, 11(6):743--757, 2007.
- P. Erdős. On a lemma of Littlewood and Offord. Bulletin of the American Mathematical Society, 51:898--902, 1945.
- C. G. Esseen. On the Kolmogorov-Rogozin inequality for the concentration function. Z. Wahrscheinlichkeitstheorie verw. Geb., 5:210--216, 1966.
- C. G. Esseen. On the concentration function of a sum of independent random variables. Z. Wahrscheinlichkeitstheorie verw. Geb., 9:290--308, 1968.
- W. Ewens. Mathematical Population Genetics. Springer Verlag, 1979.
- I. Franklin and R. Lewontin. Is the gene the unit of selection? Genetics, 65:707--734, 1970.
- M. R. Garey, D. S. Johnson, and L. Stockmeyer. Some simplified NP-complete graph problems. Theoretical Computer Science, 1:237--267, 1976.
- D. E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, 1989.
- V. Grebinski. On the power of additive combinatorial search model. In Proceedings of the 4th Annual International Conference on Computing and Combinatorics (COCOON 1998), pages 194--203, 1998.
- V. Grebinski and G. Kucherov. Optimal query bounds for reconstructing a Hamiltonian cycle in complete graphs. In the Fifth Israel Symposium on the Theory of Computing Systems (ISTCS 1997), pages 166--173, 1997.
- V. Grebinski and G. Kucherov. Optimal reconstruction of graphs under the additive model. In Proceedings of the 5th Annual European Symposium on Algorithms (ESA 1997), pages 246--258, 1997.
- V. Grebinski and G. Kucherov. Reconstructing a Hamiltonian cycle by querying the graph: Application to DNA physical mapping. Discrete Applied Mathematics, 88:147--165, 1998.
- V. Grebinski and G. Kucherov. Optimal reconstruction of graphs under the additive model. Algorithmica, 28:104--124, 2000.
- R. B. Heckendorn and A. H. Wright. Efficient linkage discovery by limited probing. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2003), pages 1003--1014, 2003.
- R. B. Heckendorn and A. H. Wright. Efficient linkage discovery by limited probing. Evolutionary Computation, 12(4):517--545, 2004.
- J. Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. Journal of Computer and System Sciences, 55(3):42--65, 1997.
- H. Kargupta and B. Park. Gene expression and fast construction of distributed evolutionary representation. Evolutionary Computation, 9(1):1--32, 2001.
- S. A. Kauffman. Adaptation on rugged fitness landscapes. In D. Stein, editor, Lectures in the Sciences of Complexity, pages 527--618. Addison Wesley, 1989.
- S. A. Kauffman and S. Levin. Towards a general theory of adaptive walks on rugged landscapes. Journal of Theoretical Biology, 128:11--45, 1987.
- R. Lewontin. The Genetic Basis of Evolutionary Change. Columbia University Press, 1974.
- B. Lindström. On B2-sequences of vectors. Journal of Number Theory, 4:261--265, 1972.
- B. Lindström. Determining subsets by unramified experiments. In J. N. Srivastava, editor, A Survey of Statistical Designs and Linear Models, pages 407--418. North Holland, 1975.
- J. E. Littlewood and A. C. Offord. On the number of real roots of a random algebraic equation. III. Mat. Sbornik, 12:277--285, 1943.
- C. A. Macken and A. S. Perelson. Protein evolution on rugged landscapes. In Proceedings of the National Academy of Sciences, USA, volume 86, pages 6191--6195, 1989.
- Y. Mansour. Learning Boolean functions via the Fourier transform. In V. Roychowdhury, K. Y. Siu, and A. Orlitsky, editors, Theoretical Advances in Neural Computation and Learning, pages 391--424. Kluwer Academic, 1994.
- H. Mühlenbein and T. Mahnig. FDA -- A scalable evolutionary algorithm for the optimization of additively decomposed functions. Evolutionary Computation, 7(1):45--68, 1999.
- M. Pelikan, D. E. Goldberg, and E. Cantú-Paz. Linkage problem, distribution estimation, and Bayesian networks. Evolutionary Computation, 8(3):311--340, 2000.
- L. Reyzin and N. Srivastava. Learning and verifying graphs using queries with a focus on edge counting. In Proceedings of the 18th International Conference on Algorithmic Learning Theory (ALT 2007), pages 285--297, 2007.
- B. A. Rogozin. An estimate for concentration functions. Theory of Probability and its Applications, 6(1):94--97, 1961.
- B. A. Rogozin. On the increase of dispersion of sums of independent random variables. Theory of Probability and its Applications, 6(1):97--99, 1961.
- H. S. Shapiro and S. Söderberg. A combinatory detection problem. American Mathematical Monthly, 70:1066--1070, 1963.
- M. J. Streeter. Upper bounds on the time and space complexity of optimizing additively separable functions. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2004), pages 186--197, 2004.
- H. Tettelin, D. Radune, S. Kasif, H. Khouri, and S. L. Salzberg. Optimized multiplex PCR: Efficiently closing a whole-genome shotgun sequencing project. Genomics, 62:500--507, 1999.
- D. Thieffry, A. M. Huerta, E. Perez-Rueda, and J. Collado-Vides. From specific gene regulation to genomic networks: A global analysis of transcriptional regulation in Escherichia coli. BioEssays, 20(5):433--440, 1998.
- J. L. Walsh. A closed set of orthogonal functions. American Journal of Mathematics, 55:5--24, 1923.