Abstract
Given a sequence of n real numbers \(A = (a_1, a_2, \ldots, a_n)\), two integers L and U with 1 ≤ L ≤ U ≤ n, and a score function \(f:\mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}\), the Length-Constrained Max-Score Segment Problem is to find a segment \(A[i,j] = (a_i, a_{i+1}, \ldots, a_j)\) maximizing \(f(j-i+1,\sum_{h=i}^j a_h)\) subject to j − i + 1 ∈ [L,U]. In this paper, we solve the Length-Constrained Max-Score Segment Problem for the case where the score function is \(f(\ell,w)=\frac{w}{\sqrt[r]{\ell}}\) for any constant r > 1. Our algorithm runs in \(O(n\frac{T(L^{1/2})}{L^{1/2}})\) time, where T(n′) is the time required to solve the all-pairs shortest paths problem on a graph of n′ nodes. By the latest result of Chan [7], \(T(n')=O(n'^3 \frac{(\log\log n')^3}{(\log n')^2})\), so our algorithm runs in subquadratic time \(O(nL\frac{(\log\log L)^3}{(\log L)^2})\). Lipson et al. [21] studied a more restricted case where the score function is \(f(\ell,w)=\frac{w}{\sqrt[2]{\ell}}\) and there are no length constraints, i.e., L = 1 and U = n. They also showed how to apply their algorithm to analyzing DNA copy number data. However, their algorithm takes \(\Omega(n^2)\) time in the worst case. Since the length lower bound L for the case considered by Lipson et al. is a constant, our algorithm solves it in O(n) time.
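To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper: it simply tries every admissible segment using prefix sums, which takes O(n(U − L + 1)) time. The function name `max_score_segment` and its parameters are assumptions chosen for illustration; only the score function \(f(\ell,w)=w/\sqrt[r]{\ell}\) and the length constraint come from the abstract.

```python
from itertools import accumulate


def max_score_segment(a, L, U, r=2.0):
    """Naive baseline (hypothetical helper, not the paper's method).

    Returns (score, i, j) with 1-based inclusive indices such that
    sum(a[i..j]) / (j - i + 1) ** (1 / r) is maximal over all
    segments whose length lies in [L, U].
    """
    n = len(a)
    if not (1 <= L <= U <= n):
        raise ValueError("require 1 <= L <= U <= n")
    if r <= 1:
        raise ValueError("require r > 1")

    # prefix[k] holds a_1 + ... + a_k, so a segment sum is one subtraction.
    prefix = [0.0, *accumulate(a)]

    best = None
    for i in range(n):  # 0-based start of the segment
        for length in range(L, min(U, n - i) + 1):
            j = i + length - 1  # 0-based end of the segment
            w = prefix[j + 1] - prefix[i]
            score = w / length ** (1.0 / r)
            if best is None or score > best[0]:
                best = (score, i + 1, j + 1)
    return best
```

For example, `max_score_segment([1, -2, 3, 4, -1], 1, 5)` selects the segment from index 3 to index 4 (the values 3 and 4), whose score is 7/√2. Setting L = 1 and U = n with r = 2 corresponds to the unconstrained case attributed to Lipson et al. in the abstract.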
References
Aho, A., Hopcroft, J., Ullman, J.: The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading (1974)
Bellman, R., Karush, W.: Mathematical Programming and the Maximum Transform. Journal of the Society for Industrial and Applied Mathematics 10(3), 550–567 (1962)
Bergkvist, A., Damaschke, P.: Fast Algorithms for Finding Disjoint Subsequences with Extremal Densities. Pattern Recognition 39(12), 2281–2292 (2006)
Bernholt, T., Eisenbrand, F., Hofmeister, T.: A Geometric Framework for Solving Subsequence Problems in Computational Biology Efficiently. In: SoCG, pp. 310–318 (2007)
Bernholt, T., Hofmeister, T.: An Algorithm for a Generalized Maximum Subsequence Problem. In: Correa, J.R., Hevia, A., Kiwi, M. (eds.) LATIN 2006. LNCS, vol. 3887, pp. 178–189. Springer, Heidelberg (2006)
Bremner, D., Chan, T., Demaine, E., Erickson, J., Hurtado, F., Iacono, J., Langerman, S., Streinu, I., Taslakian, P.: Necklaces, Convolutions, and X+Y. In: Azar, Y., Erlebach, T. (eds.) ESA 2006. LNCS, vol. 4168, pp. 160–171. Springer, Heidelberg (2006)
Chan, T.M.: More Algorithms for All-Pairs Shortest Paths in Weighted Graphs. In: STOC (to appear, 2007)
Chung, K.-M., Lu, H.-I.: An Optimal Algorithm for the Maximum-Density Segment Problem. SIAM Journal on Computing 34(2), 373–387 (2004)
de Berg, M., van Kreveld, M., Overmars, M., Schwarzkopf, O.: Computational Geometry: Algorithms and Applications, 2nd edn. Springer, Heidelberg (2000)
Fan, T.-H., Lee, S., Lu, H.-I., Tsou, T.-S., Wang, T.-C., Yao, A.: An Optimal Algorithm for Maximum-Sum Segment and Its Application in Bioinformatics (Extended Abstract). In: Ibarra, O.H., Dang, Z. (eds.) CIAA 2003. LNCS, vol. 2759, pp. 251–257. Springer, Heidelberg (2003)
Felzenszwalb, P., Huttenlocher, D.: Distance Transforms of Sampled Functions. Technical Report TR2004-1963, Cornell Computing and Information Science (2004)
Feuk, L., Carson, A.R., Scherer, S.W.: Structural variation in the human genome. Nature Reviews Genetics 7, 85–97 (2006)
Goldwasser, M., Kao, M.-Y., Lu, H.-I.: Linear-Time Algorithms for Computing Maximum-Density Sequence Segments with Bioinformatics Applications. Journal of Computer and System Sciences 70(2), 128–144 (2005)
Han, Y.: An \(O(n^3 (\log\log n/\log n)^{5/4})\) Time Algorithm for All Pairs Shortest Path. In: Azar, Y., Erlebach, T. (eds.) ESA 2006. LNCS, vol. 4168, pp. 411–417. Springer, Heidelberg (2006)
Hogg, R.V., Tanis, E.A.: Probability and Statistical Inference, 7th edn. (2005)
Huang, X.: An Algorithm for Identifying Regions of a DNA Sequence that Satisfy a Content Requirement. Computer Applications in the Biosciences 10(3), 219–225 (1994)
Iafrate, A.J., Feuk, L., Rivera, M.N., Listewnik, M.L., Donahoe, P.K., Qi, Y., Scherer, S.W., Lee, C.: Detection of Large-Scale Variation in the Human Genome. Nature Genetics 36(9), 949–951 (2004)
Kim, S.K.: Linear-Time Algorithm for Finding a Maximum-Density Segment of a Sequence. Information Processing Letters 86(6), 339–342 (2003)
Komura, D., Shen, F., Ishikawa, S., Fitch, K.R., Chen, W., Zhang, J., Liu, G., Ihara, S., Nakamura, H., Hurles, M.E., et al.: Genome-wide Detection of Human Copy Number Variations Using High-Density DNA Oligonucleotide Arrays. Genome Research 16(12), 1575–1584 (2006)
Lin, Y.-L., Jiang, T., Chao, K.-M.: Efficient Algorithms for Locating the Length-Constrained Heaviest Segments with Applications to Biomolecular Sequence Analysis. Journal of Computer and System Sciences 65(3), 570–586 (2002)
Lipson, D., Aumann, Y., Ben-Dor, A., Linial, N., Yakhini, Z.: Efficient Calculation of Interval Scores for DNA Copy Number Data Analysis. In: McLysaght, A., Huson, D.H. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3678, pp. 83–100. Springer, Heidelberg (2005)
Maragos, P.: Differential Morphology. Nonlinear Image Processing, 289–329 (2000)
Moreau, J.-J.: Inf-Convolution, Sous-Additivité, Convexité Des Fonctions Numériques. Journal de Mathématiques Pures et Appliquées 49, 109–154 (1970)
Pinkel, D., Segraves, R., Sudar, D., Clark, S., Poole, I., Kowbel, D., Collins, C., Kuo, W.-L., Chen, C., Zhai, Y., et al.: High Resolution Analysis of DNA Copy Number Variation Using Comparative Genomic Hybridization to Microarrays. Nature Genetics 20(2), 207–211 (1998)
Pollack, J.R., Perou, C.M., Alizadeh, A.A., Eisen, M.B., Pergamenschikov, A., Williams, C.F., Jeffrey, S.S., Botstein, D., Brown, P.O.: Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nature Genetics 23(1), 41–46 (1999)
Redon, R., Ishikawa, S., Fitch, K.R., Feuk, L., Perry, G.H., Andrews, T.D., Fiegler, H., Shapero, M.H., Carson, A.R., Chen, W.: Global Variation in Copy Number in the Human Genome. Nature 444, 444–454 (2006)
Rockafellar, R.T.: Convex Analysis (1970)
Sebat, J., Lakshmi, B., Troge, J., Alexander, J., Young, J., Lundin, P., Måner, S., Massa, H., Walker, M., Chi, M., et al.: Large-Scale Copy Number Polymorphism in the Human Genome. Science 305(23), 525–528 (2004)
Strömberg, T.: The Operation of Infimal Convolution. Dissertationes Mathematicae 352, 58 (1996)
Takaoka, T.: Efficient Algorithms for the Maximum Subarray Problem by Distance Matrix Multiplication. Electronic Notes in Theoretical Computer Science 61, 191–200 (2002)
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, HF., Chen, PA., Chao, KM. (2007). Algorithms for Computing the Length-Constrained Max-Score Segments with Applications to DNA Copy Number Data Analysis. In: Tokuyama, T. (eds) Algorithms and Computation. ISAAC 2007. Lecture Notes in Computer Science, vol 4835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77120-3_72
DOI: https://doi.org/10.1007/978-3-540-77120-3_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77118-0
Online ISBN: 978-3-540-77120-3
eBook Packages: Computer Science, Computer Science (R0)