Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis
Resource
Journal of Computer & System Sciences 65,570-586
Journal
Journal of Computer and System Sciences
Journal Volume
65
Journal Issue
3
Pages
570 - 586
Date Issued
2002
Date
2002
Author(s)
DOI
20060927122844992215
Abstract
We study two fundamental problems concerning the search for interesting regions in
sequences:(i)given a sequence of real numbers of length n and an upper bound U摯瑬敳獩 find a
consecutive subsequence of lengthat most U withth maximum sum and (ii)given a sequence
of real numbers of length n and a lower bound L摯瑬敳獩 find a consecutive subsequence of lengthat
least L withth maximum average.We present an O摯瑬敳獩 n捡牯渀 ⴀ琀椀洀攀 愀氀最漀爀椀琀栀洀 昀漀爀 琀栀攀Ā爀猀琀 瀀爀漀戀氀攀洀ഀ愀渀搀 愀渀 佤潴汥獳椀 渀 氀漀最 䱣慲潮 -time algorithm for the second.The algorithms have potential applications in
several areas of biomolecular sequence analysis including locating GC-richregions in a
genomic DNA sequence,post-processing sequence alignments,annotating multiple sequence
alignments,and computing length-constrained ungapped local alignment.Our preliminary
tests on both simulated and real data demonstrate that the algorithms are very ef ficient and
able to locate useful (suchas GC-rich)regions.
Subjects
Algorithm
Efficiency
Maximum consecutive subsequence
Length constraint
Biomolecular sequence analysis
Ungapped local alignment
Type
journal article
File(s)![Thumbnail Image]()
Loading...
Name
2002_jcss.pdf
Size
305.94 KB
Format
Adobe PDF
Checksum
(MD5):7de9d6159c478c84eaf3cb11c6974707
