Group: Core::CSB / Package: Protein representation

Package: Protein representation / Class: Polypeptide chain representation. Gives access to a number of high-level accessors and iterators to manipulate a polypeptide chain. Allows access, within a single class, to the three structures described in the previous sub-section.

Package: Protein representation / Class: Protein representation. Gives access to a number of specified polypeptide chains from a protein quaternary structure.

The SBL provides many applications which rely on different structures tied to a polypeptide chain:
- topological information, i.e. the covalent bonds (Group: Core::CSB / Package: Molecular covalent structure)
Package: Molecular distances / Class: SBL::CSB::RMSD comb for motifs. We provide a new class for the Molecular distances package. Given a set of structural motifs, this class builds the motif graph

Package: Molecular distances / Class: SBL::Modules::RMSD comb for motifs module. We provide the module enabling the use of the previous class in a workflow.

Group: SBL::Applications / Package: Molecular distances flexible. We provide an application which, given a set of polypeptide chains as well as "subdomain" definitions (labeled residue ranges), computes the RMSD Comb. The specification of labels is provided by SBL::Models::MolecularSystemLabelTraits. Example specification files can be found in the documentation. The application provides three executables:
- sbl-flexible-rmsd-proteins
- sbl-flexible-rmsd-conformations.exe is used to compare conformations of an identical protein
- sbl-flexible-rmsd-motifs.exe is used to compute the RMSD Comb. of two chains with user-specified structural motifs
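The combination step behind the RMSD Comb. can be illustrated with a minimal sketch. It assumes that the combined value is the size-weighted quadratic mean of the per-region lRMSD values, with weights proportional to the number of atoms or residues in each labeled region. The function name `rmsd_comb` and its arguments are hypothetical and are not part of the SBL API; the per-region lRMSD values are taken as given rather than computed.

```python
import math


def rmsd_comb(lrmsds, sizes):
    """Combine per-region lRMSD values into a single number.

    Assumption: the combined value is sqrt(sum_i w_i * lrmsd_i^2 / sum_i w_i),
    where w_i is the size of region i.
    """
    if len(lrmsds) != len(sizes) or not lrmsds:
        raise ValueError("need one size per lRMSD value, and at least one region")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("region sizes must sum to a positive number")
    # Weighted mean of the squared deviations, then back to a distance.
    weighted_squares = sum(w * r * r for r, w in zip(lrmsds, sizes))
    return math.sqrt(weighted_squares / total)


if __name__ == "__main__":
    # Two equally sized regions, one nearly rigid and one flexible.
    print(rmsd_comb([1.0, 3.0], [10, 10]))
```

With a single region the combined value reduces to that region's lRMSD, which is a quick sanity check on the weighting.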
Group: SBL::Applications / Package: Structural motifs

Pre-requisites. Following the contributions from ADDREF, we provide a novel package in the SBL. Given two polypeptide chains, the goal of this package is to identify structural motifs using any of the four methods from ADDREF.

Bibliography
DSMK means density-based split-and-merge k-means clustering algorithm, Journal of Artificial Intelligence and Soft Computing Research, vol.3, issue.1, pp.51-71, 2013. ,
Triangulating the surface of a molecule, Discrete Appl. Math, vol.71, pp.5-22, 1996. ,
Least-square fitting of two 3D point sets, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.9, issue.5, pp.698-700, 1987. ,
Maximum Contact Map Overlap Revisited, J. of Computational Biology, vol.18, issue.1, pp.1-15, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00536624
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, NAR, vol.25, issue.17, pp.3389-3402, 1997. ,
Power diagrams: properties, algorithms and applications, SIAM J. Comput, vol.16, pp.78-96, 1987. ,
Minimizing proteome redundancy in the uniprot knowledgebase, ACMSODA, page 1035, 2007. ,
Haemoglobin: the structural changes related to ligand binding and its allosteric mechanism, JMB, vol.129, issue.2, pp.175-220, 1979. ,
Scoring hidden markov models, Computer applications in the biosciences, vol.13, pp.191-199, 1997. ,
Announcing the worldwide protein data bank, vol.10, 2003. ,
The median procedure for partitions. Partitioning data sets, vol.19, pp.3-34, 1993. ,
Matching protein structures with fuzzy alignments, Journal of computational and graphical statistics, vol.100, issue.21, pp.332-353, 2003. ,
Universal similarity measure for comparing protein structures, Biopolymers, vol.59, issue.5, pp.305-309, 2001. ,
Structure of a flavivirus envelope glycoprotein in its low-ph-induced membrane fusion conformation, The EMBO journal, vol.23, issue.4, pp.728-738, 2004. ,
Algorithmic geometry, 1998. ,
Protein structure alignment considering phenotypic plasticity, Bioinformatics, vol.24, issue.16, pp.98-104, 2008. ,
The Structural Bioinformatics Library: modeling in biomolecular science and beyond, Bioinformatics, vol.7, issue.33, pp.1-8, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01379635
Persistence-based clustering in riemannian manifolds, J. ACM, vol.60, issue.6, pp.1-38, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-01094872
A comprehensive review and comparison of different computational methods for protein remote homology detection, Briefings in bioinformatics, vol.19, issue.2, pp.231-244, 2016. ,
Mean shift, mode seeking, and clustering, IEEE PAMI, vol.17, issue.8, pp.790-799, 1995. ,
How to find the best approximation results-a follow-up to garey and johnson, ACM SIGACT News, vol.29, issue.4, pp.90-97, 1998. ,
Computing the volume of union of balls: a certified algorithm, ACM Transactions on Mathematical Software, vol.38, issue.1, pp.1-20, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00849809
Introduction to algorithms, 2009. ,
Comparing two clusterings using matchings between clusters of clusters, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01514872
Approximation algorithms and hardness results for the clique packing problem, Disc. Appl. Math, vol.157, issue.7, pp.1396-1406, 2009. ,
Stability of persistence diagrams, Discrete & Computational Geometry, vol.37, issue.1, pp.103-120, 2007. ,
Elements of Information Theory, 2006. ,
Characterizing molecular flexibility by combining lRMSD measures, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01968175
Multiscale analysis of structurally conserved motifs, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01968176
A survey of hemoglobin quaternary structures, Proteins: Structure, Function, and Bioinformatics, vol.79, issue.10, pp.2861-2870, 2011. ,
New results on maximum induced matchings in bipartite graphs and beyond, Theoretical Computer Science, vol.478, pp.33-40, 2013. ,
Boltzmann samplers for the random generation of combinatorial structures, Combinatorics, Probability and Computing, vol.13, issue.45, pp.577-625, 2004. ,
URL : https://hal.archives-ouvertes.fr/hal-00307530
Pattern classification and scene analysis, 1973. ,
Performance criteria for graph clustering and markov cluster experiments, 2000. ,
Profile hidden markov models, Bioinformatics, vol.14, issue.9, pp.755-763, 1998. ,
A probabilistic model of local sequence alignment that simplifies statistical significance estimation, PLoS Comput Biol, vol.4, issue.5, p.1000069, 2008. ,
HMMER user's guide. biological sequence analysis using profile hidden markov models, 2015. ,
Weighted alpha shapes, Dept. Comput. Sci., Univ. Illinois, 1992. ,
The union of balls and its dual shape, Discrete Comput. Geom, vol.13, pp.415-440, 1995. ,
Muscle: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Research, vol.32, issue.5, pp.1792-1797, 2004. ,
DOI : 10.1093/nar/gkh340
URL : http://europepmc.org/articles/pmc390337?pdf=render
Computational topology: an introduction, 2010. ,
Is cooperative oxygen binding by hemoglobin really understood? Rendiconti Lincei, vol.17, pp.147-162, 2006. ,
DOI : 10.1007/bf02904506
HMMER web server: interactive sequence similarity searching, NAR, p.367, 2011. ,
DOI : 10.1093/nar/gkr367
URL : https://academic.oup.com/nar/article-pdf/39/suppl_2/W29/7628106/gkr367.pdf
The ncbi taxonomy database, Nucleic acids research, vol.40, issue.D1, pp.136-143, 2012. ,
Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding, 1999. ,
Evolutionary diversification of the HAP2 membrane insertion motifs to drive gamete fusion across eukaryotes, PLoS Biology, 2018. ,
Data clustering using evidence accumulation, Proceedings. 16th International Conference on, vol.4, pp.276-280, 2002. ,
DOI : 10.1109/icpr.2002.1047450
URL : http://www.cse.msu.edu/prip/Files/AFred_AJain_ICPR2002.pdf
The ancient gamete fusogen hap2 is a eukaryotic class ii fusion protein, Cell, vol.168, issue.5, pp.904-915, 2017. ,
Analytic combinatorics, 2009. ,
DOI : 10.1017/cbo9780511801655
URL : https://hal.archives-ouvertes.fr/inria-00072739
Fibonacci heaps and their uses in improved network optimization algorithms, J. ACM, vol.34, issue.3, pp.596-615, 1987. ,
A generative cell specific 1 ortholog in drosophila melanogaster, 2012. ,
The envelope proteins of the bunyavirales, Advances in Virus Research, vol.98, pp.83-118, 2017. ,
A polynomial algorithm for the k-cut problem for fixed k, Mathematics of operations research, vol.19, 1994. ,
Algorithmic aspects of protein structure similarity, Foundations of Computer Science, 1999. 40th Annual Symposium on, pp.512-521, 1999. ,
Concrete mathematics: a foundation for computer science, 1989. ,
Comprehensive assessment of automatic structural alignment against a manual standard, the scop classification of proteins, Protein Science, vol.7, issue.2, pp.445-456, 1998. ,
Handbook of Discrete and Computational Geometry, 2017. ,
Protein geometry: volumes, areas, and distances, The international tables for crystallography, pp.531-539, 2001. ,
Flexible algorithm for direct multiple alignment of protein structures and sequences, Bioinformatics, vol.10, issue.6, p.587, 1994. ,
Fast protein fragment similarity scoring using a Binet-Cauchy kernel, Bioinformatics, vol.30, issue.6, pp.784-791, 2014. ,
Viral membrane fusion. Virology, pp.498-507, 2015. ,
Advances and pitfalls of protein structural alignment, Current opinion in structural biology, vol.19, issue.3, pp.341-348, 2009. ,
Protein structure comparison by alignment of distance matrices, Journal of molecular biology, vol.233, issue.1, pp.123-138, 1993. ,
Dali: a network tool for protein structure comparison, Trends in biochemical sciences, vol.20, issue.11, pp.478-480, 1995. ,
Data clustering: 50 years beyond k-means, Pattern recognition letters, vol.31, issue.8, pp.651-666, 2010. ,
Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods, JMB, vol.284, issue.4, pp.1201-1210, 1998. ,
A solution for the best rotation to relate two sets of vectors, Acta Crystallographica Section A, vol.32, issue.5, pp.922-923, 1976. ,
Maximum bounded 3-dimensional matching is MAX SNP-complete, Inf. Process. Lett, vol.37, issue.1, pp.27-35, 1991. ,
Reducibility among combinatorial problems, Complexity of Computer Computations, pp.85-103, 1972. ,
Hidden Markov Models in computational biology: Applications to protein modeling, Bioinformatics, vol.14, issue.10, pp.1501-1531, 1994. ,
Unit-vector rms (urms) as a tool to analyze molecular dynamics trajectories, Proteins: Structure, Function, and Bioinformatics, vol.37, issue.4, pp.554-564, 1999. ,
Mechanisms of virus membrane fusion proteins, Ann. Rev. Virol, vol.1, pp.171-89, 2014. ,
A combined transmembrane topology and signal peptide prediction method, Journal of molecular biology, vol.338, issue.5, pp.1027-1036, 2004. ,
Calibrating e-values for hidden markov models using reverse-sequence null models, Bioinformatics, vol.21, issue.22, pp.4107-4115, 2005. ,
Approximate protein structural alignment in polynomial time, vol.101, pp.12201-12206, 2004. ,
Review on determining number of cluster in k-means clustering, International Journal, vol.1, issue.6, pp.90-95, 2013. ,
Virus membrane-fusion proteins: more than one way to make a hairpin, Nature Reviews Microbiology, vol.4, issue.1, pp.67-76, 2006. ,
Algorithm design. Pearson Education India, 2006. ,
On clusterings: Good, bad and spectral, Journal of the ACM (JACM), vol.51, issue.3, pp.497-515, 2004. ,
Fast and effective text mining using linear-time document clustering, ACM SIGKDD, pp.16-22, 1999. ,
TOPOFIT-DB, a database of protein structural alignments based on the topofit method, Nucleic acids research, vol.35, issue.1, pp.317-321, 2006. ,
Fast determination of the optimal rotational matrix for macromolecular superpositions, Journal of computational chemistry, vol.31, issue.7, pp.1561-1563, 2010. ,
Emdatabank unified data resource for 3dem, Growth of novel protein structural data, vol.104, pp.3183-3188, 2007. ,
Predicting protein function from sequence and structure, Nature Reviews Molecular Cell Biology, vol.8, issue.12, pp.995-1005, 2007. ,
Clustering Stability, 2010. ,
ProFunc: a server for predicting protein function from 3d structure, Nucleic acids research, vol.33, issue.2, pp.89-93, 2005. ,
Combining multiple clusterings by soft correspondence, IEEE Int'l Conf. on Data Mining, 2005. ,
Statistical significance in biological sequence analysis, Briefings in Bioinformatics, vol.7, issue.1, pp.2-24, 2006. ,
lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests, Bioinformatics, vol.29, issue.21, pp.2722-2728, 2013. ,
Size-independent comparison of protein three-dimensional structures, Proteins: Structure, Function, and Bioinformatics, vol.22, issue.3, pp.273-283, 1995. ,
Maximum clique in protein structure comparison, 9th International Symposium on Experimental Algorithms, pp.106-117, 2010. ,
Comparative analysis of protein structure alignments, BMC Structural Biology, vol.7, issue.1, p.50, 2007. ,
Comparing clusterings, 2002. ,
Sequence alignment as hypothesis testing, Journal of computational biology, vol.18, issue.5, pp.677-691, 2011. ,
Diffusion maps, clustering and fuzzy markov modeling in peptide folding transitions, The Journal of chemical physics, vol.141, issue.11, pp.9-611, 2014. ,
RapidRMSD: Rapid determination of RMSDs corresponding to motions of flexible molecules, Mathematical Programming, vol.88, issue.3, pp.507-520, 2000. ,
A general method applicable to the search for similarities in the amino acid sequence of two proteins, Journal of Molecular Biology, vol.48, issue.3, pp.443-453, 1970. ,
A normalized root-mean-square distance for comparing protein three-dimensional structures, Protein science, vol.10, issue.7, pp.1470-1473, 2001. ,
CAD-score: A new contact area difference-based function for evaluation of protein structural models, Proteins: Structure, Function, and Bioinformatics, vol.81, issue.1, pp.149-162, 2013. ,
Empirical statistical estimates for sequence similarity searches, Journal of molecular biology, vol.276, issue.1, pp.71-84, 1998. ,
Stereochemistry of cooperative effects in haemoglobin, From theoretical physics to biology, pp.247-285, 1973. ,
Bioinformatics and functional genomics, 2015. ,
An integrated view of protein evolution, Nature Reviews Genetics, vol.7, issue.5, p.337, 2006. ,
Protein structure and function, 2008. ,
Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn, Statistical Applications in Genetics and Molecular Biology, vol.9, issue.1, 2010. ,
Structural basis of eukaryotic cell-cell fusion, Cell, vol.157, issue.2, pp.407-419, 2014. ,
Optimization, approximation, and complexity classes, Journal of Computer and System Sciences, vol.43, issue.3, pp.425-440, 1991. ,
Fast protein structure alignment using Gaussian overlap scoring of backbone peptide fragment similarity, Bioinformatics, vol.28, issue.24, p.291, 1995. ,
URL : https://hal.archives-ouvertes.fr/hal-00756813
Areas, volumes, packing and protein structure, Ann. Rev. Biophys. Bioeng, vol.6, pp.151-176, 1977. ,
Calculating and scoring high quality multiple flexible protein structure alignments, Bioinformatics, p.300, 2016. ,
Clustering by fast search and find of density peaks, Science, vol.344, issue.6191, pp.1492-1496, 2014. ,
Viral molecular machines, vol.726, 2011. ,
The earth mover's distance as a metric for image retrieval, International Journal of Computer Vision, vol.40, issue.2, pp.99-121, 2000. ,
Cluster ensembles-a knowledge reuse framework for combining multiple partitions, Journal of machine learning research, vol.3, pp.583-617, 2002. ,
Efficient substructure RMSD query algorithms, Journal of Computational Biology, vol.14, issue.9, pp.1201-1207, 2007. ,
On a global upper bound for Jensen's inequality, Journal of Mathematical Analysis and Applications, vol.343, issue.1, pp.414-419, 2008. ,
The mutation β99 Asp-Tyr stabilizes Y-A new, composite quaternary state of human hemoglobin, Proteins: Structure, Function, and Bioinformatics, vol.10, issue.2, pp.81-91, 1991. ,
Protein homology detection by hmm-hmm comparison, Bioinformatics, vol.21, issue.7, pp.951-960, 2004. ,
Capturing the hemoglobin allosteric transition in a single crystal form, Journal of the American Chemical Society, vol.136, issue.13, pp.5097-5105, 2014. ,
A revised proof of the metric properties of optimally superimposed vector sets, Acta Crystallographica Section A: Foundations of Crystallography, vol.58, issue.5, pp.506-506, 2002. ,
Finding k-cuts within twice the optimal, SIAM J. Comp, vol.24, 1995. ,
Identification of common molecular subsequences, Journal of Molecular Biology, vol.147, issue.1, pp.195-197, 1981. ,
Fast, scalable generation of high-quality protein multiple sequence alignments using clustal omega, Molecular Systems Biology, vol.7, issue.1, 2011. ,
Uniprot: the universal protein knowledgebase, Nucleic Acids Research, vol.45, issue.D1, pp.158-169, 2017. ,
Clustering ensembles: Models of consensus and weak partitions, IEEE transactions on pattern analysis and machine intelligence, vol.27, pp.1866-1881, 2005. ,
On the role of structural information in remote homology detection and sequence alignment: new methods using hybrid sequence profiles, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.63, issue.2, pp.1043-1062, 2001. ,
Least-squares estimation of transformation parameters between two point patterns, IEEE Transactions, vol.13, issue.4, pp.376-380, 1991. ,
A tutorial on spectral clustering, Statistics and Computing, vol.17, issue.4, pp.395-416, 2007. ,
Structures and mechanisms of viral membrane fusion proteins: multiple variations on a common theme, Critical reviews in biochemistry and molecular biology, vol.43, issue.3, pp.189-219, 2008. ,
Virus membrane fusion, FEBS letters, vol.581, issue.11, pp.2150-2155, 2007. ,
On the complexity of multiple sequence alignment, Journal of Computational Biology, vol.1, issue.4, p.8790475, 1994. ,
Predicting protein function from sequence and structural data, Current opinion in structural biology, vol.15, issue.3, pp.275-284, 2005. ,
DOI : 10.1016/j.sbi.2005.04.003
CSA: comprehensive comparison of pairwise protein structure alignments, Nucleic acids research, vol.40, issue.W1, pp.303-309, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00667920
Advances in homology protein structure modeling, Current Protein and Peptide Science, vol.7, issue.3, pp.217-227, 2006. ,
Survey of clustering algorithms, IEEE Transactions on neural networks, vol.16, issue.3, pp.645-678, 2005. ,
Flexible structure alignment by chaining aligned fragment pairs allowing twists, Bioinformatics, vol.19, issue.2, pp.246-255, 2003. ,
DOI : 10.1093/bioinformatics/btg1086
URL : https://academic.oup.com/bioinformatics/article-pdf/19/suppl_2/ii246/435771/btg1086.pdf
Statistical significance of probabilistic sequence alignment and related local hidden markov models, Journal of Computational Biology, vol.8, issue.3, pp.249-282, 2001. ,
LGA: a method for finding 3D similarities in protein structures, Nucleic acids research, vol.31, issue.13, pp.3370-3374, 2003. ,
A new mallows distance based metric for comparing clusterings, ICML, pp.1028-1035, 2005. ,
Approximating the Minimum k-way Cut in a Graph via Minimum 3-way Cuts, 1999. ,
TM-align: a protein structure alignment algorithm based on the tm-score, Nucleic acids research, vol.33, issue.7, pp.2302-2309, 2005. ,
2 Statistical significance of our motifs, when compared against random motifs with two non-parametric two-sample tests. Second column: p-value for the Wilcoxon Mann-Whitney U test
When qualified by the suffix iter

Method               SBL executable                             Option
Align-Apurva-SFD     sbl-structural-motifs-chains-apurva.exe
Align-Apurva-CD      sbl-structural-motifs-chains-apurva.exe    -use-cd-filtration
Align-Kpax-SFD       sbl-structural-motifs-chains-kpax.exe
Align-Kpax-CD        sbl-structural-motifs-chains-kpax.exe      -use-cd-filtration
Align-Identity-SFD   sbl-structural-motifs-conformations
This section is devoted to the proof of Theorem 6.1. For the sake of readability, we split this proof into three parts: Theorems D.2, D.5 and D.6. Notice that the last two proofs are quite similar.
We say that $\Pi$ L-reduces to $\Pi'$ if there are two polynomial-time algorithms $f$, $g$ and constants $\alpha, \beta > 0$ such that for each instance $I$ of $\Pi$:
1. Algorithm $f$ produces an instance $I' = f(I)$ of $\Pi'$ such that the optima of $I$ and $I'$
2. Given any solution of $I'$ with cost $c'$, algorithm $g$ produces a solution of $I$ with cost $c$ such that $OPT_{\Pi}(I) - c \le \beta\,(OPT_{\Pi'}(I') - c')$.

It is known that if $\Pi$ is APX-hard and L-reduces to $\Pi'$, then $\Pi'$ is APX-hard as well. In that case, $\Pi'$ does not admit a PTAS (Polynomial Time Approximation Scheme) unless P = NP.
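The following short derivation sketches why an L-reduction transfers approximation schemes. It assumes maximization problems and the standard form of the first condition, namely $OPT_{\Pi'}(I') \le \alpha\, OPT_{\Pi}(I)$; this form is an assumption here, since the condition is truncated in the text above. Suppose a solution of $I'$ has cost $c' \ge (1-\varepsilon)\, OPT_{\Pi'}(I')$.

```latex
\begin{align*}
OPT_{\Pi}(I) - c
  &\le \beta \,\bigl(OPT_{\Pi'}(I') - c'\bigr)
     && \text{(second condition)} \\
  &\le \beta \,\varepsilon \, OPT_{\Pi'}(I')
     && \text{(since } c' \ge (1-\varepsilon)\, OPT_{\Pi'}(I') \text{)} \\
  &\le \alpha \beta \,\varepsilon \, OPT_{\Pi}(I)
     && \text{(assumed first condition)}
\end{align*}
```

Hence $c \ge (1 - \alpha\beta\varepsilon)\, OPT_{\Pi}(I)$. Under these assumptions a PTAS for $\Pi'$ would yield a PTAS for $\Pi$, which is why APX-hardness propagates along the reduction.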
For any $D \ge 2$, the D-family-matching problem is APX-hard even if the maximum degree $\Delta$ is at most 4 and the weights are 2 and 5. In our reduction, we use a special case of the set packing problem.

an integer $k \ge 1$, the set packing problem consists in determining whether there exists a packing $C$ of size $|C| = k$. The set packing problem is NP-complete even if $|Y_i| = 3$ for every $i \in \{1,$
By Theorem 6.2, given $D \ge 1$, there is an $O(D^2 n)$-time complexity algorithm for the D-family-matching problem because $\Delta = 2$. We prove in Lemma D.3 a better time complexity for the D-family-matching problem.
there exists an $O(Dn)$-time complexity algorithm for the D-family-matching problem for $G$.

Proof of Lemma D.3. Let $V = \{v_1,$

Let $E = \{\{v_j, v_{j+1}\} \mid 1 \le j \le n-1\}$. We define the function $\Phi_D$ as follows. For every $t \in \{1, \ldots, n\}$ and every $i \in \{\max(1, t-D), \ldots, t+1\}$, $\Phi_D(v_t, i)$ is the score of an optimal solution $S$ of the D-family-matching problem, for the sub-path induced by the set of nodes $\{v_1,$

For every $i \in \{\max$

$v_t\}$ is a set of this solution. We then modify this solution by adding node $v_{t+1}$ in the last set, and we obtain the optimal solution for the D-family-matching problem, for the sub-path induced by the set of nodes $\{v_i,$

Any solution must contain the set $\{v_{t+1}\}$. Thus, we have to consider an optimal solution for the D-family-matching problem for the sub-path induced by the set of nodes $\{v_i, \ldots, v_t\}$. We now prove the result for $\Phi_D(v_{t+1}, t+2)$
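The $O(Dn)$ bound for paths can be illustrated by a minimal dynamic-programming sketch. It is not the recursion used in the proof, which tracks the first node of the last set, but a simpler prefix formulation. It relies on two assumptions: that the score of a solution is the total weight of the edges lying inside its sets, and that each set must induce a connected sub-path of diameter at most D, i.e. with at most D edges. The function name `best_path_score` is hypothetical.

```python
def best_path_score(weights, D):
    """Optimal score on a path, under the assumptions stated above.

    weights[j] is the weight of the edge joining node j and node j + 1
    (0-based), so the path has len(weights) + 1 nodes.
    """
    if D < 0:
        raise ValueError("D must be non-negative")
    n = len(weights) + 1
    # prefix[j] = total weight of the first j edges.
    prefix = [0] * n
    for j, w in enumerate(weights):
        prefix[j + 1] = prefix[j] + w
    # best[t] = optimal score for the sub-path made of the first t nodes.
    best = [0] * (n + 1)
    for t in range(1, n + 1):
        # The last set holds k consecutive nodes, hence k - 1 edges <= D.
        for k in range(1, min(t, D + 1) + 1):
            inside = prefix[t - 1] - prefix[t - k]
            best[t] = max(best[t], best[t - k] + inside)
    return best[n]


if __name__ == "__main__":
    for D in range(4):
        print(D, best_path_score([2, 5, 2], D))
```

Each of the n prefixes tries at most D + 1 lengths for the last set, and prefix sums make each trial constant time, which gives the $O(Dn)$ running time.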
Let $D \in \mathbb{N}^+$. Consider any intersection graph $G = (V, E, w)$ that is an even cycle. Then, there exists an $O(D^2 n)$-time complexity algorithm for the D-family-matching problem for $G$.
Consider any instance of the D-family-matching problem such that:
- for every $i \in \{1, \ldots, r\}$, there exist $j_1, j_2 \in \{1, \ldots, r'\}$ such that $F_i \cap F'_j = \emptyset$ for any $j \in \{1$
- there exist $i_1, i_2 \in \{1, \ldots, r\}$ such that $F'_j \cap F_i = \emptyset$ for any $i \in \{1,$

Then, there exists an $O((r + r') D^2)$-time complexity algorithm for the D-family-matching problem. Said otherwise, Corollary D.2 shows that there is a polynomial time algorithm for the D-family-matching problem if any set in $F \cup F'$ has a non-empty intersection with at most two other sets of $F \cup F'$.
D.6 Appendix - Generic approach based on spanning trees

Let us first introduce some notations. For every $v \in V$, let $\mathcal{H}(G, v)$ be the set of all different sub-graphs of $G$ that contain $v$ and of diameter at most $D$. Let $\mathcal{H}(G) = \cup_{v \in V} \mathcal{H}(G, v)$. Let $T_r$ be any spanning tree of $G$ rooted at node $r \in V$. For every $v \in V$, we define $\mathcal{H}(G, T_r, v)$ as the set of all $H \in \mathcal{H}(G, v)$ such that the graph induced by the set of nodes $V(H) \cap V(T_v)$ is a (connected) sub-tree rooted at $v$. Let $\mathcal{H}(G, T_r) = \cup_{v \in V} \mathcal{H}(G, T_r, v)$.
A leaf is a node of degree one that is different from the root $r$. Let $N(v) = \{v_1, \ldots, v_q\}$ be the set of $q \ge 1$ neighbors of $v$ in $T_v$. Suppose we have computed $\Phi_D(v_j, H)$ for every $j \in \{1,$
Algorithms based on spanning trees

Proof of Lemma 6.5. For some $k \ge 1$, consider an optimal solution $S = \{S_1, \ldots, S_k\}$ for the D-family-matching problem for $G$. For every $i \in \{1$

Let $T$ be any rooted spanning tree of $G$ such that $E(T_i) \subseteq E(T)$ for every $i \in$

By construction of $T$, $S$ is an admissible solution for the D-family-matching problem for $G$.
Given any positive integer $D \ge 1$ and any intersection graph $G$, Algorithm 1 returns $\Phi_D(G)$, that is an optimal solution for the D-family-matching problem for $G$.
, Furthermore, the time complexity of Algorithm 1 is O(|T (G)| max Tr?T (G) h(G, T r ) ? n)
Let $G$ be any intersection graph. Then, there exists a rooted spanning tree $T$ of $G$

For some $k \ge 1$, consider an optimal solution $S = \{S_1, \ldots, S_k\}$ for the 2-family-matching problem for $G$. For every $i \in \{1$

Let $T$ be any rooted spanning tree of $G$ such that $E(T_i)$

Indeed, since $D = 2$, $G[S_i]$ is necessarily a complete bipartite graph and its number of nodes is at most $2\Delta$. It is sufficient to select the maximum star as $T$
D) returns ? D (T ? ) (Theorem 6.2). prove the result by induction. Clearly, ? 1,?1(G),?1(G) (1) = 0. Assume that we have computed ? y,x ? ,x + (D) for every D ?, Given any intersection graph G, Algorithm 1 returns a 2?-approximation for the 2-familymatching problem for G if: ? ?(M) ,
, We necessarily have ? y,x ? ,x + (D +1) = ? y,x ? ,x + (D) because we cannot start a new plateau since x ? < ? D+1 (G) < x +
We cannot start a new plateau because x ? < x +. Thus we have to find the best y plateaus such that the lower bound is at least ? D+1 (G) = x ? and at most x +. We get that ? y,x ? ,x + (D + 1) = min x?P D+1 ,
In the second case, the score is the minimum score among all the optimal solutions composed of $y - 1$ plateaus
D + 1) is the minimum among these two scores ,
, D+1 (G) < x ? or ? D+1 (G) > x + , then there is no admissible solution and, by convention, ? If ?
There are $O(D_G^4)$ such computations. All the cases (but the fourth) can be calculated in $O(D_G)$ time. Thus, we get the $O(D_G^5)$-time complexity. Now consider the fourth case in which $x^- = x^+$. Thus, for every $D$