Mining duplicate questions in stack overflow, Proceedings of the 13th International Conference on Mining Software Repositories, p.27, 2016. ,
An approach to clone detection in sequence diagrams and its application to security analysis, Software & Systems Modeling, vol.17, issue.4, pp.1287-1309, 2018. ,
The maintenance problem of application software: An empirical analysis, Journal of Software Maintenance: Research and Practice, vol.4, issue.2, pp.83-104, 1992. ,
An approach to clone detection in behavioural models, 20th Working Conference on Reverse Engineering (WCRE), p.23, 2013. ,
A program for identifying duplicated code, Computing Science and Statistics, pp.49-49, 1993. ,
On finding duplication and near-duplication in large software systems, Proceedings of 2nd Working Conference on Reverse Engineering, p.87, 1995. ,
Measuring clone based reengineering opportunities, Proceedings Sixth International Software Metrics Symposium (Cat. No. PR00403), p.10, 1999. ,
Clone detection using abstract syntax trees, Proceedings., International Conference on, vol.10, p.88, 1998. ,
Comparison and evaluation of clone detection tools, IEEE Transactions on software engineering, vol.33, issue.9, pp.577-591, 2007. ,
Can you trust a single data source exploratory software engineering case study?, Empirical Software Engineering, vol.7, pp.9-26, 2002. ,
Evaluating clone detection tools for use during preventative maintenance, Proceedings. Second IEEE International Workshop on Source Code Analysis and Manipulation, p.10, 2002. ,
Contributions à l'usage des détecteurs de clones pour des tâches de maintenance logicielle, p.85, 2016. ,
An empirical assessment of bellon's clone benchmark, Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering, pp.1-10, 2015. ,
Automated extraction of mixins in cascading style sheets, 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), p.26, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-02182065
Comprehending reality-practical barriers to industrial adoption of software maintenance automation, 11th IEEE International Workshop on Program Comprehension, p.87, 2003. ,
The nicad clone detector, 2011 IEEE 19th International Conference on Program Comprehension, p.23, 2011. ,
Patterns for consistent software documentation, Proceedings of the 16th Conference on Pattern Languages of Programs, p.12, 2009. ,
Creating and evolving developer documentation: understanding the decisions of open source contributors, Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering, pp.127-136, 2010. ,
The development of a software clone detector, International Journal of Applied Software Technology, 1995. ,
A Study of the Documentation Essential to Software Maintenance, Proceedings of the 23rd Annual International Conference on Design of Communication: Documenting & Designing for Pervasive Information, SIGDOC '05, pp.68-75, 2005. ,
Clone detection in automotive model-based development, Proceedings of the 30th international conference on Software engineering, p.101, 2008. ,
Tool support for continuous quality control, IEEE software, vol.25, issue.5, pp.60-67, 2008. ,
Clone analysis in the web era: An approach to identify cloned web pages, Proceedings of the 7th IEEE Workshop on Empirical Studies of Software Maintenance (WESS'99), pp.19-26, 2001. ,
The curse of Copy&Paste cloning in requirements specifications, Proceedings of the 2009 3rd International Symposium on Empirical Software Engineering and Measurement, pp.443-446, 2009. ,
Tracking code clones in evolving software, 29th International Conference on Software Engineering (ICSE'07), p.29, 2007. ,
Clonetracker: tool support for code clone management, Proceedings of the 30th international conference on Software engineering, p.29, 2008. ,
A language independent approach for detecting duplicated code, Proceedings of the IEEE International Conference on Software Maintenance (ICSM'99), p.15, 1999. ,
Fine-grained and Accurate Source Code Differencing, Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, ASE '14, pp.313-324, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01054552
Do code and comments co-evolve? on the relation between source code and comment changes, 14th Working Conference on, p.34, 2007. ,
The Relevance of Software Documentation, Tools and Technologies: A Survey, Proceedings of the 2002 ACM Symposium on Document Engineering, DocEng '02, pp.26-33, 2002. ,
Scalable detection of semantic clones, Proceedings of the 30th international conference on Software engineering, pp.321-330, 2008. ,
Generic modelling of code clones, Dagstuhl Seminar Proceedings. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2007. ,
Sim: a utility for detecting similarity in computer programs, ACM SIGCSE Bulletin, vol.31, pp.266-270, 1999. ,
Data mining: concepts and techniques, 2011. ,
Data clone detection and visualization in spreadsheets, 35th International Conference on Software Engineering (ICSE), p.27, 2013. ,
Index-based code clone detection: incremental, distributed, scalable, 2010 IEEE International Conference on Software Maintenance (ICSM), vol.21, p.63, 2010. ,
Cren: a tool for tracking copy-and-paste code clones and renaming identifiers consistently in the ide, Proceedings of the 2007 OOPSLA workshop on eclipse technology eXchange, pp.16-20, 2007. ,
Deckard: Scalable and accurate tree-based detection of code clones, Proceedings of the 29th international conference on Software Engineering, pp.96-105, 2007. ,
Identifying redundancy in source code using fingerprints, Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: software engineering, vol.1, pp.171-183, 1993. ,
Can clone detection support quality assessments of requirements specifications?, Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, vol.2, pp.79-88, 2010. ,
Clonedetective-a workbench for clone detection research, Proceedings of the 31st International Conference on Software Engineering, pp.603-606, 2009. ,
Do Code Clones Matter?, Proceedings of the 31st International Conference on Software Engineering, ICSE '09, pp.485-495, 2009. ,
CCFinder: a multilinguistic token-based code clone detection system for large scale source code, IEEE Transactions on Software Engineering, vol.28, issue.7, pp.654-670, 2002. ,
Toward an understanding of software code cloning as a development practice, p.87, 2009. ,
"Cloning considered harmful" considered harmful, 13th Working Conference on Reverse Engineering (WCRE'06), pp.19-28, 2006. ,
Efficient randomized pattern-matching algorithms, IBM journal of research and development, vol.31, issue.2, pp.249-260, 1987. ,
Shinobi: A tool for automatic code clone detection in the ide, 16th Working Conference on Reverse Engineering, p.29, 2009. ,
Using slicing to identify duplication in source code, International Static Analysis Symposium, pp.40-56, 2001. ,
Pattern matching for clone and concept detection, Automated Software Engineering, vol.3, issue.1-2, pp.77-108, 1996. ,
Clone detection using abstract syntax suffix trees, 13th Working Conference on Reverse Engineering, p.20, 2006. ,
API documentation from source code comments: a case study of Javadoc, Proceedings of the 17th annual international conference on Computer documentation, pp.147-153, 1999. ,
Identifying similar code with program dependence graphs, Proceedings Eighth Working Conference on Reverse Engineering, p.20, 2001. ,
Assessing the benefits of incorporating function clone detection in a development process, Proceedings International Conference on Software Maintenance, p.28, 1997. ,
Understanding Someone else's Code: Analysis of Experiences, J. Syst. Softw, vol.23, issue.3, pp.269-275, 1993. ,
Finding function clones in web applications, Seventh European Conference on Software Maintenance and Reengineering, 2003. Proceedings, p.19, 2003. ,
How Software Engineers Use Documentation: The State of the Practice, IEEE Softw, vol.20, issue.6, pp.35-39, 2003. ,
UpSet: visualization of intersecting sets, IEEE transactions on visualization and computer graphics, vol.20, issue.12, pp.1983-1992, 2014. ,
Cclearner: A deep learning-based clone detection approach, 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), p.21, 2017. ,
Cp-miner: Finding copy-paste and related bugs in large-scale software code, IEEE Transactions on software Engineering, vol.32, issue.3, pp.176-192, 2006. ,
Detecting duplicate pull-requests in github, Proceedings of the 9th Asia-Pacific Symposium on Internetware, p.26, 2017. ,
Gplag: detection of software plagiarism by program dependence graph analysis, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.872-881, 2006. ,
Detecting duplications in sequence diagrams based on suffix trees, Software Engineering Conference, p.13, 2006. ,
Analyzing web service similarity using contextual clones, Proceedings of the 5th International Workshop on Software Clones, pp.41-46, 2011. ,
Evaluating the benefits of clone detection in the software maintenance activities in large scale systems. WESS'96, 1996. ,
Experiment on the automatic detection of function clones in a software system using metrics, icsm, vol.96, p.19, 1996. ,
Collecting and leveraging a benchmark of build system clones to aid in quality assessments, Companion proceedings of the 36th international conference on software engineering, pp.145-154, 2014. ,
Triangulation as a basis for knowledge discovery in software engineering, Empirical Software Engineering, vol.13, issue.2, pp.223-228, 2008. ,
What Should Developers Be Aware Of? An Empirical Study on the Directives of API Documentation, Empirical Software Engineering, vol.17, issue.6, pp.703-737, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00702183
Matching and merging of statecharts specifications, Proceedings of the 29th international conference on Software Engineering, pp.54-64, 2007. ,
Clone management for evolving software, IEEE transactions on software engineering, vol.38, issue.5, pp.1008-1026, 2011. ,
Documentation reuse: Hot or not? An empirical study, International Conference on Software Reuse, pp.12-27, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-02182142
Handling duplicates in dockerfiles families: Learning from experts, 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), p.83, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02485839
A Technique for Software Module Specification with Examples. Commun, vol.15, pp.330-336, 1972. ,
Enforcing Strict Model-view Separation in Template Engines, Proceedings of the 13th International Conference on World Wide Web, WWW '04, pp.224-233, 2004. ,
Using bm25f for semantic search, Proceedings of the 3rd international semantic search workshop, p.26, 2010. ,
Code generation using javadoc, p.35, 2000. ,
Using server pages to unify clones in web applications: A trade-off analysis, 29th International Conference on Software Engineering (ICSE'07), p.25, 2007. ,
Bauhaus-a tool suite for program analysis and reverse engineering, International Conference on Reliable Software Technologies, pp.71-82, 2006. ,
Effective clone detection without language barriers, 2005. ,
Insights into system-wide code duplication, 11th Working Conference on Reverse Engineering, p.87, 2004. ,
A survey on software clone detection research. Queen's School of Computing TR, vol.541, pp.64-68, 2007. ,
Comparison and evaluation of code clone detection techniques and tools: A qualitative approach, Science of computer programming, vol.74, issue.7, pp.470-495, 2009. ,
The vision of software clone management: Past, present, and future (keynote paper), 2014 Software Evolution Week-IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering, p.28, 2014. ,
Detecting code clones in binary executables, Proceedings of the eighteenth international symposium on Software testing and analysis, pp.117-128, 2009. ,
Sourcerercc: Scaling code clone detection to big-code, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), p.17, 2016. ,
Qualitative methods in empirical studies of software engineering, IEEE Transactions on software engineering, vol.25, issue.4, pp.557-572, 1999. ,
Does your configuration code smell?, Mining Software Repositories (MSR), 2016 IEEE/ACM 13th Working Conference on, p.25, 2016. ,
Towards clone detection in UML domain models, Software & Systems Modeling, vol.12, issue.2, pp.307-329, 2013. ,
Towards more accurate retrieval of duplicate bug reports, Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering, pp.253-262, 2011. ,
A discriminative model approach for accurate duplicate bug report retrieval, Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, vol.1, pp.45-54, 2010. ,
Detecting duplicate bug report using character n-gram-based features, 2010 Asia Pacific Software Engineering Conference, p.26, 2010. ,
The dimensions of maintenance, Proceedings of the 2nd international conference on Software engineering, pp.492-497, 1976. ,
Resolution of static clones in dynamic web pages, Fifth IEEE International Workshop on Web Site Evolution, p.29, 2003. ,
Phoenix-based clone detection using suffix trees, Proceedings of the 44th annual Southeast regional conference, pp.679-684, 2006. ,
HTML Templates That Fly: A Template Engine Approach to Automated Offloading from Server to Client, Proceedings of the 18th International Conference on World Wide Web, WWW '09, pp.951-960, 2009. ,
The documentary structure of source code. Information and Software Technology, vol.44, pp.767-782, 2002. ,
Clone detection in source code by frequent itemset techniques, Source Code Analysis and Manipulation, Fourth IEEE International Workshop on, p.18, 2004. ,
Deep learning code fragments for code clone detection, Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, pp.87-98, 2016. ,
Experimentation in software engineering, pp.52-77, 2012. ,
Multi-method research: An empirical investigation of object-oriented technology, Journal of Systems and Software, vol.48, issue.1, pp.13-26, 1999. ,
CloSpan: Mining closed sequential patterns in large datasets, Proceedings of the 2003 SIAM international conference on data mining, p.17, 2003. ,
Identifying syntactic differences between two programs. Software: Practice and Experience, vol.21, pp.739-755, 1991. ,
Towards contextual and on-demand code clone management by continuous monitoring, 28th IEEE/ACM International Conference on Automated Software Engineering (ASE), p.28, 2013. ,
Learning to rank duplicate bug reports, Proceedings of the 21st ACM international conference on Information and knowledge management, p.26, 2012. ,
Analyzing and forecasting near-miss clones in evolving software: An empirical study, 16th IEEE International Conference on Engineering of Complex Computer Systems, p.87, 2011. ,
Extract of a documentation duplication due to method delegation (in the Apache Commons Collections project)
Example of code transformation
Example of a PI-controller model gathered from, 2008
The documentation generated by Yard for the from_secret_key method from the RbNaCL project
Violin plot for the number of classes of each project in our corpus (for both Java and Ruby)
Left: Violin plot for the number of methods in every project in our corpus. Right: Violin plot for the percentage of documented methods in every project in our corpus
Extract of a documentation duplication from the Apache Commons IO project. The duplicated tag is highlighted in red
Violin plot for the percentage of duplicated tags per project (for both Java and Ruby)
Left: Violin plot for the number of methods sharing a common tag in Java. Right: Violin plot for the number of methods sharing a common tag in Ruby
Upper-left: Violin plot for the number of duplicate @description tags per project. Upper-right: Violin plot for the number of duplicate @params tags per project. Lower-left: Violin plot for the number of duplicate @return tags per project. Lower-right: Violin plot for the number of duplicate @throws (@raise for Ruby) tags per project
Extract of a duplicate due to delegation between two methods in the Ruby/Git library project. Duplicated tags are displayed in red
Example of a duplicate due to sub-typing in the Apache Commons Collections project. Duplicated tags are displayed in red
Example of a duplicate due to a code clone in the Apache Commons IO project. Duplicated tags are displayed in red
Extract of a duplicate due to a similar use between two methods in the Ruby/Git library project. Duplicated tags are displayed in red
The stack of layers built from the Dockerfile with the corresponding final image size
RUN instruction with multiple shell commands split into two RUN instructions, one for each shell command
Dockerfile presenting an example of a duplicate index with chunk size set to 6
Extract of a real Dockerfile duplicate from Bash shell v3
Extract of a real Dockerfile duplicate from Bash shell v4
UpSet plot showing the relationships between versions, flavours, base images and platforms across our repositories
Upper-right plot: Violin plot for the number of instructions per project. Bottom plot: Violin plot for the number of instructions by duplicate
Strip plot of the number of owners of every duplicate in our corpus
Violin plot for the percentage of co-evolving commits per project
Left plot: Violin plot for the percentage of duplicate instructions in Dockerfiles of a project using templates. Right plot: Violin plot for the percentage reduction of duplicates in projects using templates