Figure 12.2: Inferred state machine of the Samba protocol (self-state transitions triggering an empty symbol and reaction time labels are removed for the sake of clarity)

I. Group, Abstract Syntax Notation One (ASN.1): Specification of basic notation, 2002.

F. Aarts, B. Jonsson, and J. Uijen, Generating models of infinite-state communication protocols using regular inference with abstraction, Proceedings of the 22nd IFIP WG 6.1 international conference on Testing software and systems, ICTSS'10, pp.188-204, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00767416

I. S. Abdullah and D. A. Menasce, Protocol specification and automatic implementation using XML and CBSE, Proc. of the International Conference on Communications, Internet and Information Technology, 2003.

K. Aiko, New reverse engineering technique using api hooking and sysenter hooking, and capturing of cash card access, Black Hat Asia, 2008.

M. Amos, An intuitive explanation of cw bandwidth

D. Andriesse and H. Bos, An analysis of the zeus peer-to-peer protocol, 2013.

D. Angluin, Learning regular sets from queries and counterexamples, Information and Computation, vol.75, issue.2, pp.87-106, 1987.
DOI : 10.1016/0890-5401(87)90052-6

I. Anishchenko, Pb vs thrift vs avro, 2012.

J. Antunes, N. Neves, and P. Verissimo, Reverse Engineering of Protocols from Network Traces, 2011 18th Working Conference on Reverse Engineering, pp.169-178, 2011.
DOI : 10.1109/WCRE.2011.28

P. Ashar, S. Devadas, and A. R. Newton, Optimum and heuristic algorithms for an approach to finite state machine decomposition, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.10, issue.3, pp.296-310, 1991.
DOI : 10.1109/43.67784

P. Ashar, S. Devadas, and A. Newton, A unified approach to the decomposition and re-decomposition of sequential machines, Conference proceedings on 27th ACM/IEEE design automation conference , DAC '90, pp.601-606, 1990.
DOI : 10.1145/123186.123414

P. Barford and V. Yegneswaran, An Inside Look at Botnets, Malware Detection, Advances in Information Security, 2007.
DOI : 10.1007/978-0-387-44599-1_8

M. A. Beddoe, Network protocol analysis using bioinformatics algorithms, Toorcon, 2004.

P. L. Berg and T. Berg, Structure in Language: A Dynamic Perspective. Routledge Studies in Linguistics, 2008.

T. Berg, B. Jonsson, M. Leucker, and M. Saksena, Insights to Angluin's Learning, Electronic Notes in Theoretical Computer Science, vol.118, pp.3-18, 2005.
DOI : 10.1016/j.entcs.2004.12.015

T. Berg, B. Jonsson, and S. Soleimanifard, Inferring compact models of communication protocol entities, Leveraging Applications of Formal Methods, Verification, and Validation: Part I, number 6415 in Lecture Notes in Computer Science, pp.658-672, 2010.

M. Bishop, R. Crawford, B. Bhumiratana, L. Clark, and K. Levitt, Some Problems in Sanitizing Network Data, 15th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE'06), pp.307-312, 2006.
DOI : 10.1109/WETICE.2006.62

G. Bochmann, Protocol specification for OSI, Computer Networks and ISDN Systems, vol.18, issue.3, pp.167-184, 1990.
DOI : 10.1016/0169-7552(90)90132-C

T. Bohlin and B. Jonsson, Regular inference for communication protocol entities, 2008.

P. Borgnat, G. Dewaele, K. Fukuda, P. Abry, and K. Cho, Seven Years and One Day: Sketching the Evolution of Internet Traffic, IEEE INFOCOM 2009, The 28th Conference on Computer Communications, pp.711-719, 2009.
DOI : 10.1109/INFCOM.2009.5061979
URL : https://hal.archives-ouvertes.fr/ensl-00290756

G. Bossert and F. Guihéry, The future of protocol reversing and simulation applied on ZeroAccess botnet, 29C3: 29th Chaos Communication Congress, 2012.

G. Bossert and F. Guihéry, Security evaluation of communication protocols in cc, 2012.

G. Bossert, F. Guihéry, and G. Hiet, Towards automated protocol reverse engineering using semantic information, Proceedings of the 9th ACM symposium on Information, computer and communications security, ASIA CCS '14, 2014.
DOI : 10.1145/2590296.2590346
URL : https://hal.archives-ouvertes.fr/hal-01009283

G. Bossert, G. Hiet, and T. Henin, Modelling to Simulate Botnet Command and Control Protocols for the Evaluation of Network Intrusion Detection Systems, 2011 Conference on Network and Information Systems Security, pp.1-8, 2011.
DOI : 10.1109/SAR-SSI.2011.5931397
URL : https://hal.archives-ouvertes.fr/hal-00658396

G. Bossert and D. Kirchner, How to play hooker, SSTIC, 2014.

J. Caballero, P. Poosankam, C. Kreibich, and D. Song, Dispatcher, Proceedings of the 16th ACM conference on Computer and communications security, CCS '09, 2009.
DOI : 10.1145/1653662.1653737

J. Caballero and D. Song, Automatic protocol reverse-engineering: Message format extraction and field semantics inference, Computer Networks, vol.57, issue.2, pp.451-474, 2013.
DOI : 10.1016/j.comnet.2012.08.003

J. Caballero, H. Yin, Z. Liang, and D. Song, Polyglot, Proceedings of the 14th ACM conference on Computer and communications security , CCS '07, 2007.
DOI : 10.1145/1315245.1315286

J. D. Case, M. Fedor, M. L. Schoffstall, and J. Davin, Simple Network Management Protocol (SNMP), RFC, vol.1157, 1990.

A. Cavalli, C. Grepet, S. Maag, and V. Tortajada, A validation model for the DSR protocol, 24th International Conference on Distributed Computing Systems Workshops, 2004. Proceedings., pp.768-773, 2004.
DOI : 10.1109/ICDCSW.2004.1284120

I. M. Chakravarti, R. G. Laha, and J. Roy, Handbook of methods of applied statistics, Wiley series in probability and mathematical statistics, 1967.

M. Chiesa, L. Cittadini, G. D. Battista, L. Vanbever, and S. Vissicchio, Using routers to build logic circuits: How powerful is BGP?, 2013 21st IEEE International Conference on Network Protocols (ICNP), 2013.
DOI : 10.1109/ICNP.2013.6733584

C. Y. Cho, D. Babić, E. C. R. Shin, and D. Song, Inference and analysis of formal models of botnet command and control protocols, Proceedings of the 17th ACM conference on Computer and communications security, CCS '10, pp.426-439, 2010.

H. Choi, H. Lee, and H. Kim, BotGAD, Proceedings of the Fourth International ICST Conference on COMmunication System softWAre and middlewaRE, COMSWARE '09, pp.1-2, 2009.
DOI : 10.1145/1621890.1621893

N. Chomsky, Syntactic Structures. Mouton classic. Bod Third Party Titles, 2002.

N. Chomsky, Three models for the description of language, IRE Transactions on Information Theory, vol.2, issue.3, pp.113-124, 1956.

T. S. Chow, Testing Software Design Modeled by Finite-State Machines, IEEE Transactions on Software Engineering, vol.4, issue.3, 1978.
DOI : 10.1109/TSE.1978.231496

J. Bremer, A. Tanasi, C. Guarnieri, and M. Schloesser, Cuckoo Sandbox: open source automated malware analysis, Black Hat USA, 2013.

P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, Prospex: Protocol specification extraction, Proceedings of SSP, 2009.

D. Crocker and P. Overell, Augmented BNF for Syntax Specifications: ABNF, RFC, vol.5234, 2008.

W. Cui, J. Kannan, and H. J. Wang, Discoverer: Automatic protocol reverse engineering from network traces, Proceedings of USENIX Security Symposium, 2007.

W. Cui, V. Paxson, N. C. Weaver, and R. H. Katz, Protocol-independent adaptive replay of application dialog, The 13th Annual Network and Distributed System Security Symposium (NDSS), 2006.

S. Devadas and A. R. Newton, Decomposition and factorization of sequential finite state machines, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.8, issue.11, pp.1206-1217, 1989.
DOI : 10.1109/43.41505

S. Djiev, Industrial networks for communication and control, 2003.

L. Dolhi, Validation of Communications Systems with SDL. TransMeth Sud-Ouest, 2003.

M. Franz, E. J. Byres, and D. Miller, The use of attack trees in assessing vulnerabilities in SCADA systems, International Infrastructure Survivability Workshop, 2004.

. Fortinet, Anatomy of a botnet, 2013.

P. Francis, Pip Near-term Architecture, RFC, vol.1621, 1994.
DOI : 10.17487/rfc1621

J. Freeman, Hacking a closed ecosystem, O'Reilly Android Open Conference, 2011.

S. Fujiwara, G. Von-bochmann, F. Khendek, M. Amalou, and A. Ghedamsi, Test selection based on finite state models, IEEE Transactions on Software Engineering, vol.17, issue.6, pp.591-603, 1991.
DOI : 10.1109/32.87284

U. Gargi, Consumer media capture: Time-based analysis and event clustering, 2003.

E. M. Gold, Complexity of automaton identification from given data, Information and Control, vol.37, issue.3, pp.302-320, 1978.

G. Gu, P. Porras, V. Yegneswaran, M. Fong, and W. Lee, BotHunter: detecting malware infection through IDS-driven dialog correlation, Proceedings of the 16th USENIX Security Symposium, pp.1-12, 2007.

D. Harel and P. S. Thiagarajan, UML for Real: Design of Embedded Real-time Systems, chapter Message Sequence Charts, p.1, 2003.

J. Hartmanis, Algebraic Structure Theory of Sequential Machines, 1966.

Z. Hasan and M. J. Ciesielski, Decomposition and functional verification of fsms, 1998.

P. Hethmon, Extensions to FTP, RFC, vol.3659, 2007.
DOI : 10.17487/rfc3659

C. de la Higuera, Grammatical inference: learning automata and grammars, 2010.

C. A. R. Hoare, Communicating sequential processes, 1985.

G. J. Holzmann, Design and validation of computer protocols, 1991.

F. Howard, Exploring the blackhole exploit kit, 2012.

Y. Hsu, G. Shu, and D. Lee, A model-based approach to security flaw detection of network protocol implementations, 2008 IEEE International Conference on Network Protocols, pp.114-123, 2008.
DOI : 10.1109/ICNP.2008.4697030

M. N. Irfan, Analysis and optimization of software model inference algorithms, 2012.

ISO, Information processing systems - OSI reference model, International Standards Organization, 1984.

C. Kalt, Internet Relay Chat: Architecture, RFC, vol.2810, 2000.
DOI : 10.17487/rfc2810

M. Kende, Global internet report 2014, 2014.

S. Kent, Privacy Enhancement for Internet Electronic Mail: Part II: Certificate-Based Key Management, RFC, vol.1422, 1993.

S. Kent and K. Seo, Security Architecture for the Internet Protocol, RFC, vol.4301, 2005.

S. C. Kleene, Representation of Events in Nerve Nets and Finite Automata, Memorandum, Rand Corporation, 1951.

P. Koopman and T. Chakravarty, Analysis of the train communication network protocol error detection capabilities. Working paper, 2001.

T. Krueger, H. Gascon, N. Krämer, and K. Rieck, Learning stateful models for network honeypots, Proceedings of the 5th ACM workshop on Security and artificial intelligence, AISec '12, 2012.
DOI : 10.1145/2381896.2381904

T. Krueger, N. Krämer, and K. Rieck, ASAP: Automatic Semantics-Aware Analysis of Network Payloads, Proceedings of ECML/PKDD, 2011.
DOI : 10.1016/j.jss.2009.06.040

C. Labovitz, S. Iekel-Johnson, D. McPherson, J. Oberheide, and F. Jahanian, Internet inter-domain traffic, SIGCOMM Comput. Commun. Rev., vol.41, issue.4, 2010.
DOI : 10.1145/1851182.1851194

K. J. Lang, Faster algorithms for finding minimal consistent DFAs, NEC Research Institute, 4 Independence Way, 1999.

D. D. Lee and H. S. Seung, Algorithms for non-negative matrix factorization, NIPS, 2000.

S. Legg, Generic String Encoding Rules (GSER) for ASN.1 Types, RFC, vol.3641, 2003.
DOI : 10.17487/rfc3641

C. Leita, K. Mermoud, and M. Dacier, ScriptGen: an automated script generation tool for honeyd, 21st Annual Computer Security Applications Conference (ACSAC'05), 2005.
DOI : 10.1109/CSAC.2005.49

Z. Lin, X. Jiang, D. Xu, and X. Zhang, Automatic protocol format reverse engineering through context-aware monitored execution, 15th Symposium on Network and Distributed System Security (NDSS), 2008.

L. Logrippo, M. Faci, and M. Haj-Hussein, An introduction to LOTOS: learning by examples, Computer Networks and ISDN Systems, vol.23, issue.5, 1992.
DOI : 10.1016/0169-7552(92)90011-E

G. M. Lundy and C. Basaran, Automated generation of protocol test sequences from formal specifications, Proceedings of ICNP, 1994 International Conference on Network Protocols, pp.72-79, 1994.
DOI : 10.1109/ICNP.1994.344374

S. Luo and G. A. Marin, Modeling networking protocols to test intrusion detection systems, Proceedings of the 29th Annual IEEE International Conference on Local Computer Networks, LCN '04, pp.774-775, 2004.

E. Madelaine and D. Vergamini, Specification and Verification of a Sliding Window Protocol in LOTOS, Formal Description Techniques, IV, volume C-2 of IFIP Transactions
DOI : 10.1016/B978-0-444-89402-1.50045-X

K. McNamee, Malware analysis report - new C&C protocol for ZeroAccess/Sirefef, 2012.

R. Milner, Communication and concurrency, 1989.

J. C. Mogul, R. Fielding, J. Gettys, and H. Frystyk, Use and Interpretation of HTTP Version Numbers, RFC, vol.2145, 1997.

S. B. Needleman and C. D. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins, Journal of Molecular Biology, vol.48, issue.3, pp.443-453, 1970.

O. Niese, An Integrated Approach to Testing Complex Systems, doctoral dissertation (Dr. rer. nat.), Department of Computer Science, Universität Dortmund, 2003.

V. Nivargi, M. Bhaowal, and T. Lee, Machine learning based botnet detection, 2006.

Open DeviceNet Vendor Association (ODVA), DeviceNet technical overview, 2004.

W. Ogden, A helpful result for proving inherent ambiguity, Mathematical systems theory, pp.191-194, 1968.
DOI : 10.1007/BF01694004

ISO, Estelle - a formal description technique based on an extended state transition model, International Organization for Standardization, 1989.

R. Pang and V. Paxson, A high-level programming environment for packet trace anonymization and transformation, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications , SIGCOMM '03, pp.339-351, 2003.
DOI : 10.1145/863955.863994

C. P. Pfleeger, State reduction in incompletely specified finite-state machines, IEEE Transactions on Computers, vol.22, issue.12, pp.1099-1102, 1973.

H. Raffelt, B. Steffen, and T. Berg, LearnLib, Proceedings of the 10th international workshop on Formal methods for industrial critical systems , FMICS '05, pp.62-71, 2005.
DOI : 10.1145/1081180.1081189
URL : https://hal.archives-ouvertes.fr/inria-00459959

D. N. Reshef, Y. A. Reshef, H. K. Finucane, S. R. Grossman, G. McVean et al., Detecting novel associations in large data sets, Science, vol.334, issue.6062, pp.1518-1524, 2011.

P. Resnick, Internet Message Format, RFC, vol.2822, 2001.

P. Resnick, Internet Message Format, RFC, vol.5322, 2008.

K. Rieck, C. Wressnegger, and A. Bikadorov, Sally: A tool for embedding strings in vector spaces, Journal of Machine Learning Research, 2012.

E. Rodionov and A. Matrosov, The evolution of TDL: conquering x64, 2011.

M. Roesch, Snort - lightweight intrusion detection for networks, Proceedings of the USENIX LISA'99 conference, pp.229-238, 1999.

E. C. Rosen, Vulnerabilities of network control protocols, ACM SIGCOMM Computer Communication Review, vol.11, issue.3, 1981.
DOI : 10.1145/1015591.1015592

P. Saint-andre, Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence, RFC, vol.6121, 2011.

B. Sanou, Ict facts and figures, 2013.

T. Sauter, The Three Generations of Field-Level Networks—Evolution and Compatibility Issues, IEEE Transactions on Industrial Electronics, vol.57, issue.11, 2010.
DOI : 10.1109/TIE.2010.2062473

J. Shearer, Trojan.ZeroAccess threat report, 2011.

S. Shoham, E. Yahav, S. Fink, and M. Pistoia, Static specification mining using automata-based abstractions, Proceedings of the 2007 international symposium on Software testing and analysis, ISSTA '07, pp.174-184, 2007.
DOI : 10.1145/1273463.1273487
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.9497

A. Showk, D. Szczesny, S. Traboulsi, I. Badr, E. Gonzalez et al., Modeling LTE Protocol for Mobile Terminals Using a Formal Description Technique
DOI : 10.1007/978-3-642-74238-5

A. Shulman, The untold tale of database communication protocol vulnerabilities, BlackHat, 2007.

R. R. Sokal and C. D. Michener, A statistical method for evaluating systematic relationships, University of Kansas Scientific Bulletin, vol.28, pp.1409-1438, 1958.

A. Sosnovich, O. Grumberg, and G. Nakibly, Finding Security Vulnerabilities in a Network Protocol Using Parameterized Systems, Computer Aided Verification, 2013.
DOI : 10.1007/978-3-642-39799-8_51

G. Starnberger, C. Kruegel, and E. Kirda, Overbot, Proceedings of the 4th international conference on Security and privacy in communication networks, SecureComm '08, pp.1-13, 2008.
DOI : 10.1145/1460877.1460894

B. Steffen, F. Howar, and M. Merten, Introduction to Active Automata Learning from a Practical Perspective, Formal Methods for Eternal Networked Software Systems, 2011.
DOI : 10.1007/978-3-642-21455-4_8
URL : https://hal.archives-ouvertes.fr/hal-00647729

J. Stewart, Inside the storm: Protocols and encryption of the storm botnet, Black Hat USA, 2008.

A. Syropoulos, Mathematics of Multisets, Proceedings of the Workshop on Multiset Processing: Multiset Processing, Mathematical, Computer Science, and Molecular Computing Points of View, pp.347-358, 2001.
DOI : 10.1007/3-540-45523-X_17

D. Tasak, Specification and validation of Q.2931 ATM signaling protocol using Estelle

I. Tellier, Learning recursive automata from positive examples. Revue d'Intelligence Artificielle, pp.775-804, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00470101

G. Tenebro, W32.Waledac threat analysis, 2009.

J. Turla, Analysis on pBot - a PHP IRC bot that has malicious functions, 2012.

Y. Wang, X. Yun, M. Z. Shafiq, L. Wang, A. X. Liu, D. Yao, Y. Zhang, L. Guo et al., A semantics aware approach to automated reverse engineering unknown protocols, Proceedings of ICNP, 2012.

Y. Wang, Z. Zhang, D. D. Yao, B. Qu, and L. Guo, Inferring Protocol State Machine from Network Traces: A Probabilistic Approach, Proceedings of ACNS, 2011.
DOI : 10.1002/9780470316801

Z. Wang, X. Jiang, W. Cui, X. Wang, and M. Grace, ReFormat: Automatic Reverse Engineering of Encrypted Messages, Proceedings of the 14th European Conference on Research in Computer Security, ESORICS'09, pp.200-215, 2009.
DOI : 10.1016/S1389-1286(99)00112-7

Y. Won, R. Fontugne, K. Cho, H. Esaki, and K. Fukuda, Nine years of observing traffic anomalies: Trending analysis in backbone networks, Integrated Network Management 2013 IFIP/IEEE International Symposium on, pp.636-642, 2013.

G. Wondracek, P. M. Comparetti, C. Kruegel, and E. Kirda, Automatic network protocol analysis, Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS'08), 2008.

P. Wurzinger, L. Bilge, T. Holz, J. Goebel, C. Kruegel, and E. Kirda, Automatically Generating Models for Botnet Detection, Proceedings of the 14th European conference on Research in computer security, ESORICS'09, pp.232-249, 2009.
DOI : 10.1007/978-3-540-70542-0_6

T. Xie, Software component protocol inference, 2003.

T. Yeh, T. Chang, and R. C. Miller, Sikuli, Proceedings of the 22nd annual ACM symposium on User interface software and technology, UIST '09, pp.183-192, 2009.
DOI : 10.1145/1622176.1622213

K. Zeilenga, Lightweight Directory Access Protocol (LDAP): Technical Specification Road Map, RFC, vol.4510, 2006.
DOI : 10.17487/rfc4510