Contents
10.1 Introduction
10.2 Definitions
Abstract Syntax of the Considered ShEx Fragment
10.2.2 Preliminary Definitions
Selectivity Estimation for SPARQL Triple Patterns
Well-formed data-schema pairs
10.3.1 Cardinality Constraints
Data Nodes Isolation
Shape-relation graph
SPARQL Query Triple Rankings
10.6.1 Experiment 1: With Web Index
With LDBC SNB Schema and gMark Queries

1. in order to obtain SPARQL query transformations of different types?
2. in order to characterise different fragments for which an optimized representation of the original query could be generated?

Bibliography

OWL 2 web ontology language document overview (second edition). W3C recommendation, W3C, 2012.

Abadi, Scalable semantic web data management using vertical partitioning, Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB '07, pp.411-422, 2007.

K. Aberer and G. Fischer, Semantic query optimization for methods in object-oriented database systems, Proceedings of the Eleventh International Conference on Data Engineering, pp.70-79, 1995.
DOI : 10.1109/ICDE.1995.380406

RDFUnit: An RDF Unit Testing Suite. https://github, RDFUnit, 2016.

M. Arenas and J. Pérez, Querying semantic web data with SPARQL, Proceedings of the 30th Symposium on Principles of Database Systems, PODS '11, pp.305-316, 2011.
DOI : 10.1145/1989284.1989312

Bagan, gMark: Schema-Driven Generation of Graphs and Queries, IEEE Transactions on Knowledge and Data Engineering, vol.29, issue.4, pp.856-869, 2017.
DOI : 10.1109/TKDE.2016.2633993

URL : https://hal.archives-ouvertes.fr/hal-01402575

Benzaken, Optimizing XML querying using type-based document projection, ACM Transactions on Database Systems, vol.38, issue.1, pp.4:1-4:45, 2013.
DOI : 10.1145/2445583.2445587

URL : https://hal.archives-ouvertes.fr/hal-00798049

T. Berners-Lee and J. Jaffe, About W3C. https://www.w3.org/Consortium, 1994.

C. Bizer and A. Schultz, The Berlin SPARQL benchmark, Int. J. Semantic Web Inf. Syst, vol.5, issue.2, pp.1-24, 2009.

Boneva, Validating RDF with shape expressions, CoRR, abs/1404.1270, 2014.

D. Brickley and R. Guha, RDF schema 1.1. W3C recommendation, W3C, 2014.

A. K. Chandra and P. M. Merlin, Optimal implementation of conjunctive queries in relational data bases, Proceedings of the Ninth Annual ACM Symposium on Theory of Computing, STOC '77, pp.77-90, 1977.

On the containment of SPARQL queries under entailment regimes, Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI'16, pp.936-942, 2016.

Chekol, PSPARQL query containment, The 13th International Symposium on Database Programming Languages, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00619342

Chekol, SPARQL Query Containment under RDFS Entailment Regime, Automated Reasoning: 6th International Joint Conference, IJCAR 2012. Proceedings, pp.134-148, 2012.
DOI : 10.1007/978-3-642-31365-3_13

URL : https://hal.archives-ouvertes.fr/hal-00749087

Chekol, SPARQL query containment under SHI axioms, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, AAAI'12, pp.10-16, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00749080

D. Colazzo and C. Sartiani, Typing regular path query languages for data graphs, Proceedings of the 15th Symposium on Database Programming Languages, DBPL 2015, pp.69-78, 2015.

URL : https://hal.archives-ouvertes.fr/hal-01593601

Cyganiak, RDF 1.1 concepts and abstract syntax. W3C recommendation, W3C, 2014.

M. Duerst and M. Suignard, RFC 3987: Internationalized Resource Identifiers (IRIs). RFC 3987 (Proposed Standard), see http, 2005.

Erling, The LDBC Social Network Benchmark, Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pp.619-630, 2015.

Etessami, First-Order Logic with Two Variables and Unary Temporal Logic, Information and Computation, vol.179, issue.2, pp.279-295, 2002.
DOI : 10.1006/inco.2001.2953

Fernández, Binary RDF representation for publication and exchange (HDT), Web Semantics: Science, Services and Agents on the World Wide Web, vol.19, pp.22-41, 2013.
DOI : 10.1016/j.websem.2013.01.002

Apache HBase, http://hbase.apache.org/. [Online; accessed 26, 2017.

Apache Hadoop, http://hadoop.apache.org/. [Online; accessed 26, 2017.

J. E. Gayo, ShExC vs SHACL. https://github.com/labra/ShExcala/wiki/ShExC-vs-SHACL. [Online; accessed 27, 2015.

J. E. Gayo, ShEx vs SHACL. https://www.slideshare.net/jelabra/shex-vs-shacl. [Online; accessed 27, 2017], 2016.

J. E. Gayo, SHACL/ShEx implementation. https://github.com/labra/shaclex. [Online; accessed 27, 2017.

Gayo, Validating and describing linked data portals using RDF shape expressions, 1st Workshop on Linked Data Quality (LDQ), number 1215 in CEUR Workshop Proceedings, 2014.

P. Genevès and N. Layaïda, A system for the static analysis of XPath, ACM Transactions on Information Systems, vol.24, issue.4, pp.475-502, 2006.
DOI : 10.1145/1185877.1185882

P. Genevès and N. Layaïda, Deciding XPath containment with MSO, Data & Knowledge Engineering, vol.63, pp.108-136, 2007.
DOI : 10.1016/j.datak.2006.11.003

P. Genevès and N. Layaïda, XML reasoning made practical, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pp.1169-1172, 2010.
DOI : 10.1109/ICDE.2010.5447786

Goasdoué, CliqueSquare: efficient Hadoop-based RDF query processing, 2013.

On the decision problem for two-variable first-order logic, Bulletin of Symbolic Logic, vol.3, issue.1, pp.53-69, 1997.

E. Grädel and M. Otto, On logics with two variables, Theoretical Computer Science, vol.224, issue.1-2, pp.73-113, 1999.
DOI : 10.1016/S0304-3975(98)00308-9

Graux, SPARQLGX: Efficient Distributed Evaluation of SPARQL with Apache Spark, The Semantic Web - ISWC 2016: 15th International Semantic Web Conference, Proceedings, Part II, pp.80-87, 2016.

URL : https://hal.archives-ouvertes.fr/hal-01344915

Graux, SPARQLGX in action: Efficient distributed evaluation of SPARQL with Apache Spark, Proceedings of the ISWC 2016 Posters & Demonstrations Track co-located with 15th International Semantic Web Conference, volume 1690 of CEUR Workshop Proceedings. CEUR-WS.org, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01358125

R. V. Guha, ISWC 2013 Keynote - Ramanathan V. Guha. http://iswc2013.semanticweb.org/content/keynote-ramanathan-v-guha.html. [Online; accessed 27, 2013.

S. Harris and A. Seaborne, SPARQL 1.1 query language. W3C recommendation, W3C, 2013.

FHIR. http://hl7.org/fhir/index.html. [Online; accessed 27, 2017.

Huang, Scalable SPARQL querying of large RDF graphs, Proc. VLDB Endow., 2011.

Open Public Data, openpublicdata.com, 2017.

Joshi, Logical Linked Data Compression, pp.170-184, 2013.
DOI : 10.1007/978-3-642-38288-8_12

Kellogg, 2014.

Kim, Type-based Semantic Optimization for Scalable RDF Graph Pattern Matching, Proceedings of the 26th International Conference on World Wide Web, WWW '17, pp.785-793, 2017.

H. Knublauch, Schema.org (converted to SHACL by TopQuadrant). http://datashapes.org/schema. [Online; accessed 27, 2017.

H. Knublauch and D. Kontokostas, Shapes constraint language (SHACL). W3C recommendation, W3C. https, 2017.

D. Kontokostas, Shape Expressions Community Group. https://www.w3.org/community/shex/. [Online; accessed 27, 2017.

Kostylev, SPARQL with Property Paths, Proceedings of the 14th International Conference on The Semantic Web - ISWC 2015, pp.3-18, 2015.

K. Lee and L. Liu, Scaling queries over big RDF graphs with semantic hash partitioning, Proceedings of the VLDB Endowment, vol.6, issue.14, pp.1894-1905, 2013.
DOI : 10.14778/2556549.2556571

Letelier, Static analysis and optimization of semantic web queries, Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS '12, pp.89-100, 2012.

Libkin, Querying graph databases with XPath, Proceedings of the 16th International Conference on Database Theory, ICDT '13, pp.129-140, 2013.
DOI : 10.1145/2448496.2448513

A. O. Mendelzon and P. T. Wood, Finding Regular Simple Paths in Graph Databases, SIAM Journal on Computing, vol.24, issue.6, pp.1235-1258, 1995.
DOI : 10.1137/S009753979122370X

T. Neumann and G. Weikum, RDF-3X, Proceedings of the VLDB Endowment, vol.1, issue.1, pp.647-659, 2008.
DOI : 10.14778/1453856.1453927

Pan, Graph Pattern Based RDF Data Compression, Semantic Technology: 4th Joint International Conference, pp.239-256, 2014.
DOI : 10.1007/978-3-319-15615-6_18

Papailiou, H2RDF+: High-performance distributed joins over large-scale RDF graphs, 2013 IEEE International Conference on Big Data, pp.255-263, 2013.

P. Patel-Schneider and P. Hayes, RDF 1.1 semantics. W3C recommendation, W3C, 2014.

Pérez, nSPARQL: A Navigational Language for RDF, pp.66-81, 2008.

Pérez, Semantics and complexity of SPARQL, ACM Trans. Database Syst., vol.34, issue.3, pp.16:1-16:45, 2009.

Pham, Deriving an Emergent Relational Schema from RDF Data, Proceedings of the 24th International Conference on World Wide Web, WWW '15, pp.864-874, 2015.

R. Pichler and S. Skritek, Containment and equivalence of well-designed SPARQL, Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, PODS '14, pp.39-50, 2014.
DOI : 10.1145/2594538.2594542

A. Polleres, From SPARQL to rules (and back), Proceedings of the 16th international conference on World Wide Web, WWW '07, pp.787-796, 2007.
DOI : 10.1145/1242572.1242679

E. Prud'hommeaux, Shape expressions (ShEx) JSON formats, 2017.

E. Prud'hommeaux, Shape expressions (ShEx) primer, 2017.

Prud'hommeaux, Shape expressions language 2.0, 2017.

Prud'hommeaux, Shape expressions: An RDF validation and transformation language, Proceedings of the 10th International Conference on Semantic Systems, SEM '14, pp.32-40, 2014.

E. Prud'hommeaux and A. Seaborne, SPARQL query language for RDF. W3C recommendation, W3C, 2008.

Robie, XML path language (XPath) 3.1. W3C recommendation, W3C. https, 2017.

Schmidt, Foundations of SPARQL query optimization, Proceedings of the 13th International Conference on Database Theory, ICDT '10, pp.4-33, 2010.
DOI : 10.1145/1804669.1804675

G. Schreiber and Y. Raimond, RDF 1.1 primer. W3C note, W3C, 2014.

S. Schulz, System Description: E 1.8, pp.735-743, 2013.
DOI : 10.1007/978-3-642-45221-5_49

A. Seaborne and S. Harris, SPARQL 1.1 query language. W3C recommendation, W3C, 2013.

Serfiotis, Containment and Minimization of RDF/S Query Patterns, Proceedings of the 4th International Conference on The Semantic Web, ISWC'05, pp.607-623, 2005.
DOI : 10.1007/11574620_44

Staworko, Complexity and Expressiveness of ShEx for RDF, 18th International Conference on Database Theory (ICDT 2015), volume 31 of Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, pp.195-211, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01218552

G. Sutcliffe, The CADE ATP System Competition, 2017.
DOI : 10.1007/978-3-540-25984-8_36

G. Sutcliffe, TPTP Format for Problems, Problems.html, 2017.

The Apache Software Foundation, Apache Jena, 2011. [Online; accessed 17, 2017.

SHACL Tutorial: Getting Started. https://www.topquadrant.com/technology/shacl, 2001.

Weidenbach, SPASS Version 3.5, Proceedings of the 22nd International Conference on Automated Deduction, CADE-22, pp.140-145, 2009.

WEB INDEX, 2014.

Zaharia, Resilient Distributed Datasets, Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, NSDI'12, pp.2-2, 2012.
DOI : 10.1145/2886107.2886110

Zhang, EAGRE: Towards scalable I/O efficient SPARQL query evaluation on the cloud, 2013 IEEE 29th International Conference on Data Engineering (ICDE), pp.565-576, 2013.

Zhang, On the Satisfiability Problem for SPARQL Patterns, Journal of Artificial Intelligence Research, vol.56, issue.1, pp.403-428, 2016.
DOI : 10.1613/jair.5028

Zou, gStore, Proc. VLDB Endow., pp.482-493, 2011.
DOI : 10.14778/2002974.2002976

& (revieww(X) => ((is(X, Y) => Y = review) & (~is_max2(X, Y)))) & (reviewerr(X) => ((is

Listing A.1: Schema 1: Encoding into FOF TPTP Syntax as Axioms
fof(schema1, axiom

)) & (name(X, Y) => string(Y)) & (~name_max2(X, Y)))))))

Listing (Appendix A): Encoding into FOF TPTP Syntax as Axioms
fof(schema1, axiom

)) & (name(X, Y) => string(Y)))))))

Listing (Appendix A): Schema 3: Encoding into FOF TPTP Syntax as Axioms
fof(schema1, axiom

& (productPropertyNumeric2(X, Y) => propertyNumericc(Y)) & (~productPropertyNumeric2_max2
)) & (name(X, Y) => string(Y)))) & (propertyNumericc(X) => ((is(X, Y) => Y = propertyNumeric) & (~is_max2(X, Y)) & (value(X, Y) => integer(Y)) & (~value_max2(X, Y)))))))

B. Appendix: Chapter 10 Experiments

[...] pname ?x5 .
  ?x6 ex:phasType ?x7 .
  ?x4 ex:pisSubclassOf ?x7 . }

Listing B.2: Query (Q2)
[...] ?x0, ?x3, ?x2, ?x1 WHERE {
  ?x1 ex:pname ?x0 .
  ?x2 ex:phasType ?x1 .
  ?x3 ex:phasInterest ?x2 .
  ?x3 ex:pbirthday ?x4 .
  ?x5 ex:pcreationDate ?x4 .
  ?x5 ex:pisLocatedIn ?x6 .
  ?x6 ex:pname ?x7 .
  ?x8 ex:pname ?x7 .
  ?x9 ex:pisPartOf ?x8 [...]

Query (Q3)
[...]
  ?x1 ex:pemail ?x0 .
  ?x1 ex:plikes ?x2 .
  ?x2 ex:pisLocatedIn ?x3 .
  ?x3 ex:pisPartOf ?x4 .
  ?x4 ex:pname ?x5 .
  ?x6 ex:pname ?x5 .
  ?x7 ex:pstudyAt ?x6 .
  ?x7 ex:pisLocatedIn ?x8 [...]

Query (Q4)
SELECT ?x2, ?x1, ?x4, ?x0, ?x3 WHERE {
  ?x0 ex:phasMember ?x9 .
  ?x9 ex:pname ?x1 .
  ?x10 ex:phasModerator ?x1 .
  ?x2 ex:pspeaks ?x10 .
  ?x2 ex:phasModerator ?x11 .
  ?x11 ex:pknows ?x3 .
  ?x3 ex:pknows ?x4 .
  ?x5 ex:pisLocatedIn ?x0 .
  ?x5 ex:pgender ?x6 .
  ?x7 ex:pspeaks ?x6 .
  ?x8 ex:phasMember ?x7 .
  ?x8 ex:phasModerator ?x4 . }

Listing B.5: Query (Q5)
SELECT ?x4, ?x2, ?x3, ?x5, ?x0, ?x1 WHERE {
  ?x1 ex:pname ?x0 .
  ?x1 ex:pname ?x2 .
  ?x3 ex:pname ?x2 .
  ?x4 ex:pisPartOf ?x3 .
  ?x4 ex:pisPartOf ?x5 .
  ?x5 ex:pname ?x6 .
  ?x7 ex:pgender ?x6 .
  ?x7 ex:pgender ?x8 .
  ?x9 ex:pname ?x8 .
  ?x9 ex:pname ?x10 [...]

Query (Q6)
SELECT ?x3, ?x4, ?x5, ?x2, ?x0, ?x1 WHERE {
  ?x1 ex:pworksAt ?x0 .
  ?x1 ex:pstudyAt ?x2 .
  ?x2 ex:pname ?x3 .
  ?x4 ex:pname ?x3 .
  ?x4 ex:pname ?x5 .
  ?x6 ex:pname ?x5 .
  ?x6 ex:plocationIP ?x7 .
  ?x8 ex:pbrowserUsed ?x7 [...]

Query (Q7)
SELECT ?x2, ?x1, ?x3, ?x0 WHERE {
  ?x0 ex:pisLocatedIn ?x1 .
  ?x2 ex:pisPartOf ?x1 .
  ?x3 ex:pisLocatedIn ?x2 .
  ?x3 ex:pgender ?x4 .
  ?x5 ex:pname ?x4 .
  ?x5 ex:pname ?x6 [...]

Query (Q8)
SELECT ?x0, ?x3, ?x2, ?x1, ? [...]