Cardinality Estimation of an SQL Query Using Recursive Neural Networks


  • Davit S. Karamyan Yerevan State University



Recursive neural networks, Representation learning, Cardinality estimation, Compositionality, Complex logical expressions


To learn complex interactions between predicates and accurately estimate the cardinality of an SQL query, we develop a novel framework based on recursive tree-structured neural networks, which take into account the natural properties of logical expressions: compositionality and n-ary associativity. The proposed architecture is an extension of MSCN (multi-set convolutional network) for queries containing both conjunction and disjunction operators. The goal is to represent an arbitrary logical expression in a continuous vector space by combining sub-expression vectors according to the operator type. We compared the proposed approach with the histogram-based approach on the real-world dataset and showed that our approach significantly outperforms histograms with a large margin.


A. Kipf, T. Kipf, B. Radke, V. Leis, P. Boncz and A. Kemper, Learned Cardinalities: Estimating Correlated Joins with Deep Learning, arXiv preprint arXiv:1809.00677, 2018.

S. Krishnan, Z. Yang, K. Goldberg, J. Hellerstein and I. Stoica, Learning to Optimize Loin Queries with Deep Reinforcement Learning, arXiv preprint arXiv:1808.03196, 2018.

R. Marcus and O. Papaemmanouil, “Deep reinforcement learning for join order enumeration”, In Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, pp. 1-4, 2018.

J. Ortiz, M. Balazinska, J. Gehrke, and S. S. Keerthi, “Learning state representations for query optimization with deep reinforcement learning”, In Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, pp. 1-4, 2018.

J. Ortiz, M. Balazinska, J. Gehrke, and Sathiya S, An empirical analysis of deep learning for cardinality estimation, arXiv preprint arXiv:1905.06425, 2019.

Z. Yang, E. Liang, A. Kamsetty, C. Wu, Y. Duan, X. Chen, P. Abbeel, J. M. Hellerstein, S. Krishnan, and I. Stoica, Deep unsupervised cardinality estimation, arXiv preprint arXiv:1905.04278, 2019.

T. Kraska, A. Beutel, E. H. Chi, J. Dean and N. Polyzotis, “The case for learned index structures”, In Proceedings of the 2018 International Conference on Management of Data, pp. 489-504, 2018.

C. Goller and A. Kuchler, “Learning task-dependent distributed representations by backpropagation through structure”, In Proceedings of International Conference on Neural Networks (ICNN96), IEEE,vol. 1, pp. 347-352, 1996.

G. Moerkotte, T. Neumann, and G. Steidl, “Preventing bad plans by bounding the impact of cardinality estimation errors”,Proceed- ings of the VLDB Endowment, vol. 2, no. 1, pp. 982-993, 2009.

O. Ivanov and S. Bartunov, Adaptive cardinality estimation, arXiv preprint arXiv:1711.08330, 2017.

Y.Wang, C.Xiao, J.Qin, X.Cao, Y.Sun, W.Wang, and M.Onizuka, “Monotonic cardinality estimation of similarity selection: A deep learning approach”, In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 1197-1212, 2020.

R. Hayek and O. Shmueli, Nn-based transformation of any sql cardinality estimator for handling distinct, and, or and not, arXiv preprint arXiv:2004.07009, 2020.

R.Marcusand O.Papaemmanouil, Flexible operator embeddings via deep learning, arXiv preprint arXiv:1901.09090, 2019.

O. Irsoy and C. Cardie, “Deep recursive neural networks for compositionality in language”, In Advances in Neural Information Processing Systems, pp. 2096-2104, 2014.

R. Socher, J. Pennington, E. H. Huang, A. Y. Ng and C. D. Manning, “Semi-supervised recursive autoencoders for predicting sentiment distributions, In Proceedings of the 2011 conference on empirical methods in natural language processing, pp. 151-161, 2011.

R. Socher, E. H. Huang, J. Pennin, C. D. Manning and A. Y. Ng, “Dynamic pooling and unfolding recursive autoencoders for paraphrase detection”, In Advances in Neural Information Processing Systems, pp. 801-809, 2011.

I. Dagan, O. Glickman and B. Magnini, “The pascal recognising textual entailment challenge”, in Machine Learning Challenges Workshop, Springer, pp. 177-190, 2005.

S. Bowman, C. Potts and C. D. Manning, “Recursive neural networks can learn logical semantics”, In Proceedings of the 3rd workshop on continuous vector space models and their compositionality, pp. 12-21, 2015.

M. Allamanis, P. Chanthirasegaran, P. Kohli and C. Sutton, “Learning continuous semantic representations of symbolic expressions”, In International Conference on Machine Learning, pp. 80-88, 2017.

M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. R. Salakhutdinov and A. J. Smola, “Deep sets”, In Advances in Neural Information Processing Systems, pp. 3391- 3401, 2017.

D.P.Kingma and J.Ba, Adam:A method for stochastic optimization, arXiv preprint arXiv:1412.6980, 2014




How to Cite

Karamyan, D. S. . (2021). Cardinality Estimation of an SQL Query Using Recursive Neural Networks. Mathematical Problems of Computer Science, 54, 41–52.