# Cardinality Estimation of an SQL Query Using Recursive Neural Networks

## DOI:

https://doi.org/10.51408/1963-0058## Keywords:

Recursive neural networks, Representation learning, Cardinality estimation, Compositionality, Complex logical expressions## Abstract

To learn complex interactions between predicates and accurately estimate the cardinality of an SQL query, we develop a novel framework based on recursive tree-structured neural networks, which take into account the natural properties of logical expressions: compositionality and n-ary associativity. The proposed architecture is an extension of MSCN (multi-set convolutional network) for queries containing both conjunction and disjunction operators. The goal is to represent an arbitrary logical expression in a continuous vector space by combining sub-expression vectors according to the operator type. We compared the proposed approach with the histogram-based approach on the real-world dataset and showed that our approach significantly outperforms histograms with a large margin.

## References

A. Kipf, T. Kipf, B. Radke, V. Leis, P. Boncz and A. Kemper, Learned Cardinalities: Estimating Correlated Joins with Deep Learning, arXiv preprint arXiv:1809.00677, 2018.

S. Krishnan, Z. Yang, K. Goldberg, J. Hellerstein and I. Stoica, Learning to Optimize Loin Queries with Deep Reinforcement Learning, arXiv preprint arXiv:1808.03196, 2018.

R. Marcus and O. Papaemmanouil, “Deep reinforcement learning for join order enumeration”, In Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, pp. 1-4, 2018.

J. Ortiz, M. Balazinska, J. Gehrke, and S. S. Keerthi, “Learning state representations for query optimization with deep reinforcement learning”, In Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, pp. 1-4, 2018.

J. Ortiz, M. Balazinska, J. Gehrke, and Sathiya S, An empirical analysis of deep learning for cardinality estimation, arXiv preprint arXiv:1905.06425, 2019.

Z. Yang, E. Liang, A. Kamsetty, C. Wu, Y. Duan, X. Chen, P. Abbeel, J. M. Hellerstein, S. Krishnan, and I. Stoica, Deep unsupervised cardinality estimation, arXiv preprint arXiv:1905.04278, 2019.

T. Kraska, A. Beutel, E. H. Chi, J. Dean and N. Polyzotis, “The case for learned index structures”, In Proceedings of the 2018 International Conference on Management of Data, pp. 489-504, 2018.

C. Goller and A. Kuchler, “Learning task-dependent distributed representations by backpropagation through structure”, In Proceedings of International Conference on Neural Networks (ICNN96), IEEE,vol. 1, pp. 347-352, 1996.

G. Moerkotte, T. Neumann, and G. Steidl, “Preventing bad plans by bounding the impact of cardinality estimation errors”,Proceed- ings of the VLDB Endowment, vol. 2, no. 1, pp. 982-993, 2009.

O. Ivanov and S. Bartunov, Adaptive cardinality estimation, arXiv preprint arXiv:1711.08330, 2017.

Y.Wang, C.Xiao, J.Qin, X.Cao, Y.Sun, W.Wang, and M.Onizuka, “Monotonic cardinality estimation of similarity selection: A deep learning approach”, In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 1197-1212, 2020.

R. Hayek and O. Shmueli, Nn-based transformation of any sql cardinality estimator for handling distinct, and, or and not, arXiv preprint arXiv:2004.07009, 2020.

R.Marcusand O.Papaemmanouil, Flexible operator embeddings via deep learning, arXiv preprint arXiv:1901.09090, 2019.

O. Irsoy and C. Cardie, “Deep recursive neural networks for compositionality in language”, In Advances in Neural Information Processing Systems, pp. 2096-2104, 2014.

R. Socher, J. Pennington, E. H. Huang, A. Y. Ng and C. D. Manning, “Semi-supervised recursive autoencoders for predicting sentiment distributions, In Proceedings of the 2011 conference on empirical methods in natural language processing, pp. 151-161, 2011.

R. Socher, E. H. Huang, J. Pennin, C. D. Manning and A. Y. Ng, “Dynamic pooling and unfolding recursive autoencoders for paraphrase detection”, In Advances in Neural Information Processing Systems, pp. 801-809, 2011.

I. Dagan, O. Glickman and B. Magnini, “The pascal recognising textual entailment challenge”, in Machine Learning Challenges Workshop, Springer, pp. 177-190, 2005.

S. Bowman, C. Potts and C. D. Manning, “Recursive neural networks can learn logical semantics”, In Proceedings of the 3rd workshop on continuous vector space models and their compositionality, pp. 12-21, 2015.

M. Allamanis, P. Chanthirasegaran, P. Kohli and C. Sutton, “Learning continuous semantic representations of symbolic expressions”, In International Conference on Machine Learning, pp. 80-88, 2017.

M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. R. Salakhutdinov and A. J. Smola, “Deep sets”, In Advances in Neural Information Processing Systems, pp. 3391- 3401, 2017.

D.P.Kingma and J.Ba, Adam:A method for stochastic optimization, arXiv preprint arXiv:1412.6980, 2014

## Downloads

## Published

## How to Cite

*Mathematical Problems of Computer Science*,

*54*, 41–52. https://doi.org/10.51408/1963-0058

## Issue

## Section

## License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.