Queries about the largest empty rectangle in large 2-dimensional datasets stored in secondary memory
Consultas sobre el rectángulo vacío de mayor área en grandes conjuntos de datos de dos dimensiones, almacenados en memoria secundaria
Keywords:
geometric algorithms, spatial databases, geometric problems (en)Algoritmos geométricos, bases de datos espaciales, problemas geométricos. (es)
Let be a set of points located in a rectangle and is a point that is not in . This article describes the design, implementation, and experimentation of different algorithms to solve the following two problems: (i) Maximum Empty Rectangle (MER), which consists in finding an empty rectangle with a maximum area contained in R and does not contain any point from and (ii) Query Maximum Empty Rectangle (QMER), which consists in finding the rectangle with the same restrictions given for the MER problem but must also contain . It is assumed that both problems have insufficient main memory to store all the objects in set . According to the literature, both problems are very practical in fields such as data mining and Geographic Information Systems (GIS). Specifically, the present study proposes two algorithms that assume that is stored in secondary memory (mainly disk) and that it is impossible to store it completely in main memory. The first algorithm solves the QMER problem and consists of decreasing the size of S by using dominance areas and then processing the points that are not eliminated using an algorithm proposed by Orlowski (1990). The second algorithm solves the MER problem and consists of dividing R into four subrectangles that generate four subsets of similar size; these are processed using an algorithm proposed in Edmons et al. (2003), and finally the partial solutions are combined to obtain a global solution. For the purpose of verifying algorithm efficiency, results are shown for a series of experiments that consider synthetic and real data.
References
Aggarwal, A., & Suri, S. (1987, October). Fast algorithms for computing the largest empty rectangle. In Proceedings of the third annual symposium on Computational geometry (pp. 278-290). ACM.
Augustine, J., Das, S., Maheshwari, A., Nandy, S. C., Roy, S., & Sarvattomananda, S. (2010). Recognizing the largest empty circle and axis-parallel rectangle in a desired location. arXiv preprint arXiv:1004.0558.
Augustine, J., Das, S., Maheshwari, A., Nandy, S., Roy, S., & Sarvattomananda, S. (2010). Querying for the largest empty geometric object in a desired location. arXiv preprint arXiv:1004.0558.
Blott S, Weber R (1997) A simple vector-approximation fi le for similarity search in high-dimensional vector spaces. Technical report, Institute of Information Systems, ETH Zentrum, Zurich, Switzerland
Böhm, C., & Kriegel, H. P. (2001, September). Determining the convex hull in large multidimensional databases. In International Conference on Data Warehousing and Knowledge Discovery (294-306). Springer Berlin
Heidelberg.
Chazelle, B., Drysdale, R. L., & Lee, D. T. (1986). Computing the largest empty rectangle. SIAM Journal on Computing, 15(1), 300-315.
Corral, A., Manolopoulos, Y., Theodoridis, Y., & Vassilakopoulos, M. (2004). Algorithms for processing K-closest-pair queries in spatial databases. Data & Knowledge Engineering, 49(1), 67-104.
Corral, A., Manolopoulos, Y., Theodoridis, Y., & Vassilakopoulos, M. (2006). Cost models for distance joins queries using R-trees. Data & Knowledge Engineering, 57(1), 1-36.
De, M., & Nandy, S. C. (2011). Inplace algorithm for priority search tree and its use in computing largest empty axisparallel rectangle. arXiv preprint arXiv:1104.3076.
De, M., & Nandy, S. C. (2011). Space-efficient Algorithms for Empty Space Recognition among a Point Set in 2D and 3D. In CCCG.
Edmonds, J., Gryz, J., Liang, D., & Miller, R. J. (2003). Mining for empty spaces in large data sets. Theoretical Computer Science, 296(3), 435-452.
Gutiérrez, G., & Paramá, J. R. (2012, June). Finding the largest empty rectangle containing only a query point in large multidimensional databases. In International Conference on Scientific and Statistical Database Management (pp. 316-333). Springer Berlin Heidelberg.
Gutiérrez, G., Paramá, J. R., Brisaboa, N., & Corral, A. (2014). The largest empty rectangle containing only a query object in Spatial Databases. GeoInformatica, 18(2), 193-228.
Guttman, A. (1984). R-trees: a dynamic index structure for spatial searching, 14(2), 47-57. ACM.
Kaplan, H., Mozes, S., Nussbaum, Y., & Sharir, M. (2012, January). Submatrix maximum queries in Monge matrices and Monge partial matrices, and their applications. In Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms (pp. 338-355). SIAM.
Lara, F. (2014) Implementación en Java de algoritmos geométricos sobre grandes conjuntos de datos. In: Memoria de Titulo, Ingeniería Civil en Informática, Universidad del Bío-Bío.
Naamad, A., Lee, D. T., & Hsu, W. L. (1984). On the maximum empty rectangle problem. Discrete Applied Mathematics, 8(3), 267-277.
Nandy, S. C., & Bhattacharya, B. B. (1998). Maximal empty cuboids among points and blocks. Computers & Mathematics with Applications, 36(3), 11-20.
Orlowski, M. (1990). A new algorithm for the largest empty rectangle problem. Algorithmica, 5(1-4), 65-73.
Roussopoulos, N., Kelley, S., & Vincent, F. (1995, June). Nearest neighbor queries. In ACM sigmod record, 24(2), 71-79. ACM.
License
Copyright (c) 2017 Felipe Lara, Gilberto Gutiérrez, Maria Antonieta Soto, Antonio Corral

This work is licensed under a Creative Commons Attribution 4.0 International License.
The authors or holders of the copyright for each article hereby confer exclusive, limited and free authorization on the Universidad Nacional de Colombia's journal Ingeniería e Investigación concerning the aforementioned article which, once it has been evaluated and approved, will be submitted for publication, in line with the following items:
1. The version which has been corrected according to the evaluators' suggestions will be remitted and it will be made clear whether the aforementioned article is an unedited document regarding which the rights to be authorized are held and total responsibility will be assumed by the authors for the content of the work being submitted to Ingeniería e Investigación, the Universidad Nacional de Colombia and third-parties;
2. The authorization conferred on the journal will come into force from the date on which it is included in the respective volume and issue of Ingeniería e Investigación in the Open Journal Systems and on the journal's main page (https://revistas.unal.edu.co/index.php/ingeinv), as well as in different databases and indices in which the publication is indexed;
3. The authors authorize the Universidad Nacional de Colombia's journal Ingeniería e Investigación to publish the document in whatever required format (printed, digital, electronic or whatsoever known or yet to be discovered form) and authorize Ingeniería e Investigación to include the work in any indices and/or search engines deemed necessary for promoting its diffusion;
4. The authors accept that such authorization is given free of charge and they, therefore, waive any right to receive remuneration from the publication, distribution, public communication and any use whatsoever referred to in the terms of this authorization.










