Invalidating a Conjecture on the Average Number of Closed Sets in a Random Database

In this paper, we invalidate a conjecture from [3] which stated that given a random database with n columns and m lines, the average number of closed sets of size x with a support y is maximized when x = log n and y = log m. We prove a refinement of this conjecture and obtain that x = log m − log log log m + O(1) and y = log n − log log log n + O(1). From there we obtain a estimation of the average number of closed sets in a random database.

Keywords: Average analysis, closed sets, data mining.