OKlibrary  0.2.1.6
SetSystems.hpp File Reference

Plans for Maxima-components for systems of sets. More...

Go to the source code of this file.


Detailed Description

Plans for Maxima-components for systems of sets.

Todo:
New module ComputerAlgebra/Sets
Todo:
Naming conventions
  • We need abbreviations for "set systems" (or "sets of sets"), "lists of sets" etc.
  • Perhaps simply "setset" and "listset".
  • But this is all too long --- just "ls" and "ss".
  • DONE Then "set_sets_union" should become "union_setset" etc.
Todo:
Unions and intersections
  • Choosing between the different methods to apply union iteratively:
    1. We have two basic extreme test cases: Disjoint union (by create_list({i},i,1,N)), and all sets identical (by create_list(S,i,1,N) for S : setn(n)).
    2. Basically, apply is faster by a factor of 2 for the second case, while tree_reduce is faster (much faster, growing with N) in the first case.
    3. First case with "apply":
      test1_apply(N) := length(uaapply(union,create_list({i},i,1,N)));
           
    4. With Ecl: N = 1000 around 0.8 s, with N = 10000 around 73 s.
    5. With CLisp: N = 1000 around 1.3 s, with N = 10000 around 46 s.
    6. First case with "tree_reduce":
      test1_tree(N) := length(tree_reduce(union,create_list({i},i,1,N)));
           
    7. With Ecl: N = 1000 around 0.1 s, with N = 10000 around 1.2 s.
    8. With CLisp: N = 1000 around 0.2 s, with N = 10000 around 2.2 s.
    9. Second case with "apply":
      test2_apply(N,n) := block([S : setn(n)], length(uaapply(union,create_list(S,i,1,N))));
           
    10. With Ecl: N=1000,n=10 around 0.03 s, with N=10000,n=10 around 0.3 s, with N=1000,n=100 around 0.35 s, with N=10000,n=100 around 3.5 s.
    11. With CLisp: N=1000,n=10 around 0.05 s, with N=10000,n=10 around 0.56 s, with N=1000,n=100 around 0.79 s, with N=10000,n=100 around 7.96 s.
    12. Second case with "tree_reduce":
      test2_tree(N,n) := block([S : setn(n)], length(tree_reduce(union,create_list(S,i,1,N))));
           
    13. With Ecl: N=1000,n=10 around 0.05 s, with N=10000,n=10 around 0.56 s, with N=1000,n=100 around 0.64 s, with N=10000,n=100 around 6.2 s.
    14. With CLisp: N=1000,n=10 around 0.1 s, with N=10000,n=10 around 1.05 s, with N=1000,n=100 around 1.52 s, with N=10000,n=100 around 15.7 s.
    15. Some "mixed cases":
      var_cs_apply(F) := var_sl(uaapply(union,listify(F)))$
      var_cs_tree(F) := var_sl(tree_reduce(union,F))$
           
    16. With Ecl: For weak_php_fcs(n,n) no difference, while with full_fcs(n)[2] "apply" is by 50% faster than "tree". On the other hand, for van-der-Waerden problems "tree" is with a growing factor faster (2 s instead of 15.5 s for vanderwaerden2_fcs(5,300)[2]).
    17. So it seems that with tree_reduce one is on the safe side.
Todo:
Computing all ordered tuples
  • The implementations of all_ord_tuples and all_ord_tuples_l are rather inefficient (especially for larger k).
  • One possibility would be to use use meta-programming, that is, writing a macro which creates the k-times nested loop.
Todo:
Set creation
  • It would be nice to have "create_set", as we have "create_list".
  • However
    create_set([A]) := setify(apply(create_list,A))$
       
    does not work, since "apply" evaluates the argument A.
  • Or, perhaps it works? I (OK) can't find an example where it does not work?
  • It should be possible to make this work, but perhaps this is not worth the effort?
  • Ask on the Maxima mailing list.
  • This should belong to ComputerAlgebra/Sets.
Todo:
Function min_elements
  • Variation:
    1. A similar idea is in [Lintao Zhang, On Subsumption Removal and On-the-Fly {CNF} Simplification, SAT 2005], under "forward subsumption checking".
    2. The task is to check for a given C whether there is D <= C.
    3. There however one runs through all the literals in C and all the clauses in their watched-lists. Thus the algorithms there is rather different.
    4. It is done there this way to avoid the subsumption check. This is achieved by moving the clauses which yet do not subsume C to the next watched-list. This works similar to setting all literals in C to false, and checking whether we obtain the empty clause.
    5. We need to implement this, and compare it with our implementation. It seems to me (OK), that our implementation should be faster in most cases.
  • Currently just choose_element_ = first. A random choice should be better.
  • Perhaps for small lists the simpler algorithms are faster.
  • Upper bound on the time-complexity:
    1. Let m be the number of sets, and let k be the maximal size of a set.
    2. The trivial algorithm runs in time O(m^2*k).
    3. Can we prove a better bound for the improved algorithm?
Todo:
Function max_elements
  • Can we transfer the idea of min_elements_unique_fast_l_ ?
Todo:
Minimal elements for convex set-systems
  • Consider a set-system S with the property, that for A, B in S with A <= B also for all A <= C <= B we have C in S.
  • Then subsumption-elimination can be done by sorting and binary search as follows:
    1. Let m := |S| and let n := union(S) (the size of the base-set).
    2. Copy S into an array A and sort A according to any linear ordering on sets.
    3. Now s in S is minimal iff for all x in s the set s-{x} is not in A, where (non-)elementship can be determined in time log_2(m)*n by binary search.
    4. Thus computation of the minimal elements of S can be done in time O(m*log_2(m)*n^2) (assuming comparison of two sets can be done in time O(n), and thus sorting of A can be done in time O(m*log_2(m)*n)).
    5. Similarly the maximal elements of S can be computed.

Definition in file SetSystems.hpp.