HashMaps.hpp File Reference

Plans for Maxima-components regarding maps. More...

Go to the source code of this file.

Detailed Description

Plans for Maxima-components regarding maps.

Introduce memoise function wrapper
  • Often we wish to call the same function many times in many different places with the same arguments, and the result takes a considerable amount of time to compute.
  • A solution to this is to store the result in a hash table where the key is the arguments to the function.
  • Such a solution can easily be written into a single function (say memoise_f) which given a function name, and arguments to that function:
    • returns the result of that function from a global hash_table if the hash key occurs.
    • computes the result and then stores the value with the hash key for later re-use otherwise.
    where the hash key is the list with the function name and arguments.
  • It seems better to take an empty hash table as an argument to memoise_f rather than keeping a global hash_table, so that any other function memoise_f has this hash_table properly scoped and can destroy it is no longer needed.
  • See "Functions should not cache return values" in ComputerAlgebra/Cryptology/Lisp/Cryptanalysis/Rijndael/plans/general.hpp .
Using the Maxima "associative arrays"
  • One could try an alternative implementation of hash-maps via the associative arrays provided by Maxima.
  • Testing the speed:
    test1(n) := block([N : setn(n), P, MS, c0, h, t],
      P : powerset(N),
     MS : powerset(P),
     c0 : lambda([S], setdifference(P,S)),
     t : elapsed_run_time(),
     h : osm2hm(ll2osm(listify(MS),create_list(i,i,1,2^(2^n)))),
     map(lambda([S],ev_hm(h,c0(S))), listify(MS))
    test2(n) := block([N : setn(n), P, MS, c0, h, t],
      P : powerset(N),
     MS : powerset(P),
     c0 : lambda([S], setdifference(P,S)),
     t : elapsed_run_time(),
     block([i : 1],
       for S in MS do (h[S] : i, i : i+1)
     map(lambda([S],h[c0(S)]), listify(MS))
  • test2 is much faster, but h becomes a global variable.
  • So one needs to create a new variable-name to be used here.
  • Using "local" prevents h from becoming a global variable:
    (%i21) f(x) := block([a : 5], b[1] : a * x)$
    (%i22) f(1);
    (%o22) 5
    (%i23) b[1];
    (%o23) 5
    (%i24) f(x) := block([a : 5], local(c), c[1] : a * x)$
    (%i25) f(4);
    (%o25) 20
    (%i26) c[1];
    (%o26) c[1]
New naming conventions
  • Rename old functions as indicated in ComputerAlgebra/DataStructures/Lisp/HashMaps.mac.
  • The general abbreviations are:
    1. "mp" for map (with one argument), with "lmp" for a "lambda map", that is, a map realised by a lambda-term, and with "lmpa" for a lambda map with build-in array
    2. "sm" for set-map
    3. "hm" for hash-map
    4. "ary" for okl-arrays
    5. "l" for list, with "il" for integer list, and "fl" for floating point list
    6. "s" for set
Concept of a set-theoretical map
  • All this should go to ComputerAlgebra/Sets.
  • DONE (see ComputerAlgebra/DataStructures/Lisp/HashMaps.mac) We should institutionalise the concept of a "set-theoretic" map as a set of pairs.
  • Operations:
    1. DONE check for right-uniqueness
    2. extract domain
    3. extract range
    4. DONE compute value for a single argument
    5. compute value set for a set of arguments
    6. composition of partial and of total maps
  • We need the same functionality also for ordered set-maps.
  • We also should have the concept of a (set-theoretical) relation as set of pairs!
    1. Perhaps then we create a new sub-module "Relations.mac", which also contains the set-maps.
  • The alternative recursive implementation of allinj_sm
    allinj_sm(A,B) := if length(A) > length(B) then {} 
     elseif emptyp(A) then {{}}
      block([a : first_element(A), RA],
       RA : disjoin(a,A),
          map(lambda([f],adjoin([a,b],f)), allinj_sm(RA,disjoin(b,B))),
          b, listify(B))))$
    should be mentioned in the documentation.
Improving the hash-maps
  • Determining whether key x is contained in hash-map h:
    1. Yet we can use ev_hm(h,x) if values are never "false".
    2. Otherwise one has to use a value y which can never occur, and one has to check via ev_hm_d(h,x,y).
    3. So well, should be doable, but it would be nice to have a direct way of querying whether x is already present.
  • Perhaps we provide specialised hash-maps for cases where we know a good hash-function for the key, and then we do not need to use the conversion to strings:
    1. Especially for integer arguments this seems interesting.
    2. The underlying Lisp-functions seem to have already the ability to use different hash-functions; we need to understand this better.
Improving the okl-arrays
  • If "lambda" becomes abbreviated by "lm", why do we have the names "ary2lmp2", "lmpa2ary" and "lmpa2l" ?
  • "p2" seems to be "part 2" (for the extraction).
  • But also "pa2" seems to mean "part 2", so we should decide which form to choose.
  • Compare "Conversions" in ComputerAlgebra/CombinatorialMatrices/Lisp/plans/Basics.hpp.
DONE (removed the additional assumptions and tests) False assumption for testing hash-maps
  • The test-functions eq_ohmsm_p and eq_hmsm_p assume that two hash-maps, which are equal as set-maps, yield the same strings, but this is not true for ecl, where the strings are not, as with clisp, a translation of the set-maps, but just contain (apparently) a memory address.
  • More precisely, with clisp the hash-maps as strings contains the list of assignments as well as the annotation-string (if present), while with ecl none of this information is contained, but a reference to the underlying object.
  • DONE (we drop the tests --- they were created when we didn't fully understand the issues, and thus became overly cautious) The simplest solution is to abandon these extra tests. Or one could distinguish here between clisp and ecl.
  • DONE (we don't make assumptions on the order of entries in a hash-map) Apparently one must treat the order of the pairs in the hash-map as implementation-defined (so that one shouldn't use it).

Definition in file HashMaps.hpp.