general.hpp File Reference

On investigations regarding Sudoku problems (standard, and extended) More...

Go to the source code of this file.

Detailed Description

On investigations regarding Sudoku problems (standard, and extended)

Besides experimentations in creating and solving big Sudoku problems (with or without the uniqueness constraint) one should harness the popularity of Sudoku to teach (generalised) satisfiability.

Translations to boolean CNFs based on the direct encoding
  • In ComputerAlgebra/Satisfiability/Lisp/Generators/Sudoku.mac we provide three translations:
    1. output_weak_sdk_stdname
    2. output_dual_weak_sdk_stdname
    3. output_strong_sdk_stdname
  • In general we have four types of clauses for the direct encoding:
    1. primary binary clauses: different numbers per line
    2. primary long clauses: every field gets a number
    3. dual binary clauses: at most one number per field
    4. dual long clauses: every line contains all numbers
  • Thus we get 2^4=16 possible combinations, but not all combinations are valid:
    1. Long clauses are needed, otherwise no field getting a number would be a solution.
    2. And at least all primary or all dual clauses are needed.
    3. So we have 3 + 3 + 1 = 7 valid possibilities.
    4. How to call them? The primary clauses (only) has been called already "weak", only the dual clauses "dual weak", and primary and dual together "strong". Remain the 4 mixed cases.
    5. Perhaps "weak pb", "weak pl", "dual weak pb" and "dual weak pl" ("p" for "plus") ?
  • Previous experimentations would suggest that "weak pl" is a good combination.
Finding hard instances for box-dimension 3
  • A source for "hard" instances is [extreme Su Doku; Wayne Gould].
    1. The cover-text says "... Wayne Gould, who'll take you deep inside his genius psyche.", and "..., he has been named by Time Magazine as one of the 100 people who shaped the world.".
    2. One sees that enlightenment is needed.
    3. As one can see in ComputerAlgebra/Satisfiability/Lisp/Generators/docus/Sudoku.hpp, apparently most problems only need one r_2-reduction, while perhaps the most difficult instance in that collection needs five r_2-reductions (without backtracking, and thus also uniqueness is confirmed).
    4. So likely this source is without further interest.
  • A better source is [Denis Berthier, The Hidden Logic of Sudoku, 2007]
    1. See the collections at http://www.carva.org/denis.berthier .
    2. This source only consider these "humanoid rules" (AI-nonsense).
    3. Perhaps all these rules are covered by r_2 for the strong encoding?
    4. In principle the all-different constraint can not be covered, but perhaps this doesn't happen for these small problems (box-dimension 3)?
  • We should search for problems which are as "hard" as possible.
    1. See "Sampling of minimally uniquely satisfiable problems" in ComputerAlgebra/Satisfiability/Lisp/Generators/plans/Sudoku.hpp.
    2. The "EasterMonster" (see Satisfiability/Lisp/Generators/docus/Sudoku.hpp) needs 9 nodes with the OKsolver_2002; this is the hardest example yet encountered.
    3. The right approach should be to compute the t-hardness, p-hardness and w-hardness!
      1. For uniquely satisfying solutions these hardnesses are given by the hardness of the unsatisfiable problems given by setting (just) one variable to a wrong value.
      2. This needed to be computed for all known instances.
      3. p-hardness 1 means that by directly applying the most basic rules all fields are filled.
      4. p-hardness 2 and hardness 1 means that one needs to fill in one at least one false number, and then apply "basic" reasoning.
      5. Note that in general a hardness value k is not just the worst-case, but also determines the best-case for reasoning: by setting or forbidding values for fields (i.e., assigning a value to a variable) we need precisely r_{k-1} to refute the false values and solve all fields (one after another).
      6. See "Hardness of Sudoku problems" in Investigations/BooleanFunctions/plans/Hardness/general.hpp.
      7. This should replace the following older discussion, based on solver-runs.
    4. There should be harder *unsatisfiable* problems.
    5. While for the uniquely satisfiable problems actually it would be good if the OKsolver could enumerate all solution --- since here this would yield the proof of uniqueness, and only this would be comparable to the rule-based approach, which also always shows uniqueness.
    6. At least we need to extract the solution, add it negated to the conditions, and see what the complexity of the refutation is.
    7. This refutation also contains the direct proof that the given total assignment is the unique solution.
    8. Perhaps it is then enough to search for short (tree-like) resolution refutations of the original sdk-clause-set ("empty field") plus the negated (total) satisfying assignment.
    9. The refutation of the "EasterMonster" with negated solution actually doesn't get any harder than finding the solution; see Satisfiability/Lisp/Generators/docus/Sudoku.hpp.
    10. See below for the "human editing".
  • For the "human" solution of such problems, "explanations" are needed for the solution, which is just a demonstration of the sequence of forced assignments.
    1. In [extreme Su Doku; Wayne Gould] the condition "without reductio ad absurdum" is stated. Can this be made precise?
    2. In [Denis Berthier The Hidden Logic of Sudoku, 2007] one finds a more systematic approach, and also an attempt at defining what "without guessing" could mean.
    3. Perhaps r_2-reduction would not be eligible here, and perhaps one should use full-strength bijectivity-constraints?
    4. Of course, one can simulate r_2-reduction via resolution, which is "forward reasoning"; so perhaps one should search for short resolution derivations of the forced assignments.
    5. Actually, the last remark seems crucial, showing that all that talk about "trial and error" and so on is nonsense: Every forced assignment can be demonstrated by r_k for k big enough, and this "trial-and-error" derivation can be translated into a "direct logical reasoning", namely a resolution refutation (of the unit-clause corresponding to the forced assignment)!
    6. One should do so for example for the above "EasterMonster"-example, where [Denis Berthier] claims "no resolution rule is yet known to be applicable to this puzzle".
    7. He speaks of "rules", while the resolution proof is not a general rule; so well, perhaps some form of automated generalisation is possible?
    8. The full refutation of the EasterMonster with negation is just 9 nodes (and 100 r_2-reductions). Of course a lot of unit-propagations are involved. A good representation of this refutation would be valuable.
Finding hard instances for box-dimension >3
Tools for investigations with concrete Sudoku instances
  • cnf_puzzle
         cnf_puzzle [OPTIONS] empty puzzle
         -c FILE
         Add the clauses in FILE to the output.
         cnf_puzzle outputs a Sudoku puzzle in Dimacs-cnf-format
         according to the direct encoding. The file "empty" is a 
         cnf-file in Dimacs format representing an empty Sudoku puzzle.
         "puzzle" is a string representing a concrete instance of a
         Sudoku puzzle. The puzzle-string contains 81 digits with
         values between 0 and 9. 0 represents an empty cell. The 
         fixed fields of the puzzle are specified by non-zero values.
         If position 1 <= i <= 81 contains non-zero value k then the final Sudoku
         puzzle output has fixed field ((i-1)/9+1, (i-1)%9+1) with value k, where
         / is integer division and % is the modulus operator.
         An optional file argument allows arbitrary other clauses to
         be added to the output. (The intention being to allow the
         negation of a satisfying assignment).
    1. It would be good if optionally instead of the string "empty" the script could read from standard input.
    2. Also good if there would be an option "--strict", which only accepts the string as specified above, while otherwise space-symbols in the string are ignored, and missing values at the end are automatically added as zeros.
    3. Is "FILE" also in Dimacs-format? The point here is about the preamble, the initial comments plus the parameter-line.
    4. DONE (good enough for the beginning) Perhaps allowing arbitrary extra clauses to be appended to the output is a bad idea. Really we only want to allow precisely the negation of the (unique?) satisfying assignment?
    5. DONE (good enough for the beginning) As above, the script is not especially flexible because it has to be called once per puzzle. Nicer would be to allow multiple concrete puzzles as input but then how to control the output so that different puzzles are output to different files?
    6. Larger dimensions:
      1. Possibly the script should be able to handle puzzles of larger dimension but then a different format for the puzzle input is needed.
      2. We should have a look how at the various web-pages these larger instances are specified.

Definition in file general.hpp.