Plans for the translation of Rijndael into active clauses ("SAT constraints") etc in Maxima.
More...
Go to the source code of this file.
Detailed Description
Plans for the translation of Rijndael into active clauses ("SAT constraints") etc in Maxima.
 Bug:
 Test failure with okltest_ncl_ss

okltest_ncl_ss(ncl_ss)
ASSERT: Expression " 96448 = 153048 " does not evaluate to true.
 Todo:
 Simplify AES translation

The current AES translation tries to provide a "general" translation framework, and then translate the AES using that.

If we were translating many ciphers and didn't care about the details, this would be suited, but we do care about the details.

The disadvantages of the current system are:

Variables are hard to identify and have very long names (nested namespaces etc).

Due to large variable names, the size of clausesets in memory is MUCH larger than necessary; see "output_ss_fcl_std runs out of memory
generating one round AES".

Code is hard to read due to accomodating the "framework".

Statistics functions are hard to write because many different translations are bundled together, leading to many case distinctions.

The translation is hard to reason about due to the added complexities of a general translation framework.

The current system should be replaced with a system like with the DES, see Cryptanalysis/DataEncryptionStandard/ConstraintTranslation.mac.

Variables should always be specially and specifically declared; no namespaces or anything similar.

If there are two different translations, then these should be written separately, possibly in separate files.

There should be no "equality" constraints, we just use variables where they occur.

The disadvantage of a more "custom" translation is that it requires far more thought and understanding. However, this is what we want!
 Bug:
 output_ss_fcl_std runs out of memory generating one round AES

Running the following:
oklib_load_all()$
num_rounds : 1$
num_columns : 4$
num_rows : 4$
exp : 8$
final_round_b : false$
box_tran : aes_ts_box$
seed : 1$
mc_tran : aes_mc_bidirectional$
output_ss_fcl_std(num_rounds, num_columns, num_rows, exp, final_round_b, box_tran, mc_tran)$
results in the following error: Maxima encountered a Lisp error:
Memory limit reached. Please jump to an outer pointer, quit program and enlarge the
memory limits before executing the program again.

The Maxima memory limit is 2GB.

Using "oklib_monitor : true$", to watch the progress of the function, shows us that Maxima runs out of memory when it runs standardise_fcl in output_ss_fcl_std.

Before standardise_fcl is called, Maxima uses around 300MB. This is as seen from the "top" command.

There are two issues here:

The translated AES clauseset F is far larger than it needs to be:

F has 31400 variables.

F has 1510056 literal occurrences.

If each literal was stored as a 32bit integer, then F should take up 1510056 * 4 / (1024 * 1024) = ~5.8MB in memory.

However, each literal in F is an unevaluated composition of nounified Maxima functions terms. The string representation of this function could be > 200 characters.

How does Maxima store functions terms?

Passing F to standardise_fcl seems to balloon the memory use from ~300MB to >2GB:

Redefining rename_fcl to:
rename_fcl(FF,VL) := block(local(h),
for V in map("[", FF[1], VL) do h[V[1]] : V[2],
return([create_list(i,i,1,length(FF[1])),
map(
lambda([C], (
map(lambda([L], if L > 0 then h[L] else h[L]), C))), FF[2])]))$
before running output_ss_fcl_std results in Maxima using only ~500MB and running output_fcl_v.

Does the use of Maxima's associative arrays reduce memory usage? Perhaps the conversion of hash keys to strings causes a blowup in memory?

Does the inlining of functions such as substitutetotal_cl and substitutetotal_c result in less copying and therefore less memory usage?

This shouldn't be the case as Maxima lists are passed by reference.

However, various calls to map might create new lists.

Setting the heap size to 4GB allows the computation to go through:
set_heap_size_ecl(2**32);

The full 10 round AES translations (
num_rounds : 1$
) runs out of memory when the heap memory limit is 4GB.

The 4 round AES translation has been running for 3 days and is still running.

The only real sustainable solution is to rewrite the AES translation using short specially named and thought out variables, like in the DES; see "Simplify AES translation" and Cryptanalysis/DataEncryptionStandard/ConstraintTranslation.mac.

DONE For now, the Maxima heap size, heap_size_ecl_okl in Buildsystem/Configuration/ExternalSources/maxima.mak should be increased to 4GB. This would bring it in line with the amount of main memory required in Buildsystem/ReleaseProcess/README.
 Todo:
 Handling external data

We need to offer translations of the AES which use larger (thousands of clauses) representations of the Sbox, field multiplications etc.

However, including these in the OKlibrary git repository drastically increases the size of the repository, and realistically, the repository is not the place for data.

Therefore we should create a directory ExternalSources/data and have a subdirectory "AdvancedEncryptionStandard" in which we can store files like those currently stored in ComputerAlgebra/Cryptology/Lisp/Cryptanalysis/Rijndael/data/ and with much larger representations.

We can then provide a function "ss_load_external_data()" which looks in this directory and loads all of the files found within.

Questions:

Should we be storing Maxima files as the data or CNFs which are then somehow read in?

How does this external data show up in the git repository? There should, at least, be md5sums for each of the files.

What directory structure to use? Is "ExternalSources/data/AdvancedEncryptionStandard" sufficient or should we maintain a more fine grained structure mirroring the modules in the git repository?
 Todo:
 Order of small scale matrix dimensions

Functions such as "ss_fcl" take first the number of columns and then the number of rows.

Taking the arguments in this order is inconsistent with the standard order for specifying matrix dimensions (first the number of rows, then the number of columns).

The order should be corrected and all references to the altered functions updated.
 Todo:
 Rewrite ncl_list_ss correctly

Due to the large number of cases, ncl_list_ss is currently written programmatically, using list functions to combine the results of various other statistics functions.

Ideally this should be rewritten to enumerate all the cases and list all of the formulas for the statistics in full.
 Todo:
 Remove AESspecific translation

The small scale AES translation handles the full AES, and therefore there is no need for a separate AES translation now, and the duplication of code means we constantly have to update two different versions.

Therefore, the full AES translation should be removed and the helper functions made to use the small scale translation.
 Todo:
 Complete small scale helper functions

We need generation and output functions for the small scale translation.

Some exist (see ComputerAlgebra/Cryptology/Lisp/Cryptanalysis/Rijndael/Translation.mac), but these need to be checked and some thought given to whether they cover all experimental instances we wish to generate.

For example, can we integrate "Rearranging linear components of Sbox
and MixColumns".

See also "Dimensions" in Investigations/Cryptography/AdvancedEncryptionStandard/plans/Experimentation.hpp .
 Todo:
 Standardise output files names
 Todo:
 How to represent elements of arbitrary fields as boolean variables?

In CryptoSystems/Lisp/Rijndael/SmallScaleAdvancedEncryptionStandard.mac an arbitrary field can be used for the small scale AES word field.

However, a GF(3^2) field would have 9 elements, and there is no longer a simple match between polynomial coefficients of the element and variables in the translation.

One solution is to translate the problem into a nonboolean problem, and then translate from there to a boolean problem using the standard translation.

What is the right approach?

For now, this is rather a moot point as we do not use this generalisation, but it seems wrong to have different levels of generality within the implementation and translation.
 Todo:
 Provide additional translation into CSPsolver format

A translation of the AES into a format usable by popular CSP solvers should be undertaken. This should allow comparison to the SAT translation.

Is there a generic input format (like DIMACS) for CSP problems?

Translating to something Minion can read seems reasonably simple, using constraints such as:

difference (for xor)

table or negativetable (for sbox, GF multiplication etc)
one should be able to generate a simple AES CSP problem.

The specification of the sbox and multiplication operations are then separate from their instances within the constraint set. This should allow for a variety of different translations (including using large prime implicate representations).
 Todo:
 Write docus

This todo should be split across the plans files.

Documentation should be written explaining the way the translation system works, as well as pointing out the standard functions to use.

Much of this information can be moved from "Fix translation system".

Documentation should be split into the general constraint rewriting tools, and the AES specific documentation.

For docus on the general constraint rewriting, see ComputerAlgebra/Cryptology/Lisp/Cryptanalysis/Rijndael/docus/ConstraintTemplateRewriteSystem.hpp.

For docus on the AES specific translation functions, see ComputerAlgebra/Cryptology/Lisp/Cryptanalysis/Rijndael/docus/Translations.hpp.

For the general constraint rewriting, the following needs to be done:

Extend the example with a translation to SAT.

Explain the naming conventions.

Explain the need for constraints to contain enough information to be unique.

Discuss or link to discussions on the design criteria.

For the AES translation, the following needs to be done:

List key functions with explanations:

What are the convenience functions?

What are the output functions?

What are the constraints?

What are the variables?

What are the translation functions?

List naming conventions.

Explain namespace creation conventions:

For new constraints new namespaces are constructed by using an unevaluated lambda expression to include the "parent" constraint as an argument. This means the namespace encodes the history of how the variable was created.

However, for efficiency (space) reasons, the variables are removed from the constraint. Otherwise, nesting namespaces results in a rapid growth in the size of variable "tokens".

Additional parameters need to be added to constraints, to ensure similar constraints created by the same rewrite rule are still uniquely identifiable after the removal of variables from the constraint.

What are the AES parameters?

What are the key papers?

Where is the AES implementation at the Maxima level?

How to handle different parameter settings?

What is the parameter space? Move discussion from "Investigating dimensions" in Investigations/Cryptography/AdvancedEncryptionStandard/plans/Experimentation.hpp into the docus.

Add example AES translation with statistics.

The following information on the different possible translations should also be included and linked to Investigations/Cryptography/AdvancedEncryptionStandard/plans/general.hpp:

We model a generalised AES system (see ComputerAlgebra/Cryptology/Lisp/Cryptanalysis/Rijndael/Translations.mac) which supports the following different translations:

Mix columns variants with translations using:

the boxes in the encryption direction

both encryption and decryption directions.

Box translations for the Sbox and field multiplications using:

Combining affine components

We have the standard AES which has the Sbox made up of the word (byte) level inversion of block elements, following by the application of an affine transformation.

On the other hand, the affine portion of the Sbox can be moved through the Shiftrows operation such that it can be combined with the MixColumns component, yielding a component which is entirely linear at the byte level, leaving only the nonlinear inversion and addition of the affine constant as part of the Sbox operation.

Splitting the two types of operation like this will hopefully yield Sbox translations which are more compact and more amenable to SAT solvers as the affine transformation is specifically designed to increase equation size, and make the Sbox harder to model.

Generalised small scale parameters (see Cryptology/Lisp/CryptoSystems/Rijndael/SmallScaleAdvancedEncryptionStandard.mac)

n_R: number of rows in the AES block (default 4, can be 1, 2 or 4).

n_C: number of columns in the AES block (default 4, can be 1, 2 or 4).

e: size of word field in bits in the AES (default 8, can be 1, 2 or 4 or 8)  in general we might consider arbitrary fields.

r: number of rounds (default 10, can be any positive integer).

DONE This has been started but needs to be extended significantly.
 Todo:
 Generate translation that allows multiple plaintext/ciphertext pairs

Given that a single plaintext/ciphertext pair (P,C), encrypted with AES using a key K, might not be enough, on it's own, to deduce K (as there may be some K' which performs the same mapping for this specific (P,C)), for genuine experiments and understanding of AES, translations allowing multiple plaintext/ciphertext pairs but sharing the key variables are needed.

A simple method here is to perform the translation multiple times, introducing distinct variables for each pair of plaintext/ciphertext, but using the same key variables in each translation. The union of all such translations is then the required result.
 Bug:
 DONE Bug not in the milestones

A bug must never be hidden in a testobjectfile!

Then likely no further actions are taken to remove the source of the failure.
 Bug:
 DONE (corrected statistics after updating minimum representations) Test failure

okltest_ncl_ss(ncl_ss)
ASSERT: Expression " 2652 = 6236 " does not evaluate to true.
ERROR: /home/csoliver/OKplatform/system_directories/aux/tests/maxima/OKlib/ComputerAlgebra/Cryptology/Lisp/Cryptanalysis/Rijndael/testobjects/Translations
Definition in file Translations.hpp.