GENESIM: Generalized ENESIM
GENESIM (mps_genesim) [HANSEN2016] is a generalized version of the ENESIM algorithm [GUARDIANO], in which the conditional distribution is computed from a finite set of conditional events.
In one extreme, the full conditional distribution is obtained by scanning the whole training image at each iteration, in which case GENESIM is identical to the ENESIM algorithm [GUARDIANO].
At the other extreme, the conditional distribution is constructed from only one conditional event. In this case GENESIM behaves similarly to the direct sampling algorithm [MARIETHOZ2010], with the practical difference that the local conditional distribution is actually computed and a realization is drawn from it. In the direct sampling algorithm the conditional distribution is never computed; instead, a new pixel value is taken from the first matching conditional event.
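The scanning scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not the mps_genesim implementation: the function name genesim_draw, the list-of-lists training image, and the (offset, value) template format are assumptions made for this example, and only perfect matches are accepted.

```python
import random

def genesim_draw(ti, template, n_max_count_cpdf=1, n_max_ite=-1, rng=None):
    """Draw one value from a conditional distribution built by scanning the TI.

    ti       : 2D list of categorical values (hypothetical training image)
    template : list of ((dy, dx), value) conditioning data relative to the center
    Returns (drawn value or None, dict of counts per value).
    """
    rng = rng or random.Random()
    ny, nx = len(ti), len(ti[0])
    # Random path through the training image, truncated to n_max_ite nodes
    path = [(iy, ix) for iy in range(ny) for ix in range(nx)]
    rng.shuffle(path)
    if n_max_ite >= 0:
        path = path[:n_max_ite]
    counts = {}
    for iy, ix in path:
        # A node matches when every conditioning value is found at its offset
        if all(
            0 <= iy + dy < ny and 0 <= ix + dx < nx and ti[iy + dy][ix + dx] == value
            for (dy, dx), value in template
        ):
            counts[ti[iy][ix]] = counts.get(ti[iy][ix], 0) + 1
            # Stop scanning once enough counts are collected (negative: no limit)
            if 0 < n_max_count_cpdf <= sum(counts.values()):
                break
    if not counts:
        return None, counts
    values = sorted(counts)
    drawn = rng.choices(values, weights=[counts[v] for v in values])[0]
    return drawn, counts

ti = [[0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1]]
left_is_zero = [((0, -1), 0)]
# ENESIM-like: scan everything; direct-sampling-like: stop at the first match
print(genesim_draw(ti, left_is_zero, n_max_count_cpdf=-1, rng=random.Random(0)))
print(genesim_draw(ti, left_is_zero, n_max_count_cpdf=1, rng=random.Random(0)))
```

With a single count the drawn value is simply the center value of the first match, which is why this setting resembles direct sampling.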
An example of a parameter file for mps_genesim:
Number of realizations # 1
Random Seed (0 `random` seed) # 0
Maximum number of counts for conditional pdf # 1
Max number of conditional point # 25
Max number of iterations # 10000
Distance Measure (0: discrete, 1: continious), maximum distance, power # 1 0 0
ColocateDimension # 0
Maximum Search Radius # 1000000
Simulation grid size X # 18
Simulation grid size Y # 16
Simulation grid size Z # 1
Simulation grid world/origin X # 0
Simulation grid world/origin Y # 0
Simulation grid world/origin Z # 0
Simulation grid grid cell size X # 1
Simulation grid grid cell size Y # 1
Simulation grid grid cell size Z # 1
Training image file (spaces not allowed) # ti.dat
Output folder (spaces in name not allowed) # .
Shuffle Simulation Grid path (2: preferential, 1: random, 0: sequential) # 2
Shuffle Training Image path (1 : random, 0 : sequential) # 1
HardData filename (same size as the simulation grid)# conditional.dat
HardData seach radius (world units) # 1
Softdata categories (separated by ;) # 0;1
Soft datafilenames (separated by ; only need (number_categories - 1) grids) # soft.dat
Number of threads (minimum 1, maximum 8 - depend on your CPU) # 1
Debug mode(2: write to file, 1: show preview, 0: show counters, -1: no ) # -2
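For scripting, the GENESIM-specific values can be pulled out of such a file by line position. The sketch below assumes that the value is whatever follows the '#' on each line and that the text before it is purely descriptive; read_par_values is a hypothetical helper written for this example, not part of MPSlib.

```python
def read_par_values(text):
    """Return the value after the first '#' on each non-empty line."""
    values = []
    for line in text.splitlines():
        if not line.strip():
            continue
        _, _, value = line.partition("#")
        values.append(value.strip())
    return values

# First six lines of the example parameter file above
par_text = """\
Number of realizations # 1
Random Seed (0 `random` seed) # 0
Maximum number of counts for conditional pdf # 1
Max number of conditional point # 25
Max number of iterations # 10000
Distance Measure (0: discrete, 1: continious), maximum distance, power # 1 0 0
"""

values = read_par_values(par_text)
n_max_count_cpdf = int(values[2])   # line 3
n_cond = int(values[3])             # line 4
n_max_ite = int(values[4])          # line 5
# line 6 holds three numbers separated by spaces
distance_measure, distance_max, distance_pow = (float(v) for v in values[5].split())
print(n_max_count_cpdf, n_cond, n_max_ite, distance_measure, distance_max, distance_pow)
```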
A description of the options that apply to all MPS algorithms can be seen here.
The following lines in the parameter files are specific to the GENESIM type algorithm:
line 3: Maximum number of counts for conditional pdf, n_max_count_cpdf
n_max_count_cpdf defines the maximum number of counts in the conditional distribution obtained from the training image. When n_max_count_cpdf has been reached, the scanning of the training image stops.
When n_max_count_cpdf<0, no limit on the number of counts is set.
line 4: Max number of conditional points, n_cond
A maximum of n_cond conditional data are considered at each iteration when inferring the conditional distribution from the training image.
line 5: Max number of iterations, n_max_ite
A maximum of n_max_ite iterations of searching through the training image are performed.
If n_max_ite<0, the full training image is scanned.
line 6: Distance measure, distance_measure, maximum distance, distance_max, and power, distance_pow
The distance_measure used:
1: Number of matching pixels (Discrete TI)
2: Euclidean distance (Continuous TI)
The maximum distance that will lead to accepting a conditional template match is set by distance_max. If not set, it defaults to distance_max=0, which means that a perfect match is searched for.
distance_pow=0 indicates no weighting. A higher value will favor the data values of conditional events closer to the center value.
line 8: Maximum search radius, max_search_radius
Only conditional data within a radius of max_search_radius are used as conditioning data.
line 7: colocate_dimension
For a 3D TI, this makes the order matter in the last dimension, which allows performing 2D co-simulation with conditional data in the third dimension.
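The role of distance_measure and distance_pow can be illustrated with a small sketch. The exact weighting used by mps_genesim is not stated above, so the inverse-distance weight 1/|lag|^distance_pow used here is an assumption, as are the function name template_distance and its arguments.

```python
import math

def template_distance(event, candidate, lags, distance_measure=1, distance_pow=0):
    """Distance between a conditional event and a candidate from the TI.

    event, candidate : values at the template nodes (same order as lags)
    lags             : (dy, dx) offsets from the center, center itself excluded
    """
    # Assumed weighting: nodes closer to the center get larger weights;
    # distance_pow=0 gives every node the same weight.
    weights = [1.0 / math.hypot(dy, dx) ** distance_pow for dy, dx in lags]
    total = sum(weights)
    if distance_measure == 1:
        # Discrete TI: weighted fraction of non-matching pixels
        return sum(w for w, a, b in zip(weights, event, candidate) if a != b) / total
    # Continuous TI: weighted Euclidean (root mean square) distance
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, event, candidate)) / total)

lags = [(0, 1), (1, 0), (0, 2)]
print(template_distance([0, 1, 1], [0, 1, 0], lags, distance_pow=0))
print(template_distance([0, 1, 1], [0, 1, 0], lags, distance_pow=1))
```

A candidate would be accepted when the returned distance is not larger than distance_max; with distance_max=0 only a perfect match qualifies. In this example the mismatch sits at the most distant node, so raising distance_pow lowers the distance.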
debug mode
When debug>1, a number of extra grids will be written to disk for each realization. If the training image used is called ‘ti.dat’, the following GSLIB files are written:
ti.dat_tg1_0.gslib: The distance between the conditional event and the corresponding best ‘match’ in the TI.
ti.dat_tg2_0.gslib: The number of matching counts for the conditional pdf.
ti.dat_tg3_0.gslib: The index in the TI of the best matching conditional event.
ti.dat_path_0.gslib: Index of the path in the simulation grid.
ENESIM
The classical ENESIM algorithm can be run by setting n_max_count_cpdf and n_max_ite to infinity (using -1):
Maximum number of counts for conditional pdf # -1
Max number of iterations # -1
In this case the full training image will be scanned at each iteration to establish a conditional probability density.
ENESIM leads to a very slow algorithm, but the full, most accurate conditional distribution is computed at each iteration. This can be useful when performing simulation conditional to soft data. If not, then the Direct Sampling algorithm (n_max_count_cpdf=1) is much more efficient.
GENESIM
In case 0<n_max_count_cpdf<infinity, mps_genesim will behave as an intermediate between ENESIM and Direct Sampling.
GENESIM is useful when the local conditional distribution is needed, as is the case when conditioning to soft data. In this case, GENESIM may be much faster than ENESIM.
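Why the local conditional distribution matters for soft data can be shown with a short sketch. It assumes that the distribution inferred from the training image and the soft-data probabilities are combined by a normalized product, which is a common choice but is not confirmed by the text above; combine_with_soft is a hypothetical helper.

```python
def combine_with_soft(counts, soft_prob):
    """Combine TI counts with soft-data probabilities (assumed normalized product)."""
    n = sum(counts.values())
    unnorm = {v: counts.get(v, 0) / n * p for v, p in soft_prob.items()}
    total = sum(unnorm.values())
    if total == 0:
        return None  # TI counts and soft data are incompatible
    return {v: p / total for v, p in unnorm.items()}

soft = {0: 0.2, 1: 0.8}
# Four counts: the soft data can shift the distribution towards category 1
print(combine_with_soft({0: 3, 1: 1}, soft))
# One count (n_max_count_cpdf=1): the soft data has no influence
print(combine_with_soft({0: 1}, soft))
```

With a single count the combined distribution is degenerate whatever the soft data says, which illustrates why more than one count is needed when the soft data is to be used in this way.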
DIRECT SAMPLING
In case n_max_count_cpdf=1, mps_genesim will behave similarly to the direct sampling algorithm. The computational efficiency can be further controlled using n_max_ite, which should be set to a value smaller than the number of pixels in the training image.
As the full local conditional distribution is not available (it is never computed/inferred), conditioning to soft data is done using the rejection sampler (Hansen et al. 20xx, submitted).
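The idea of a rejection sampler in this setting can be sketched generically. The following is not the scheme from the cited manuscript, whose details are not given here; it is a textbook rejection step in which each value proposed by a matching conditional event is accepted with a probability proportional to its soft-data probability. The name rejection_sample and its arguments are assumptions for this example.

```python
import random

def rejection_sample(candidates, soft_prob, rng=None):
    """Accept the first proposed value that passes the soft-data test.

    candidates : iterable of values proposed by successive matches in the TI
    soft_prob  : dict mapping each category to its soft-data probability
    Returns (accepted value or None, number of proposals tried).
    """
    rng = rng or random.Random()
    p_max = max(soft_prob.values())
    n_tries = 0
    for value in candidates:
        n_tries += 1
        # Accept with probability soft_prob[value] / p_max
        if rng.random() < soft_prob[value] / p_max:
            return value, n_tries
    return None, n_tries

# Soft data that rules out category 0: proposals of 0 are always rejected
print(rejection_sample([0, 0, 1, 0], {0: 0.0, 1: 1.0}, rng=random.Random(0)))
```

In practice the number of proposals would be bounded by n_max_ite, since every proposal requires finding a new match in the training image.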
Temporary Grids
If the verbose level is higher than one, five temporary grids are written to disk. If the training image is named ‘ti.dat’, the following grids are exported as EAS files:
ti.dat_tg1_0.gslib: The distance for the last accepted match, when scanning the training image.
ti.dat_tg2_0.gslib: The number of counts used to set up the conditional probability density. When using Direct Sampling, n_max_count_cpdf=1, this value should never be higher than 1.
ti.dat_tg3_0.gslib: The index of the position in the training image for last/best match.
ti.dat_tg4_0.gslib: The number of iterations in the training image.
ti.dat_tg5_0.gslib: Used number of conditional points.
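These grids can be inspected with a few lines of Python. The sketch below assumes the common EAS/GSLIB text layout (a title line, the number of columns, one name per column, then the data rows); read_eas is a hypothetical helper, and the file content shown is made up for illustration rather than taken from an actual run.

```python
def read_eas(text):
    """Parse EAS/GSLIB text into (title, column names, rows of floats)."""
    lines = text.strip().splitlines()
    title = lines[0]
    n_cols = int(lines[1].split()[0])
    names = [lines[2 + i].strip() for i in range(n_cols)]
    data = [[float(x) for x in line.split()] for line in lines[2 + n_cols:] if line.strip()]
    return title, names, data

# Made-up content standing in for ti.dat_tg2_0.gslib
example = """tg2 (hypothetical example)
1
counts
1
1
0
1
"""

title, names, data = read_eas(example)
counts = [row[0] for row in data]
# With n_max_count_cpdf=1 no node should have used more than one count
print(names, max(counts) <= 1)
```

For a file on disk, the text would be obtained with open('ti.dat_tg2_0.gslib').read() before calling read_eas.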