`vignettes/a02_theory_of_suit.Rmd`

To help interpret the results returned by the APIs, this article presents the methodology and theory that ALUES uses to compute suitability scores. In its simplest form, evaluating land suitability means mapping an input characteristic of the land unit to the suitability class of the target parameter or factor. This is done by checking which suitability class, if any, contains the input characteristic. Consider, for example, the following data:

`## Loading required package: Rcpp`

`BANANATerrain`

```
##       code s3_a s2_a s1_a s1_b s2_b s3_b  wts
## 1   Slope1  6.0    4    2 <NA> <NA> <NA> <NA>
## 2   Slope2 16.0    8    4 <NA> <NA> <NA> <NA>
## 3   Slope3 30.0   16    8 <NA> <NA> <NA> <NA>
## 4    Flood  2.5    2    1 <NA> <NA> <NA>    1
## 5 Drainage  4.0    3    2 <NA> <NA> <NA>    2
## 6   SlopeD  3.0    2    1 <NA> <NA> <NA>    1
```

If an input land unit has terrain with a slope of 1 degree, then
according to the `BANANATerrain` crop requirement, the land unit is
*highly suitable* (S1) for farming banana. In this example, the
suitability score is the 1 degree slope, since this is the statistic of
the land unit that is compared directly to the intervals of the
suitability classes (the columns: s1 - highly suitable, s2 - suitable,
s3 - marginally suitable) provided in `BANANATerrain`. Further, suppose
the input land unit is known to flood, so that its Flood factor equals 2
(i.e. short time according to the metric of the Flood factor). Then,
according to `BANANATerrain`, the land unit is not *highly suitable*
(S1) but rather *suitable* (S2). In this case, the suitability scores of
the land unit for factors SlopeD and Flood are 1 and 2, with
corresponding classes S1 and S2, respectively. These scores can be
further summarized into a single value known as the *overall
suitability score*, although doing so is not straightforward, because
the scores have different units or metrics: SlopeD is in degrees, so a
score of 1 means 1 degree, whereas Flood is in terms of time, so a score
of 2 means short time. Two different metrics cannot be combined directly
into one; this is where the concept of a membership function comes in.

The limits of each suitability class specified for each factor in any
crop requirement, for example `BANANATerrain`, form what is referred to
here as the *unstandardized suitability class intervals*. The term
*unstandardized* follows from the fact that the class intervals across
factors or parameters have different units, as emphasized earlier. It
would be convenient, therefore, to have a uniform or standardized unit
or metric across factors. In this article, these are referred to as the
*standardized suitability scores* and *standardized class intervals*.
For brevity and distinction, the *unstandardized suitability class
intervals* are from here on referred to as the *parameter class
intervals* or *parameter intervals*, since they are specified across
the parameters of any crop requirement.

The idea of a membership function is to standardize the parameter class
intervals into a *standardized suitability class*. For brevity, the
latter is simply referred to as the *suitability class* from here on.
The standardization is done by mapping the parameter intervals into the
unit interval, i.e. \(\mathbb{R}_{[0,1]}\). Definitions 1-3 give the
mathematical formulations of the concepts used in this article.

The membership function (MF) is used to standardize the scores and
the parameter intervals across factors. It is formally defined in
Definition 1 below. Several shapes of MF are possible; ALUES provides
three: *triangular*, *trapezoidal* and *Gaussian*. Each MF can take
either a *partial* or a *complete* form. For triangular, refer to
Definitions 4-6; for trapezoidal, refer to Definitions 7-9; and for
Gaussian, refer to Definition 10.

Referring back to `BANANATerrain`, the parameter intervals for the
suitability classes of SlopeD can be written explicitly as follows:
[min, 1) for S1; [1, 2) for S2; and [2, 3] for S3. This assignment is
based on the classification used by Yen et al. (2006). The
*not-suitable* (N) class is not indicated, since it is understood that
values greater than the upper limit of S3, or less than the lower limit
of S1 (if it exists), are assigned to class N. Given this ordering of
the crop’s parameter interval limits, the appropriate MF is the right
triangular MF (Fig. 1b). This follows from the fact that the
*most-suitable* (or *highly-suitable*) class S1 has interval limits
lower than those of the other suitability classes. Accordingly, the
crop’s parameter interval limits are arranged in ascending order on the
\(x\)-axis at points \(v_1\), \(v_2\) and \(v_3\), respectively, as shown in Fig.
1b.
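The interval lookup described above can be sketched in base R. This is only an illustration and not ALUES code: the helper name `classify_param` is hypothetical, and the lower limit `min` is assumed to be 0.

```r
# Hypothetical helper (not part of ALUES): assign a parameter value to a
# suitability class using the half-open intervals [v0, v1), [v1, v2), [v2, v3].
classify_param <- function(x, limits, labels = c("S1", "S2", "S3")) {
  # findInterval() returns 0 below the first limit and length(limits) above
  # the last one; rightmost.closed = TRUE makes the final interval closed.
  idx <- findInterval(x, limits, rightmost.closed = TRUE)
  out <- rep("N", length(x))               # anything outside the limits is N
  inside <- idx >= 1 & idx <= length(labels)
  out[inside] <- labels[idx[inside]]
  out
}

# SlopeD limits from BANANATerrain, with min assumed to be 0.
slope_d_limits <- c(0, 1, 2, 3)
classes <- classify_param(c(0.5, 1.5, 2.5, 3, 3.5), slope_d_limits)
print(classes)
# "S1" "S2" "S3" "S3" "N"
```

Values above the upper limit of S3 (here 3.5) fall into class N, as stated in the text.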

To complete the computation, the `min` and `max` limits, denoted by
\(v_0\) and \(v_p\) respectively (in this case, \(v_p=v_4\) since \(p=4\)), must be
specified. In ALUES, users can assign these values themselves based on
their expert opinion. Otherwise, the package sets
\(\mathrm{min}:=v_0=0\) and
\(\mathrm{max}:=v_p:=v_{p-1}+\gamma=v_3+\gamma\)
(\(\gamma\) is defined in Definition 4)
by default. As an example (for SlopeD), the max is computed as
follows: \[\begin{align}
\gamma :=&\;\frac{(v_2-v_1)+(v_3-v_2)}{2}\nonumber\\
=&\;\frac{(2-1)+(3-2)}{2} = 1,
\end{align}\] so that \[\begin{align}
\mathrm{max}:=&\;v_p:=v_3+\gamma\nonumber\\
=&\;3+\frac{(2-1)+(3-2)}{2}=4.
\end{align}\]
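The same arithmetic can be checked with a few lines of base R. This is a sketch of the default rule described above, not the package's internal code; the function name `default_limits` is hypothetical.

```r
# Hypothetical helper (not part of ALUES): default min and max limits
# from the interval limits v_1, ..., v_{p-1} of one factor.
default_limits <- function(v, min = 0) {
  gamma <- mean(diff(v))            # average width between adjacent limits
  list(min = min, max = v[length(v)] + gamma, gamma = gamma)
}

# SlopeD: v_1 = 1, v_2 = 2, v_3 = 3
slope_d <- default_limits(c(1, 2, 3))
print(slope_d$gamma)  # 1
print(slope_d$max)    # 4
```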

This section presents the complete definitions of the theory used in
the core algorithms of the package.

**Definition 1 (Membership Function)**. Let \(\mathscr{X}\subseteq \mathbb{R}\) and \(\mathscr{Y}\subseteq \mathbb{R}_{[0,1]}\),
then \(\mu:\mathscr{X}\rightarrow\mathscr{Y}\) is
a *membership function* (MF).

*Remark 1*. In the context of land evaluation, \(\mathscr{X}\) is the
space of the parameter values of the input land unit, and \(\mathscr{Y}\)
is the space of the suitability scores.

**Definition 2 (Class Intervals)**. Let \(u_i\in\mathbb{R},
\forall i\in\mathbb{N}_{[0,p-1]}\), then the partitions \([u_i,u_{i+1})\in\mathscr{U}\) are defined
as the *suitability class intervals*.

**Definition 3 (Parameter Intervals)**. Let \(v_i\in\mathbb{R}, \forall
i\in\mathbb{N}_{[0,p-1]}\), then \([v_i,v_{i+1})\in \mathscr{V}\) are defined
to be the *crop’s parameter intervals*.

*Remark 2*. \(v_i\) is the interval limit of the factor or parameter.
\(v_0\) and \(v_p\) are the minimum and maximum factor limits,
respectively; both need to be computed.

**Definition 4 (Left Triangular)**. Let \(x_{jk}\in\mathscr{X}\) be the \(j\)th land unit’s actual value for any
target factor \(k\), \(\forall j \in \mathbb{N}_{[1,n]}\) and
\(\forall k \in \mathbb{N}_{[1,m]}\),
and let \([v_{i},v_{i+1})\in\mathscr{V}\) be the
crop’s parameter intervals, \(\forall
i\in\mathbb{N}_{[0,p-1]}\), then the *lower* or *left
triangular* MF, herein notated as \(\mu_{\triangle_{\downarrow}}\), is defined
as follows: \[\begin{equation}
\mu_{\triangle_{\downarrow}}(x_{jk}):=
\begin{cases}
\displaystyle\frac{x_{jk}-\mathrm{min}}{\mathrm{max}-\mathrm{min}},&\mathrm{min}\leq
x_{jk}\leq\mathrm{max}\\
0,&\mathrm{otherwise}
\end{cases}
\end{equation}\] where \(\mathrm{min}:=
v_0:= v_1-\gamma\), \(\mathrm{max}:=
v_p:= v_{p-1}+\gamma\), and \(\gamma:=\frac{1}{p-2}\sum_{i=1}^{p-2}(v_{i+1}-v_{i})\).
*Remark 3*. ALUES sets the \(\mathrm{min}:=v_0=0\) for all MFs, unless
specified by the user explicitly.

**Definition 5 (Right Triangular)**. From Definition 4,
the *upper* or *right triangular* MF, herein notated as
\(\mu_{\triangle_{\uparrow}}\), is
defined as follows: \[\begin{equation}\label{eq:rtri}
\mu_{\triangle_{\uparrow}}(x_{jk}):=
\begin{cases}
\displaystyle\frac{\mathrm{max}-x_{jk}}{\mathrm{max}-\mathrm{min}},&\mathrm{min}\leq
x_{jk}\leq\mathrm{max}\\
0,&\mathrm{otherwise}
\end{cases}.
\end{equation}\]
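Definitions 4 and 5 can be transcribed into base R as follows. This is a minimal sketch for illustration only; the function names `tri_left` and `tri_right` are hypothetical and are not the ALUES API. The example uses the SlopeD limits computed earlier, min = 0 and max = 4.

```r
# Hypothetical transcriptions of Definitions 4 and 5 (not ALUES functions).
tri_left <- function(x, min, max) {
  # Score rises linearly from 0 at min to 1 at max; 0 outside [min, max].
  ifelse(x >= min & x <= max, (x - min) / (max - min), 0)
}

tri_right <- function(x, min, max) {
  # Score falls linearly from 1 at min to 0 at max; 0 outside [min, max].
  ifelse(x >= min & x <= max, (max - x) / (max - min), 0)
}

slope <- c(0, 1, 2, 3, 4, 5)
right_scores <- tri_right(slope, min = 0, max = 4)
left_scores  <- tri_left(slope, min = 0, max = 4)
print(right_scores)  # 1.00 0.75 0.50 0.25 0.00 0.00
print(left_scores)   # 0.00 0.25 0.50 0.75 1.00 0.00
```

For SlopeD, where lower values are more suitable, the right triangular MF gives a 1 degree slope a standardized score of 0.75.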

**Definition 6 (Full Triangular)**. From Definitions 4
and 5, the *full triangular* MF, herein notated as \(\mu_{\triangle}\), is defined as follows:
\[\begin{equation}
\mu_{\triangle}(x_{jk}):=
\begin{cases}
0,&x_{jk}< \mathrm{min}\\
\mu_{\triangle_{\downarrow}}(x_{jk}),&\mathrm{min}\leq
x_{jk}\leq\mathrm{m}\\
\mu_{\triangle_{\uparrow}}(x_{jk}),&\mathrm{m}<x_{jk}<\mathrm{max}\\
0,&x_{jk}\geq \mathrm{max}
\end{cases}
\end{equation}\] where \(\mathrm{m}:=
\frac{v_{i}^{*}+v_{i+1}^*}{2}\) such that \(v_i^*<\mathrm{m}<v_{i+1}^*\), and
\(v_i^{*}\) and \(v_{i+1}^*\) are the interval limits immediately
adjacent to m.

**Definition 7 (Left Trapezoidal)**. From Definition 4, the
*lower* or *left trapezoidal* MF, herein notated as \(\mu_{\bigtriangledown_{\downarrow}}\), is
defined as follows: \[\begin{equation}
\mu_{\bigtriangledown_{\downarrow}}(x_{jk}):=
\begin{cases}
\displaystyle\frac{x_{jk}-\mathrm{min}}{\mathrm{max}-\mathrm{min}},&\mathrm{min}\leq
x_{jk}\leq v_{p-1}\\
1,&v_{p-1}<x_{jk}\leq \mathrm{max}\\
0,&\mathrm{otherwise}
\end{cases},
\end{equation}\] where \(\mathrm{min},\mathrm{max}\) and \(\gamma\) are the same as in Definition
4.

**Definition 8 (Right Trapezoidal)**. From Definition 4,
the *upper* or *right trapezoidal* MF, herein notated as
\(\mu_{\bigtriangledown_{\uparrow}}\),
is defined as follows: \[\begin{equation}
\mu_{\bigtriangledown_{\uparrow}}(x_{jk}):=\begin{cases}
1,&\mathrm{min}\leq x_{jk}\leq v_1\\
\displaystyle\frac{\mathrm{max}-x_{jk}}{\mathrm{max}-\mathrm{min}},&v_1<x_{jk}\leq\mathrm{max}\\
0,&\mathrm{otherwise}
\end{cases}.
\end{equation}\]
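The two partial trapezoidal MFs in Definitions 7 and 8 can be sketched in base R in the same way. The names `trap_left` and `trap_right` and their argument names are hypothetical; the code is a literal transcription of the piecewise formulas, not the package implementation.

```r
# Hypothetical transcriptions of Definitions 7 and 8 (not ALUES functions).
trap_left <- function(x, min, max, v_last) {
  # v_last plays the role of v_{p-1}: linear rise up to v_last, then 1.
  ifelse(x >= min & x <= v_last, (x - min) / (max - min),
         ifelse(x > v_last & x <= max, 1, 0))
}

trap_right <- function(x, min, max, v_first) {
  # v_first plays the role of v_1: 1 up to v_first, then linear fall.
  ifelse(x >= min & x <= v_first, 1,
         ifelse(x > v_first & x <= max, (max - x) / (max - min), 0))
}

# SlopeD limits: min = 0, v_1 = 1, v_3 = 3, max = 4
right_trap <- trap_right(c(0.5, 1, 2, 4, 5), min = 0, max = 4, v_first = 1)
left_trap  <- trap_left(c(1, 3, 3.5, 4, 5), min = 0, max = 4, v_last = 3)
print(right_trap)  # 1.00 1.00 0.50 0.00 0.00
print(left_trap)   # 0.25 0.75 1.00 1.00 0.00
```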

**Definition 9 (Full Trapezoidal)**. From Definition 7
and 8, the *full trapezoidal* MF, herein notated as \(\mu_{\bigtriangledown}\), is defined as
follows: \[\begin{equation}
\mu_{\bigtriangledown}(x_{jk}):=
\begin{cases}
\mu_{\bigtriangledown_{\downarrow}}(x_{jk}),&\mathrm{min}\leq
x_{jk}\leq v_i^*\\
1,&v_i^*<x_{jk}\leq v_{i+1}^*\\
\mu_{\bigtriangledown_{\uparrow}}(x_{jk}),&v_{i+1}^*<x_{jk}\leq\mathrm{max}\\
0,&\mathrm{otherwise}
\end{cases},
\end{equation}\] where \(v_i^*\)
and \(v_{i+1}^*\) are defined in
Definition 6.

**Definition 10 (Gaussian MF)**. From Definition 4, the
*full Gaussian* MF, herein notated as \(\mu_{\curlywedge}\), is defined as follows:
\[\begin{equation}
\mu_{\curlywedge}(x_{jk}):=\exp\left[-\frac{(x_{jk}-\alpha)^2}{2\sigma^2}\right],
\end{equation}\] where \(\alpha\in(-\infty,\infty)\) and \(\sigma\in(0,\infty)\).

*Remark 4*. For the partial Gaussian MF, the adjustment
is done through the location hyperparameter. In particular, if \(\alpha=\mathrm{min}\), then the model is
a right Gaussian function, whereas if \(\alpha=\mathrm{max}\), the model is
a left Gaussian function.
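Definition 10 and Remark 4 can be illustrated with a short base R sketch. The function name `gauss_mf` is hypothetical, and the value of \(\sigma\) used here is an arbitrary assumption chosen only for the example; the text does not state how it should be chosen.

```r
# Hypothetical transcription of Definition 10 (not an ALUES function).
gauss_mf <- function(x, alpha, sigma) {
  stopifnot(sigma > 0)                      # sigma must be strictly positive
  exp(-(x - alpha)^2 / (2 * sigma^2))
}

slope <- c(0, 1, 2, 3, 4)

# Right Gaussian: the peak sits at min (alpha = 0), so scores decrease with x.
right_gauss <- gauss_mf(slope, alpha = 0, sigma = 1)

# Left Gaussian: the peak sits at max (alpha = 4), so scores increase with x.
left_gauss <- gauss_mf(slope, alpha = 4, sigma = 1)

print(round(right_gauss, 4))
print(round(left_gauss, 4))
```

With the same \(\sigma\), the left Gaussian scores are the right Gaussian scores in reverse order, because the two curves mirror each other about the midpoint of min and max.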

**Definition 11 (Overall Suitability)**. Let \(y_{jk}\in\mathscr{Y}\) be the \(j\)th land unit’s suitability score for any
target factor \(k\), \(\forall j\in \mathbb{N}_{[1,n]}\) and \(\forall k\in\mathbb{N}_{[1,m]}\); and let
\(w_{k}\in\mathbb{N}_{[1,3]}\) be the
weight of the \(k\)th factor; then,
\(\mathbf{y}_{j}:=[y_{j1},\cdots,y_{jm}]^{\text{T}}\in\mathbb{R}^m\)
is the vector of suitability scores of all target factors, and \(\mathbf{w}:=[w_1,\cdots,w_m]^{\text{T}}\in\mathbb{N}^m\)
is the corresponding weight vector. The *overall suitability using
average aggregation*, herein notated as \(\bar{\mu}\), of a given land unit is
computed as follows: \[\begin{equation}\label{eq:overallsuit}
\bar{\mu}(\mathbf{y}_j|\mathbf{w}):=
\mathbf{y}_j^{\mathrm{T}}\lambda(\mathbf{w})=\sum_{\forall
k}y_{jk}*\lambda (w_k),
\end{equation}\] where \(\lambda(w_k):=
\frac{\eta-w_k}{\delta}, \eta:=\sum_{\forall k} w_k\) and \(\delta:=\sum_{\forall k}(\eta - w_k)\). The
*minimum* (notated as \(\tilde{\mu}\)) and *maximum*
(notated as \(\hat{\mu}\)) aggregation
functions are defined as follows: \[\begin{equation}
\tilde{\mu}(\mathbf{y}_j):=\min(\{y_{j1}, \cdots,y_{jm}\}),
\end{equation}\] and \[\begin{equation}
\hat{\mu}(\mathbf{y}_j):=\max(\{y_{j1}, \cdots,y_{jm}\}).
\end{equation}\]
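The three aggregation rules in Definition 11 can be sketched in base R as follows. The function name `overall_mean` is hypothetical, and the scores and weights are made-up values used only to show the arithmetic; this is not the package's own code.

```r
# Hypothetical transcription of the average aggregation in Definition 11.
overall_mean <- function(y, w) {
  eta    <- sum(w)                 # eta: sum of all weights
  delta  <- sum(eta - w)           # delta: normalizing constant
  lambda <- (eta - w) / delta      # lambda(w_k); these sum to 1
  sum(y * lambda)
}

# Made-up standardized scores for two factors, with weights 1 and 2.
y <- c(0.75, 0.50)
w <- c(1, 2)

avg_score <- overall_mean(y, w)    # 0.75 * 2/3 + 0.50 * 1/3
min_score <- min(y)                # minimum aggregation
max_score <- max(y)                # maximum aggregation
print(c(average = avg_score, minimum = min_score, maximum = max_score))
```

Note that \(\lambda\) gives a larger share to factors with a smaller \(w_k\), and that with a single factor \(\delta\) equals 0, so the average rule as written needs at least two factors.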