--- title: "Handbook for the Stochastic Production model in Continuous Time (SPiCT)" author: "Martin W. Pedersen, Alexandros Kokkalis, Tobias K. Mildenberger, Casper W. Berg" date: "`r format(Sys.time(), '%d %B, %Y')`" output: rmarkdown::html_vignette: number_sections: yes toc: yes pdf_document: number_sections: false toc: yes keep_tex: false bibliography: bibliography.bib vignette: > %\VignetteIndexEntry{SPiCT Handbook} \usepackage[utf8]{inputenc} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r, echo = FALSE} knitr::opts_chunk$set(collapse = TRUE, comment = " ", fig.align = 'center', cache=FALSE) ``` # Basic functionality ## Getting started This vignette explains basic and more advanced functions of the `spict` package. The package is installed from gihtub using the `devtools` package: ```{r, eval=FALSE} devtools::install_github("DTUAqua/spict/spict") ``` installs the stable version of `spict`. When loading the package the user is greeted and the installed version is shown: ```{r} library(spict) ``` The printed version follows the format ver\@SHA, where _ver_ is the spict version number and _SHA_ is a unique github commit on [github](https://github.com/DTUAqua/spict). The content of this vignette pertains to the version printed above that can be found [here](`r paste0("https://github.com/DTUAqua/spict/tree/v", packageVersion("spict"))`). A specific version of `spict` can be installed using the following: ``` devtools::install_github("DTUAqua/spict/spict", ref = "v`r packageVersion("spict")`") ``` ## Loading built-in example data The package contains the catch and index data analysed in @polacheck1993 that can be loaded using ```{r} data(pol) ``` Data on three stocks are contained in this dataset: South Atlantic albacore, northern Namibian hake, and New Zealand rock lobster. Here focus will be on the South Atlantic albacore data. This dataset contains the following ```{r} pol$albacore ``` Note that data are structured as a list containing the entries `obsC` (catch observations), `timeC` (time of catch observations), `obsI` (index observations), and `timeI` (time of index observations). If times are not specified it is assumed that the first observation is observed at time 1 and then sequentially onward with a time step of one year. It is therefore recommended to always specify observation times. Each catch observation relates to a time interval. This is specified using `dtc`. If `dtc` is left unspecified (as is the case here) each catch observation is assumed to cover the time interval until the next catch observation. For this example with annual catches `dtc` therefore is ```{r} inp <- check.inp(pol$albacore) inp$dtc ``` It is important to specify `dtc` if the default assumption is not fulfilled. Note that the index observations for all data sets analysed in @polacheck1993 represent commercial CPUE data and the respective times (`inp$timeI`) should thus correspond to the middle of the year, e.g. 1988.50, rather than the beginning of the year, e.g. 1988.00. The 'incorrect' times of the commercial CPUE observations were kept only for reasons of consistency regarding previous studies using models which were not implemented in 'continuous time'. We emphasise the importance of correct times of the index observations, which should most accurately correspond to the timing of the survey or to the middle of the year in case of commercial CPUE. ## Plotting data \label{pldat} The data can be plotted using the command ```{r, fig.width=5, fig.height=5.5, out.width='0.5\\textwidth', fig.show='hold'} plotspict.data(pol$albacore) ``` Note that the number of catch and index observations are given in the respective plot headers. Furthermore, the color of individual points shows when the observation was made and the corresponding colors are shown in the color legend in the top right corner. For illustrative purposes let's try shifting the biomass index data a bit ```{r, fig.width=5, fig.height=5.5, out.width='0.5\\textwidth', fig.show='hold'} inpshift <- pol$albacore inpshift$timeC <- inpshift$timeC inpshift$timeI <- inpshift$timeI + 0.8 plotspict.data(inpshift) ``` Now the colours show that the index takes place in autumn. ## Advanced data plotting \label{advpldat} There is also a more advanced function for plotting data, which at the same time does some basic model fitting (linear regression) and shows the results ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} plotspict.ci(pol$albacore) ``` The two top plots come from `plotspict.data`, with the dashed horizontal line representing a guess of MSY. This guess comes from a linear regression between the index and the catch divided by the index (middle row, left). This regression is expected to have a negative slope. A similar plot can be made showing catch versus catch/index (middle row, right) to approximately find the optimal effort (or effort proxy). The proportional increase in the index as a function of catch (bottom row, right) should show primarily positive increases in index at low catches and vice versa. Positive increases in index at large catches could indicate model violations. In the current plot these are not seen. ## Fitting the model The model is fitted to data by running ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} res <- fit.spict(pol$albacore) ``` Here the call to `fit.spict` is wrapped in the `system.time` command to check the time spent on the calculations. This is obviously not required, but done here to show that fitting the model only takes a few seconds. The result of the model fit is stored in `res`, which can either be plotted using `plot` or summarised using `summary`. The results are returned as a list that contains output as well as input. The content of this list is ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} names(res) ``` Many of these variables are generated by `TMB::sdreport()`. In addition to these `spict` includes the list of input values (`inp`), the object used for fitting (`obj`), the result from the optimiser (`opt`), the time spent on fitting the model (`computing.time`), and more less useful variables. ## Interpreting summary of results The results are summarised using ```{r, eval=FALSE} summary(res) ``` ```{r, echo=FALSE,results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} capture.output(summary(res)) ``` Here is a line-by-line description of the summary: - Line 1: Convergence of the model fit, which has code 0 if the fit was succesful. If this is not the case convergence was not obtained and reported results should not be used. In case of non-convergence results will still be reported to aid diagnosis of the problem. - Line 2: Objective function value at the optimum. The objective function is the likelihood function if priors are not used and the posterior density function if priors are used. - Line 3: The Euler time step used in the calculation. - Line 4: Number of observations for the time series used. - Line 6-9: Summary of the priors used in the fit. The priors shown here are the default priors that are applied when priors are unspecified. These are relatively uninformative and are applied because most data-limited situations do not allow simultaneous estimation of all noise parameters and `logn`. The default priors can be disabled (see the section on priors). - Line 11-25: Summary of the parameter estimates and their 95% CIs. These can be extracted as a data frame with `sumspict.parest(res)`. - Line 27-31: Estimates of deterministic reference points with 95% CIs. These are the reference points one would derive if stochasticity were ignored. Can be extracted with `sumspict.drefpoints(res)`. - Line 32-36: Estimates of stochastic reference points with 95% CIs. These are the reference points of the stochastic model. The column ´rel.diff.Drp´ shows the relative difference when compared to the deterministic reference points. The information can be extracted with `sumspict.srefpoints(res)`. - Line 38-43: State estimates in the final year where data were available. The states of the model are biomass (`B`) and fishing mortality (`F`) with the year of the estimates appended. The year is shown as a decimal number as estimates within year are possible. Both absolute (`B` and `F`) and relative estimates (`B/Bmsy` and `F/Fmsy`) are shown. The relative estimates are calculated using the type of reference points given by `msytype` (line 38), where `s` is stochastic and `d` is deterministic. Here `msytype` is ´s´. This information can be extracted using `sumspict.states(res)`. - Line 45-52: Predictions of absolute and relative biomass and fishing mortality at the time indicated by `inp$timepredi`, here 1990 (line 47-50). In addition, predicted catch at the time indicated by `inp$timepredc` (line 51). Finally, the equilibrium biomass, indicated by E(B_inf), if current conditions remain constant. There predictions or forecasts are calculated under the fishing scenario given by `inp$ffac`. See the section on forecasting for more information. The prediction summary can be extracted using `sumspict.predictions(res)`. ## Interpreting plots of results `spict` comes with several plotting abilities. The basic plotting of the results is done using the generic function `plot` that produces a multipanel plot with the most important outputs. ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} plot(res) ``` Some general comments can be made regarding the style and colours of these plots: - Estimates (biomass, fishing mortality, catch, production) are shown using blue lines. - 95% CIs of absolute quantities are shown using dashed blue lines. - 95% CIs of relative biomass and fishing mortality are shown using shaded blue regions. - Estimates of reference points ($B_{MSY}$, $F_{MSY}$, $MSY$) are shown using black lines. - 95% CIs of reference points are shown using grey shaded regions. - The end of the data range is shown using a vertical grey line. - Predictions beyond the data range are shown using dotted blue lines. - Data are shown using points colored by season. Different index series use different point characters (not shown here). The individual plots can be plotted separately using the `plotspict.*` family of plotting functions; all functions are summarised in Table 1 and their common arguments that control their look in Table 2: ```{r, echo = FALSE} knitr::kable(data.frame(## ls(pattern = "plotspict.*", envir = asNamespace("spict")), Function = c("**Data**", "`plotspict.ci`", "`plotspict.data`", "**Estimates**", "`plotspict.bbmsy`", "`plotspict.biomass`", "`plotspict.btrend`", "`plotspict.catch`", "`plotspict.f`", "`plotspict.fb`", "`plotspict.ffmsy`", "`plotspict.priors`", "`plotspict.production`", "`plotspict.season`", "**Diagnostics & extras**", "`plotspict.diagnostic`", "`plotspict.osar`", "`plotspict.likprof`", "`plotspict.retro`", "`plotspict.retro.fixed`", "`plotspict.infl`", "`plotspict.inflsum`", "`plotspict.tc`"), Plot = c("", "Basic data plotting (see section \\ref{pldat})", "Advanced data plotting (see section \\ref{advpldat})", "", "Relative biomass $B/B_{MSY}$ estimates with uncertainty", "Absolute (and relative) biomass estimates with uncertainty", "Expected biomass trend", "Catch data and estimates", "Absolute (and relative) fishing mortality $F$", "Kobe plot of relative fishing mortality over biomass estimates", "Relative fishing mortality $F/F_{MSY}$", "Prior-posterior distribution of all parameters that are estimated using priors", "Production over $B/K$", "Seasonal pattern of fishing mortality $F$", "", "OSA residual analysis to evaluate the fit", "One-step-ahead residual plots, one for data time-series", "Profile likelihood of one or two parameters", "Retrospective analysis", "Retrospective analysis of fixed effects", "Influence statistics of observations", "Summary of influence of observations", "Time to $B_{MSY}$ under different scenarios about $F$" )), caption = "Available plotting functions.") ``` Argument Value Result -------- ----- ------ `logax` logical If `TRUE`, the y-axis is in log scale `main` string The title of the plot `ylim` numeric vector The limits of the y-axis `plot.obs` logical If `TRUE` (default) the observations are shown `qlegend` logical If `TRUE` (default) the color legend is shown `xlab`, `ylab` string The x and y axes labels `stamp` string Adds a "stamp" at the bottom right corner of the plotting area Default is the version and SHA hash of `spict`. An empty string removes the stamp. Table: Common arguments in the `plotspict.*` family of funtions We will now look at them one at a time. The top left is the plot of absolute biomass ```{r plotspict.biomass, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} plotspict.biomass(res) ``` Note that this plot has a y-axis on the right side related to the relative biomass ($B_t/B_{MSY}$). The shaded 95% CI region relates to this axis, while the dashed blue lines relate to the left y-axis indicating absolute levels. The dashed lines and the shaded region are shown on the same plot to make it easier to assess whether the relative or absolute levels are most accurately estimated. Here, the absolute are more accurate than the relative. Later, we will see examples of the opposite. The horizontal black line is the estimate of $B_{MSY}$ with 95% CI shown as a grey region. The plot of the relative biomass is produced using ```{r plotspict.bbmsy, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} plotspict.bbmsy(res) ``` This plot contains much of the same information as given by `plotspict.biomass`, but without the information about absolute biomass and without the 95% CI around the $B_{MSY}$ reference point. The plots of fishing mortality follow the same principles ```{r plotspict.f, results='show', message=FALSE, warning=FALSE, fig.width=3, fig.height=3.3, fig.show='hold'} plotspict.f(res, main='', qlegend=FALSE, rel.axes=FALSE, rel.ci=FALSE) plotspict.ffmsy(res, main='', qlegend=FALSE) ``` The estimate of $F_{MSY}$ is shown with a horizontal black line with 95% CI shown as a grey region (left plot). The 95% CI of $F_{MSY}$ is very wide in this case. As shown here it is quite straightforward to remove the information about relative levels from the plot of absolute fishing mortality. Furthermore, the argument `main=''` removes the heading and `qlegend=FALSE` removes the colour legend for data points. The plot of the catch is produced using ```{r plotspict.catch, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} plotspict.catch(res) ``` This plot shows estimated catches (blue line) versus observed catches (points) with the estimate of $MSY$ plotted as a horizontal black line with its 95% CI given by the grey region. A phase plot (or kobe plot) of fishing mortality versus biomass is plotted using ```{r plotspict.fb, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} plotspict.fb(res, ylim=c(0, 1.3), xlim=c(0, 300)) ``` The plot shows the development of biomass and fishing mortality since the initial year (here 1967) indicated with a circle until the terminal year (here 1990) indicated with a square. The yellow diamond indicates the mean biomass over a long period if the current (1990) fishing pressure remains. This point can be interpreted as the fished equilibrium and is denoted $E(B_\infty)$ in the legend as a statistical way of expressing the expectation of the biomass as $t \rightarrow \infty$. As the current fishing mortality is close to $F_{MSY}$ the expected long term biomass is close to $B_{MSY}$. A vertical dashed red line at $B_t = 0$ indicates the biomass level below which the stock has crashed. The grey shaded banana-shaped area indicates the 95\% confidence region of the pair $F_{MSY}$, $B_{MSY}$. This region is important to visualise jointly as the two reference points are highly (negatively) correlated. ## Residuals and diagnostics Before proceeding with the results for an actual assessment it is very important that the model residuals are checked and possible model deficiencies identified. Residuals can be calculated using `calc.osa.resid()`. OSA stands for one-step-ahead, which are the proper residuals for state-space models. More information about OSA residuals is contained in @spict. To calculate and plot residuals and diagnostics do ```{r plotspict.diagnostic, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=7, fig.show='hold'} res <- calc.osa.resid(res) plotspict.diagnostic(res) ``` The first column of the plot contains information related to catch data and the second column contains information related to the index data. The rows contain 1. Log of the input data series. 2. OSA residuals with the p-value of a test for bias (i.e. that the mean of the residuals is different from zero) in the plot header. If the header is green the test was not significant, otherwise the header would be red. 3. Empirical autocorrelation of the residuals. Two tests for significant autocorrelation is performed. A Ljung-Box simultaneous test of multiple lags (here 4) with p-value shown in the header, and tests for individual lags shown by dashed horizontal lines in the plot. Here no violation is identified. 4. Tests for normality of the residuals both as a QQ-plot and with a Shapiro test with p-value shown in the plot header. This data did not have any significant violations of the assumptions, which increases confidence in the results. For a discussion of possible violations and remedies the reader is referred to @spict. ### Process residuals The function `calc.process.resid()` calculates and adds the process residuals as a `data.frame`. It takes the fitted object and `dt` as arguments. `dt` has by default the same resolution as the input data, i.e. `1/inp$nseasons`. Function `plotspict.diagnostic.process()` plots the process residuals for the the biomass and fishing mortality processes. The plot is similar to that produced by `plotspict.diagnostic`, see above for the description. ```{r plotspict.diagnostic.process, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=7, fig.show='hold'} res <- calc.process.resid(res) plotspict.diagnostic.process(res) ``` ## Extracting parameter estimates To extract an estimated quantity, here `logBmsy` use ```{r get.par, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} get.par('logBmsy', res) ``` This returns a vector with `ll` being the lower 95% limit of the CI, `est` being the estimated value, `ul` being the upper 95% limit of the CI, `sd` being the standard deviation of the estimate, and `cv` being the coefficient of variation of the estimate. The estimated quantity can also be returned on the natural scale (as opposed to log scale) by running ```{r get.par2, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} get.par('logBmsy', res, exp=TRUE) ``` This essentially takes the exponential of `ll`, `est` and `ul` of the values in log, while `sd` is unchanged as it is the standard deviation of the quantity on the scale that it is estimated (here log). When transforming using `exp=TRUE` the $CV = \sqrt{e^{\sigma^2}-1}$. Most parameters are log-transformed under estimation and should therefore be extracted using `exp=TRUE`. For a standard fit (not using robust observation error, seasonality etc.), the quantities that can be extracted using this method are ```{r list.quantities, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} list.quantities(res) ``` These should be relatively self-explanatory when knowing that reference points ending with `s` are stochastic and those ending with `d` are deterministic, quantities ending with `p` are predictions and quantities ending with `l` are estimates in the final year. If a quantity is available both on natural and log scale, it is preferred to transform the quantity from log as most quantities are estimated on the log scale. ### Extracting correlation between parameter estimates The covariance between the model parameters (fixed effects) can be extracted from the results list ```{r cov.fixed, results='show', message=FALSE, warning=FALSE} res$cov.fixed ``` It is however easier to interpret the correlation rather than covariance. The correlation matrix can be calculated using ```{r cor.fixed, results='show', message=FALSE, warning=FALSE} cov2cor(res$cov.fixed) ``` For this data most parameters are well separated, i.e. relatively low correlation, perhaps with the exception of `logm` and `logn`, which have a correlation of $-0.9$. Note that `logr` is absent from the covariance matrix. This is because the model is parameterised in terms of `logm`, `logK`, and `logn` from which `logr` can be derived. The estimate of `logr` is reported using TMB's `sdreport()` function and can be extracted using `get.par()`. The covariance between random effects (biomass and fishing mortality) is not reported automatically, but can be obtained by setting `inp$getJointPrecision` to `TRUE` (this entails longer computation time and memory requirement). The covariance between sdreported values (i.e. the values reported in `res$value`) are given in `res$cov`. As this matrix is typically large, the function `get.cov()` can be used to extract the covariance between two scalar quantities ```{r cor.bf, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} cov2cor(get.cov(res, 'logBmsy', 'logFmsy')) ``` This reveals that for this data set the estimates of log Fmsy and log Bmsy are highly correlated. This is often the case and the reason why the model is reparameterised. ## Comparison plots The function `plotspict.compare()` can be used to make comparison plots of one or more spict runs. The function is able to show biomass and fishing mortality (absolute and relative), catch, and production curve. You can use the `varname` argument to select which of those plots to produce, default is all of them (`c("B", "F", "C", "BBmsy", "FFmsy", "P")`). As an example we show below the comparison of two runs with and without logbkfrac prior. ```{r plotspict.compare, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=7, fig.show='hold' } data(pol) inp <- pol$hake rep1 <- fit.spict(inp) inp$priors$logbkfrac <- c(log(0.3), 0.1, 1) rep2 <- fit.spict(inp) plotspict.compare(list(def = rep1, bkprior = rep2)) ``` # Advanced functionality ## Retrospective plots Retrospective plots are sometimes used to evaluate the robustness of the model fit to the introduction of new data, i.e. to check whether the fit changes substantially when new data becomes available. Such calculations and plotting thereof can be performed using `retro()` as shown here ```{r plotspict.retro, results='show', message=FALSE, warning=FALSE, fig.width=7, fig.height=5, fig.show='hold'} ## res <- fit.spict(pol$albacore) res <- retro(res, nretroyear = 4, mc.cores = 1) plotspict.retro(res) ``` By default `retro` creates 5 scenarios with catch and index time series which are shortened by the 1 to 5 last observations. The number of scenarios and thus observations which are removed can be changed with the argument `nretroyear` in the function `retro`. The graphs show the different peels with different colors. For the relative biomass and fishing mortality, the Mohn's rho are shown inside the corresponding plots. Mohn's rho for other quantities can be calculated using the function `mohns_rho`. ```{r mohns_rho, results='show', message=FALSE, warning=FALSE} ## res <- fit.spict(pol$albacore) ## res <- retro(res) mohns_rho(res, what = c("FFmsy", "BBmsy")) ``` The function `plotspict.retro.fixed` makes a plot of the paramter estimates, with corresponding 95\% confidence intervals for each of the retrospective runs. ```{r plotspict.retro.fixed, results='show', message=FALSE, warning=FALSE} ## res <- fit.spict(pol$albacore) ## res <- retro(res) plotspict.retro.fixed(res) ``` ## Estimation using two or more biomass indices The estimation can be done using more than one biomass index, for example when scientific surveys are performed more than once every year or when there are both commercial and survey CPUE time-series available. The following example emulates a situation where a long but noisy first quarter index series and a shorter and less noisy second quarter index series are available with different catchabilities ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6, fig.height=5, fig.show='hold'} set.seed(123) inp <- list(timeC=pol$albacore$timeC, obsC=pol$albacore$obsC) inp$timeI <- list(pol$albacore$timeI, pol$albacore$timeI[10:23]+0.25) inp$obsI <- list() inp$obsI[[1]] <- pol$albacore$obsI * exp(rnorm(23, sd=0.1)) # Index 1 inp$obsI[[2]] <- 10*pol$albacore$obsI[10:23] # Index 2 res <- fit.spict(inp) sumspict.parest(res) plotspict.biomass(res) ``` The model estimates seperate observation noises and finds that the first index (`sdi1`) is more noisy than the second (`sdi2`). It is furthermore estimated that the catchabilities are different by a factor 10 (`q1` versus `q2`). The biomass plot shows both indices, with circles indicating the first index and squares indicating the second index (the two series can also be distinguished by their colours). It is possible to force the model to assume that two or more index time-series have the same level of observation noise (CV). For example, to assume that `sdi1` equals `sdi2` one must add `inp$mapsdi <- c(1,1)` before calling `fit.spict(inp)`. The length of `mapsdi` should equal the number of indices. In case of 3 index series one could for example use `inp$mapsdi <- c(1, 1, 2)` to have series 1 and 2 share sdi and have a separate sdi for series 3. ## Using effort data instead of commercial CPUE It is possible to use effort data directly in the model instead of calculating commercial CPUE and inputting this as an index. It is beyond the scope of this vignette to discuss all problems associated with indices based on commercial CPUEs, however it is intuitively clear that using the same information twice (catch as catch and catch in catch/effort) induces a correlation, which the model does not account for. These problems are easily avoided by putting catch and effort seperately ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6, fig.height=5, fig.show='hold'} inpeff <- list(timeC=pol$albacore$timeC, obsC=pol$albacore$obsC, timeE=pol$albacore$timeC, obsE=pol$albacore$obsC/pol$albacore$obsI) repeff <- fit.spict(inpeff) sumspict.parest(repeff) par(mfrow=c(2, 2)) plotspict.bbmsy(repeff) plotspict.ffmsy(repeff, qlegend=FALSE) plotspict.catch(repeff, qlegend=FALSE) plotspict.fb(repeff) ``` Here the model runs without an index of biomass and instead uses effort as an index of fishing mortality Note that index observations are missing from the biomass plot, but effort observations are present in the plot of fishing mortality. Note also that `q` is missing from the summary of parameter estimates and instead `qf` is present, which is the commercial catchability. Overall for this data set the results in terms of stock status etc. do not change much, and this will probably often be the case, however using effort data directly instead of commercial CPUE is cleaner and avoids inputting the same data twice. ## Scaling the uncertainty of individual data points It is not always appropriate to assume that the observation noise of a data series is constant in time. Knowledge that certain data points are more uncertain than others can be implemented using `stdevfacC`, `stdevfacI`, and `stdevfacE`, which are vectors containing factors that are multiplied onto the standard deviation of the data points of the corresponding observation vectors. An example where the first 10 years of the biomass index are considered uncertain relative to the remaining time series and therefore are scaled by a factor 5. ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=6.5, fig.show='hold'} inp <- pol$albacore res1 <- fit.spict(inp) inp$stdevfacC <- rep(1, length(inp$obsC)) inp$stdevfacC[1:10] <- 5 res2 <- fit.spict(inp) par(mfrow=c(2, 1)) plotspict.catch(res1, main='No scaling') plotspict.catch(res2, main='With scaling', qlegend=FALSE) ``` From the plot it is noted that the scaling factor widens the 95% CIs of the initial ten years of catch data, while narrowing the 95% CIs of the remaining years. ## Simulating data The package has built-in functionality for simulating data, which is useful for testing. ### Annual data Data are simulated using an input list, e.g. `inp`, containing parameter values specified in `inp$ini`. To simulate data using default parameters run ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} inp <- check.inp(pol$albacore) sim <- sim.spict(inp) plotspict.data(sim) ``` This will generate catch and index data of same length as the input catch and index time series (here 23 of each) at the time points of the input data. Note when plotting simulated data, the true biomass and fishing mortality are also included in the plot. Another simple example is ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} inp <- list(ini=list(logK=log(100), logm=log(10), logq=log(1))) sim <- sim.spict(inp, nobs=50) plotspict.data(sim) ``` Here the required parameters are specified (the rest use default values), and the number of observations is specified as an argument to `sim.spict()`. A more customised example including model fitting is ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6.5, fig.height=5.5, fig.show='hold'} set.seed(31415926) inp <- list(ini=list(logK=log(100), logm=log(10), logq=log(1), logbkfrac=log(1), logF0=log(0.3), logsdc=log(0.1), logsdf=log(0.3))) sim <- sim.spict(inp, nobs=30) res <- fit.spict(sim) sumspict.parest(res) par(mfrow=c(2, 2)) plotspict.biomass(res) plotspict.f(res, qlegend=FALSE) plotspict.catch(res, qlegend=FALSE) plotspict.fb(res) ``` Here the ratio between biomass in the initial year relative to `K` is set using `logbkfrac`, the initial fishing mortality is set using `logF0`, process noise of `F` is set using `logsdf`, and finally observation noise on catches is specified using `logsdc`. When printing the summary of the parameter estimates the true values are included as well as a check whether the true value was inside the 95% CIs. Similarly, the true biomass, fishing mortality, and reference points are included in the results plot using a yellow/orange colour. ### Seasonal data \label{sec:seas} It is possible to simulate seasonal data (most often quarterly). Additional variables must be specified in the input list that define the type of seasonality to be used. Spline based seasonality is shown first (`inp$seasontype = 1`). This is the default and therefore does not need to be explicitly specified. It is required that number of seasons is specified using `nseasons` (4 indicates quarterly), the order of the spline must be specified using `splineorder` (3 for quarterly data), time vectors for catch and index containing subannual time points must be specified, and finally the spline parameters (`logphi`) must be set. With four seasons `logphi` must be a vector of length 3, where each value in the vector gives the log fishing intensity relative to level in season four, which is `log(1)`. An example of simulating seasonal data using a spline is ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6, fig.height=5, fig.show='hold'} set.seed(1234) inp <- list(nseasons=4, splineorder=3) inp$timeC <- seq(0, 30-1/inp$nseasons, by=1/inp$nseasons) inp$timeI <- seq(0, 30-1/inp$nseasons, by=1/inp$nseasons) inp$ini <- list(logK=log(100), logm=log(20), logq=log(1), logbkfrac=log(1), logsdf=log(0.4), logF0=log(0.5), logphi=log(c(0.05, 0.1, 1.8))) seasonsim <- sim.spict(inp) plotspict.data(seasonsim) ``` The data plot shows clear seasonality in the catches. To simulate seasonal data using the coupled SDE approach `seasontype` must be set to 2 and `nseasons` to 4. ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=4.5, fig.show='hold'} set.seed(432) inp <- list(nseasons=4, seasontype=2) inp$timeC <- seq(0, 30-1/inp$nseasons, by=1/inp$nseasons) inp$timeI <- seq(0, 30-1/inp$nseasons, by=1/inp$nseasons) inp$ini <- list(logK=log(100), logm=log(20), logq=log(1), logbkfrac=log(1), logsdf=log(0.4), logF0=log(0.5)) seasonsim2 <- sim.spict(inp) plotspict.data(seasonsim2) ``` ## Estimation using quarterly data Catch information available in sub-annual aggregations, e.g. quarterly catch, can be used to estimate the seasonal pattern of the fishing mortality. The user can choose between two types of seasonality by setting `seasontype` to 1, 2 or 3: 1. using cyclic B-splines. 2. using coupled stochastic differential equations (SDEs). 3. combination of a cyclic spline (fixed seasonal pattern) and mean-zero autoregressive process Technical description of the first two season types is found in @spict. The third is described by the following equation: \begin{align} F_t &= S_t G_t \exp( A_{q(t)} ) \\ d\log G_t &= \sigma_FdV_t \end{align} \begin{itemize} \item $S_t$ is a cyclic spline and $G_t$ is diffusion process. \item $A_{q(t)} = \varphi_A A_{q(t-1)} + \varepsilon_{A,q(t)}$ is a discrete time mean zero autoregressive process , and $q$ maps $t$ to a quarter. \end{itemize} Here, an example of a spline-based model fitted to quarterly data simulated in section \ref{sec:seas} is shown ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=4.5, fig.show='hold'} seasonres <- fit.spict(seasonsim) plotspict.biomass(seasonres) plotspict.f(seasonres, qlegend=FALSE) plotspict.season(seasonres) ``` The model is able to estimate the seasonal variation in fishing mortality as seen both in the plot of `F` and in the plot of the estimated spline, where blue is the estimated spline, orange is the true spline, and green is the spline if time were truly continuous (it is discretised with the Euler steps shown by the blue line). To fit the coupled SDE model run ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=4.5, fig.show='hold'} seasonres2 <- fit.spict(seasonsim2) sumspict.parest(seasonres2) plotspict.biomass(seasonres2) plotspict.f(seasonres2, qlegend=FALSE) ``` Two parameters related to the coupled SDEs are estimated (`sdu` and `lambda`) as evident from the summary of estimated parameters. In the plot of fishing mortality it is noted that the amplitude of the seasonal pattern varies over time. This is a property of the coupled SDE model, which is not possible to obtain with the spline based seasonal model. The spline based model has a fixed amplitude and phases, which will lead to biased estimates and autocorrelation in residuals if in reality the seasonal pattern shifts a bit. This is illustrated by fitting a spline based model to data generated with a coupled SDE model ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=7, fig.show='hold'} inp2 <- list(obsC=seasonsim2$obsC, obsI=seasonsim2$obsI, timeC=seasonsim2$timeC, timeI=seasonsim2$timeI, seasontype=1, true=seasonsim2$true) rep2 <- fit.spict(inp2) rep2 <- calc.osa.resid(rep2) plotspict.diagnostic(rep2) ``` From the diagnostics it is clear that autocorrelation is present in the catch residuals. ## Setting initial parameter values Initial parameter values used as starting guess of the optimiser can be set using `inp$ini`. For example, to specify the initial value of `logK` set ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} inp <- pol$albacore inp$ini$logK <- log(100) ``` This procedure generalises to all other model parameters. If initial values are not specified they are set to default values. To see the default initial value of a parameter, here `logK`, run ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} inp <- check.inp(pol$albacore) inp$ini$logK ``` This can also be done posterior to fitting the model by printing `res$inp$ini$logK`. ### Checking robustness to initial parameter values It is prudent to check that the same parameter estimates are obtained if using different initial values. If the optimum of the objective function is poorly defined, i.e. possibly containing multiple optima, it is possible that different parameter estimates will be returned depending on the initial values. To check whether this is the case run ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} set.seed(123) check.ini(pol$albacore, ntrials=4) ``` The argument `ntrials` set the number of different initial values to test for. To keep it simple only few trials are generated here, however for real data cases more should be used, say 30. The `propchng` contains the proportional change of the new randomly generated initial value relative to the base initial value, `inimat` contains the new randomly generated initial values, and `resmat` contains the resulting parameter estimates and a distance from the estimated parameter vector to the base parameter vector. The distance should preferably be close to zero. If that is not the case further investigation is required, i.e. inspection of objective function values, differences in results and residual diagnostics etc. should be performed. The example shown here looks fine in that all converged runs return the same parameter estimates. One trial did not converge, however non-converging trials are to some extent expected as the initial parameters are generated independently from a wide uniform distribution and may thus by chance be very inappropriately chosen. ## Phases and how to fix parameters The package has the ability to estimate parameters in phases. Users familiar with AD model builder will know that this means that some parameters are held constant in phase 1, some are then released and estimated in phase 2, more are released in phase 3 etc. until all parameters are estimated. Per default all parameters are estimated in phase 1. As an example the standard deviation on the biomass process, `logsdb`, is estimated in phase 2: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} inp <- pol$albacore inp$phases$logsdb <- 2 res <- fit.spict(inp) ``` Phases can also be used to fix parameters to their initial value by setting the phase to `-1`. For example ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} inp <- pol$albacore inp$phases$logsdb <- -1 inp$ini$logsdb <- log(0.1) res <- fit.spict(inp) summary(res) ``` ## Priors SPiCT is a generalisation of previous surplus production models in the sense that stochastic noise is included in both observation and state processes of both fishing and biomass. Estimating all model parameters is only possible if data contain sufficient information, which may not be the case for short time series or time series with limited contrast. The basic data requirements of the model are limited to only catch and biomass index time series. More information may be available, which can be used to improve the model fit. This is particularly advantageous if the model is not able to converge with only catch and index time series. Additional information can then be included in the fit via prior distributions for model parameters. ### Default priors and how to disable them Quantities that are traditionally difficult to estimate are `logn`, and the noise ratios `logalpha` and `logbeta` where `logalpha = logsdi - logsdb` and `logbeta = logsdc - logsdf`, respectively. Therefore, to generally stabilise estimation default semi-informative priors are imposed on these quantities that inhibit them from taking extreme and unrealistic values. If informative data are available these priors should have limited effect on results, if informative data are not available estimates will reduce to the priors. If informative data are available and the default priors therefore are unwanted they can be disabled using ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} inp <- pol$albacore inp$priors$logn <- c(1, 1, 0) inp$priors$logalpha <- c(1, 1, 0) inp$priors$logbeta <- c(1, 1, 0) fit.spict(inp) ``` The model is able to converge without priors, however the estimates of `alpha`, `beta` and `n` are very uncertain indicating that limited information is available about these parameters. ### Setting a prior The model parameters to which priors can be applied can be listed using ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} list.possible.priors() ``` A prior is set using ```{r, results='show', message=FALSE, warning=FALSE, fig.width=4, fig.height=4, fig.show='hold'} inp <- pol$albacore inp$priors$logK <- c(log(300), 2, 1) fit.spict(inp) ``` This imposes a Gaussian prior on `logK` with mean $\log(300)$ and standard deviation 2. The third entry indicates that the prior is used (1 means use, 0 means do not use). From the summary it is evident that the default priors were also imposed. ### Priors on random effects Priors can be applied to random effects of the model, i.e. `logB`, `logF`, `logBBmsy`, (which is $\log(B/Bmsy)$) `logFFmsy` (which is $\log(F/Fmsy)$). An additional argument is required to specify these priors ```{r, results='show', message=FALSE, warning=FALSE, fig.width=7, fig.height=3.5, fig.show='hold'} inp <- pol$albacore inp$priors$logB <- c(log(80), 0.1, 1, 1980) par(mfrow=c(1, 2), mar=c(5, 4.1, 3, 4)) plotspict.biomass(fit.spict(pol$albacore), ylim=c(0, 500)) plotspict.biomass(fit.spict(inp), qlegend=FALSE, ylim=c(0, 500)) ``` This imposes a Gaussian prior on `logB` with mean $\log(80)$, standard deviation 0.1 (very informative), the third entry in the vector indicates that the prior is used, the fourth entry indicates the year to which the prior should be applied, here 1980. It is clear from the plots that the prior influences the results significantly. Furthermore, it is not only the biomass in the year 1980 that is affected, but the information propagates forward and backward because all estimates are correlated. In reality such an informative prior is rarely available, however it may be possible to derive information about the absolute biomass from acoustic survey and swept area estimates. It is, however, critical that the standard deviation used reflects the quality of the information. ### Setting priors on the standard deviation of multiple indices It is possible to set a prior for on some or all noise terms of multiple biomass indices ```{r, results='show', message=FALSE, warning=FALSE, fig.width=6, fig.height=5, fig.show='hold'} set.seed(123) inp <- list(timeC=pol$albacore$timeC, obsC=pol$albacore$obsC) inp$timeI <- list(pol$albacore$timeI, pol$albacore$timeI[10:23]+0.25) inp$obsI <- list() inp$obsI[[1]] <- pol$albacore$obsI * exp(rnorm(23, sd=0.1)) # Index 1 inp$obsI[[2]] <- 10*pol$albacore$obsI[10:23] # Index 2 inp$priors$logsdi <- list(c(1, 1, 0), # No prior for index 1 c(log(0.1), 0.2, 1)) # Set a prior on index 2 res <- fit.spict(inp) sumspict.priors(res) ``` ### Fixing parameters using priors Model parameters can be fixed using `phases` as described previously. This technique can, however, only be used to fix model parameters and therefore not derived quantities such as `logalpha`, `logr` (which is derived from `logK`, `logm` and `logn`). Fixing a parameter can be regarded as imposing an highly informative prior to the parameter ```{r, results='show', message=FALSE, warning=FALSE, fig.width=7, fig.height=3.5, fig.show='hold'} inp <- pol$albacore inp$priors$logn <- c(log(2), 1e-3) inp$priors$logalpha <- c(log(1), 1e-3) inp$priors$logbeta <- c(log(1), 1e-3) fit.spict(inp) ``` The summary indicates that the priors are so informative that the quantities are essentially fixed. It is also noted that the estimates of these quantities are very close to the mean of their respective priors. ### Pitfalls when fixing parameters and specifying priors Particular caution is required when fixing a parameter that is highly correlated with other parameters because this will to some extent restrict the estimates of the correlated parameters. This could also be a problem when specifying priors depending on the amount of a priori information available. ## Robust estimation (reducing influence of extreme observations) The presence of extreme observations may inflate estimates of observation noise and increase the general uncertainty of the fit. To reduce this effect it is possible to apply a robust estimation scheme, which is less sensitive to extreme observations. An example with an extreme observation in the catch series is ```{r, results='show', message=FALSE, warning=FALSE, fig.width=7, fig.height=3.5, fig.show='hold'} inp <- pol$albacore inp$obsC[10] <- 3*inp$obsC[10] res1 <- fit.spict(inp) inp$robflagc <- 1 res2 <- fit.spict(inp) sumspict.parest(res2) par(mfrow=c(1, 2)) plotspict.catch(res1, main='Regular fit') plotspict.catch(res2, qlegend=FALSE, main='Robust fit') ``` It is evident from the plot that the presence of the extreme catch observation generally inflates the uncertainty of the estimated catches, while the robust fit is less sensitive. Robust estimation can be applied to index and effort data using `robflagi` and `robflage` respectively. Robust estimation is implemented using a mixture of light-tailed and a heavy-tailed Gaussian distribution as described in @spict. This entails two additional parameters (`pp` and `robfac`) that require estimation. This may not always be possible given the increased model complexity. In such cases these parameters should be fixed by setting their phases to `-1`. ## Forecasting and management scenarios All quantities estimated by SPiCT can be forecasted into the future. By default, the forecast period starts one time step after the last observation, which can be defined by the last index observation or by the end of the last catch or effort interval. Forecasting catch, fishing mortality, and biomass is a crucial step in fisheries management that allows determining future yields and risk of overfishing under current or alternative management scenarios. The following provides details to the forecasting and management functionality of SPiCT and includes examples based on the South Atlantic albacore data set of @polacheck1993, which contains catch and commercial CPUE data from 1967 to 1990. ### Forecasting Producing short-term forecasts entails minimal additional computing time and is part of the model fitting of SPiCT. The default settings can be altered by specifying the start and end of the forecast interval with the argument `maninterval` (= 'management interval'). For example, if a one year forecast of the catch during 2021 is of interest, then `maninterval` is set to represent the start and end of the forecast interval: `c(2021,2022)`. In addition to the forecast interval, a fishing scenario can be specified. This is done by specifying a factor (`ffac`) that is a fishing mortality multiplier at the start of the forecast period by. By default, the fishing mortality is unaltered and projected forward maintaining potential seasonal patterns. The time point of the forecasted biomass and fishing mortality can be controlled by setting `maneval` (= 'management evaluation'). The code to obtain the forecasted annual catch in the interval starting 1990 under a management scenario where the fishing pressure is reduced by 25% starting in 1990, and the time point of forecasted biomass and fishing mortality in 1991 is: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} inp <- pol$albacore inp$maninterval <- c(1990, 1991) inp$maneval <- 1991 inp$ffac <- 0.75 rep <- fit.spict(inp) ``` To specifically show forecast results use ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} sumspict.predictions(rep) ``` These predictions are also shown when using `summary(rep)` and as with any spict object the results can be plotted using `plot(rep)`. Here, however, we use a selection of 4 graphs to visualise the change in forecasted fishing mortality and associated change in forecasted catch more clearly: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=8, fig.height=7, fig.show='hold'} plot2(rep) ``` The dotted lines in the graph represent the predicted quantities for the forecast interval. Note the clear drop and subsequent constant fishing mortality representing the 25% reduction in fishing pressure as set with `ffac`. The decrease in fishing pressure results in constant (slightly increasing) biomass as opposed to the expected decrease if fishing effort had remained constant. The larger confidence intervals indicate the larger uncertainty associated with model predictions without data. Setting `ffac` (or `fcon` for changing the fishing mortality by an absolute value instead of a factor) in the input list affects any SPiCT fit with this input list. This can be reversed back to the default (continuing the F process) by setting `inp$ffac <- NULL`. In the following section, we introduce a range of different functions which make the exploration of different F factors and other management strategies more straight-forward and do not affect the fitted SPiCT object or its input list. ### Management In addition to manually changing the forecast period and fishing factor as described above, SPiCT includes a wide range of functions related to fisheries management. The `manage` function applies 8 different management scenarios and allows to explore the effect of different management strategies on the recommended Total Allowable Catch (TAC), or predicted fishing mortality or biomass. The scenarios include advice rules such as constant catch, no fishing or the current advice rule recommended by the export group for data limited fisheries management in ICES [@wklifeix]. The results of this function are appended to the `spictcls` object, thus the application is straight-forward: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} inp <- pol$albacore rep <- fit.spict(inp) rep <- manage(rep) ``` where `rep` is the result of `fit.spict()` from the code above. The `spictcls` object fitted in each scenario is stored in a list called `man` within `rep`. The argument `scenarios` controls which scenarios are included; it is a vector of scenario numbers, e.g. `scenarios = c(1,5,2,8)`, or scenario names, e.g. `scenarios = c("noF", "ices", "incrF25")`. The list of the predefined management scenarios in the `manage` function with index and name are: \begin{enumerate} \item \textbf{currentCatch}: Keep the catch of the current year (i.e. the last observed catch). \item \textbf{currentF}: Keep the F of the current year. \item \textbf{Fmsy}: Fish at Fmsy i.e. F=Fmsy. \item \textbf{noF}: No fishing, reduce to 1\% of current F. \item \textbf{reduceF25}: Reduce F by 25\%. \item \textbf{increaseF25}: Increase F by 25\%, \item \textbf{msyHockeyStick}: Use ICES MSY hockey-stick advice rule [@msycat34]. \item \textbf{ices}: Use ICES MSY 35th hockey-stick advice rule [@wklifeix]. \end{enumerate} More information about these advice rules and other functionality of the `manage` function can be found in the function documentation (`?manage`). The results of the management scenarios can be summarised by: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} sumspict.manage(rep) ``` The summary function prints a timeline as well as the predicted catch during the management interval and absolute and relative biomass and fishing mortality at the management evaluation time for each scenario. Additionally, the quantities '`perc.dB`' and '`perc.dF`' show the percentage change in biomass and fishing mortality from the start to the end of the management period for each management scenario, respectively. The implications of the different management scenarios can also be shown graphically by calling the `plot` function or any plotting function of the `plotspict.` family (e.g. `plotspict.ffmsy`) on a spict object with a `man` element: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=8, fig.height=7, fig.show='hold'} plot2(rep) ``` If any plotting function of the `plotspict.` family is called on a SPiCT object which includes management scenarios (`rep$man != NULL`), the prediction and confidence intervals of the base scenario (`rep`) are omitted from the plot and instead the projections of the different management scenarios are displayed in different colours. The grey vertical line corresponding to the time of the last observation in the standard plots is replaced by two lines which corresponds to the start and end of the management interval. In some cases, there might only be one vertical line displayed in the catch plot or the trajectories of the scenarios might be missing in the catch plot. Refer to the function documentation if that is the case (`help(plotspict.catch)`). In order to go back to the previous plots, one can remove the management scenarios (see below) or assign the results of the `manage` function to a different object, e.g. `man <- manage(rep)` and have both plotting functionalities with `plot(rep)` and `plot(man)`. The function `plotspict.hcr` (spict version >= 1.3.4) provides a visual summary of the management scenarios. ```{r, results='show', message=FALSE, warning=FALSE, fig.width=8, fig.height=7, fig.show='hold'} plotspict.hcr(rep) ``` #### Assessment and intermediate period Besides defining the assessment period in the input data as shown above, it can also be changed in the `manage` function directly using the argument `maninterval`. The management period or interval can correspond to any period from a fraction of a year to spanning over several years (see below for more examples of variable assessment periods). To visualise the management period the function `man.timeline` can be applied to an input list: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} man.timeline(inp) ``` or to a fitted spict object: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} man.timeline(rep) ``` As the output shows, the management period period starts right after the time of the last observation, in this example in 1990. In reality, however, there is often a lag between the data and management period. In fact, the assessment is often performed within the gap year and the management scenarios are supposed to start at the beginning of the following year. Thus, it makes sense to distinguish the intermediate period from the management period. For the albacore data set we could assume a management period starting in 1991, which would change the time line as follows: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} inp$maninterval <- c(1991,1992) man.timeline(inp) ``` As can be seen, the timeline includes an additional section called 'Intermediate' referring to the intermediate period. Management scenarios with an intermediate period from 1990 to 1991 and an assessment period can thus be specified for the albacore data set with: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} repIntPer <- manage(rep, scenarios = c(1,2), maninterval = c(1991,1992), maneval = 1992) plotspict.catch(repIntPer) ``` It is important to note that defining a management interval a few time steps after the time of the last catch observation requires the model to make an assumption about the fishing mortality during the intermediate period. By default, SPiCT continues the fishing mortality process in the intermediate period, i.e. the fishing mortality corresponds to the estimated F at the time step of the last observation and the estimated seasonal F process. However, in some cases it might be meaningful to assume a certain catch during this period rather than continuing the F process. This might be in particular relevant when information about the catch in the intermediate period is available, for example in form of a TAC which was recommended and implemented the year before. Therefore, a specific catch can be specified for the intermediate period using the argument `intermediatePeriodCatch`. ```{r, results='show', message=FALSE, warning=FALSE, fig.width=9, fig.height=5, fig.show='hold'} repIntPerC <- manage(rep, scenarios=c(8), maninterval = c(1991,1992), intermediatePeriodCatch = 20) par(mfrow=c(1,2)) plotspict.biomass(repIntPerC) plotspict.catch(repIntPerC) ``` As the graph shows, the catch in the intermediate period has been set to 20t (year 1990). Dotted lines of the management scenarios reflect the intermediate period, while solid lines reflect the management period. Be aware that the vertical bars indicating the start and end of the management period might seem off by a year for the catch plot, however, they correspond to the timing of the catch observations which are drawn at the beginning of the catch interval they refer to. Note that the specified catch corresponds to the complete intermediate period, which might be a year, but could also be quarter of a year or 3 years depending on the time of the last observation and the start of the management period (see below for examples with variable intermediate periods). Note further, that the catch used for the scenario 'currentCatch' is the last observed catch even though a certain catch is provided in the intermediate period. The argument `intermediatePeriodCatchList` allows to specify any number of catch 'observations' corresponding to any interval during the intermediate period. It is a list with the elements 'obsC', 'timeC', 'dtc' and the optional element 'stdevfacC' (which is equal to 1 if not provided). Find more information about this and related arguments in the documentation of the `manage` function. #### Custom management scenarios The default management scenarios included in the `manage` function are a selection of advice rules that are relevant to managers and stakeholders, in particular, based on our experience with stocks managed by ICES. However, there is no limitation in terms of custom advice rules and management scenarios. The function `add.man.scenario` allows to define a wide range of management scenarios and creates a new spict object, which can easily be added to exiting scenarios in `rep$man` (contrary to `manage` which overwrites all scenarios in `rep$man`). For example, to add an additional management scenario to `repIntPer` which increases fishing mortality by 50%: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} repIntPer <- add.man.scenario(repIntPer, ffac = 1.5) sumspict.manage(repIntPer) plotspict.f(repIntPer) ``` Note that the default name of a scenario 'customScenario_1' was used for this management scenario, where '1' represents the number of scenarios in `rep$man` with this default label. By using the argument `scenarioTitle` you can specify any label for your scenario. For example, to define a scenario that reduces the last observed catch by 64% and is labelled 'reduced_catch': ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} repIntPer <- add.man.scenario(repIntPer, cfac = 0.64, scenarioTitle = "reduced_catch") names(repIntPer$man) ``` Be aware that management scenarios in the list `rep$man` can be overwritten if a scenario with the specified `scenarioTitle` is already present in the list. A few important arguments of the `add.man.scenario` function to customise management scenarios are: * __ffac__: Factor to multiply current fishing mortality by. * __cfac__: Factor to multiply current catch by. * __breakpointBBmsy__: Breakpoint in terms of $B/B_{MSY}$ for the hockey-stick HCR. By default no breakpoint is assumed. * __safeguardB__: List defining an optional precautionary buffer by means of a biomass reference level relative to $B/B_{MSY}$ (`limitB`) and the risk aversion probability (`prob`): * __limitB__: Reference level for the evaluation of the predicted biomass defined as fraction of $B/B_{MSY}$. By default the PA buffer is not used. Theoretically, any value smaller than 1 is meaningful, but an ICES recommended value would be 30% `safeguardB$limitB = 0.3` @[wklifeiv]. * __prob__: Risk aversion probability of the predicted biomass relative to specified reference level (`limitB`) for all rules with PA buffer. The default value is 0.95 as recommended by [@wklifeix]. * __fractiles__: List defining the fractiles of the 3 distributions of `catch`, `bbmsy`, and `ffmsy`: * __catch__: Fractile of the predicted catch distribution. * __bbmsy__: Fractile of the $B/B_{MSY}$ distribution. * __ffmsy__: Fractile of the $F/F_{MSY}$ distribution. Note that the fractile for the $F/F_{MSY}$ distribution is actually 1 minus the fractile specified. Otherwise, a larger fractile would be more precautionary than a smaller one, as the current fishing mortality is divided by the value of this distribution ($F_{y+1} = \frac{F_y}{\frac{F_y}{F_{MSY}}}$). This allows a consistent setting of fractiles among the different quantities. The advice rule recommended by @wklifeix can thus be defined by the combination of several of these arguments: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} repIntPer <- add.man.scenario(repIntPer, scenarioTitle = "ices", breakpointB = 0.5, fractiles = list(catch=0.35, bbmsy=0.35, ffmsy=0.35)) sumspict.manage(repIntPer) ``` A detailed description of all arguments of this function can be found in the function documentation (run `?add.man.scenario`). #### Variable management periods As mentioned above, the management period (and intermediate period) is not limited to the length of one year nor to the start of a year, but can span half a year, several years or start within any point in time within the year (cf. in-year advice @wklifeix). For example, applying two default scenarios for a half year period starting in the middle of 1990: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} repVarPer <- manage(rep, scenarios = c(1,4), maninterval = c(1990.5,1991)) ``` Scenarios with different management intervals can also be combined, although the functions will print a note about mismatching periods or assumptions in the intermediate period. For example, to add an advice rule which reduces the F linearly to zero if the biomass is below 50% of $B/B_{MSY}$ over a management interval of 2 years to the previous scenarios: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} repVarPer <- add.man.scenario(repVarPer, maninterval = c(1991, 1993), breakpointB = 0.5) plotspict.f(repVarPer) ``` Additionally, the summary output will not show percentage differences for F or B (`NaN`) and the timeline will only show the period with observations as the intermediate and management period differ between the scenarios. ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} sumspict.manage(repVarPer) ``` #### Advanced management settings and functionality Besides selecting scenarios in `manage` using the argument `scenarios`, the function `man.select` can be used to change the order of scenarios or omit certain or all scenarios included in `rep$man`: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} names(rep$man) rep <- man.select(rep, scenarios = c(1,3)) ``` One of the important quantities in fisheries management is the predicted catch during the management interval under given fishing mortality, also called Total Allowable Catch (TAC). The function `man.tac` allows to extract a list with the TAC of all scenarios in `rep$man`: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} man.tac(rep) ``` Note that the unit of the TAC corresponds to the catch unit specified in `inp$catchunit` or the unit of the catch observations if not specified. If the user is only interested in the TAC of the specified management scenario (e.f. for the implementation in a management strategy framework (MSE)), the function `get.TAC` can be used: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} get.TAC(rep, maninterval = c(1990.25, 1991), safeguardB = list(limitB = 0.3), verbose = TRUE) ``` After comparing the implications of different management scenarios, it might be of interest to extract a specific scenario for further analyses, plotting, or reporting. Specific scenarios can be extracted from the SPiCT object by referencing their names or positioning in the list `rep$man`, e.g. `rep$man[[1]]`. However, care has to be taken, as some scenarios (e.g. 'constantCatch' or scenarios with a specified catch in the intermediate period) might include additional catch observations. Thus, the best way to select certain scenarios is by means of the function `man.select`. This function does not only allow to change the order of the management scenarios in `rep$man` or make a selection of scenarios, but also allows to extract a certain scenario as a standard SPiCT object of the class `spictcls`: ```{r, results='show', message=FALSE, warning=FALSE, fig.width=8, fig.height=8, fig.show='hold'} ## selection by index length(repIntPer$man) repSelect1 <- man.select(repIntPer, scenarios = c(3,1)) ## selection by scenario name names(repIntPer$man) repSelect2 <- man.select(repIntPer, scenarios = c("currentCatch")) ## selected object as standard spictcls object repSelect3 <- man.select(repIntPer, scenarios = c("currentCatch"), spictcls = TRUE) ## Plot objects with selected scenarios opar <- par(mfrow=c(3,1)) plotspict.catch(repSelect1) plotspict.catch(repSelect2) plotspict.catch(repSelect3) par(opar) ``` Note the different catch trajectories of the second and third plot. Although, the same scenario was selected, the first approach still shows the management-type plot, while the second approach (with argument `spictcls = TRUE`) shows the default 'spictcls' catch plot with predicted uncertainties. All management related results can easily be removed from the SPiCT object by overwriting the `man` element: `rep$man = NULL` or ```{r, results='show', message=FALSE, warning=FALSE, fig.width=5.5, fig.height=5, fig.show='hold'} rep <- man.select(rep, scenarios = NULL) ``` # Other model settings and options ## `catchunit` - Define unit of catch observations This will print the unit of the catches on relevant plots. Example: `inp$catchunit <- "'000 t"`. ## `dteuler` and `eulertype` - Temporal discretisation and time step To solve the continuous-time system an Euler discretisation scheme is used. This requires a time step to be specified (`dteuler`). The smaller the time step the more accurate the approximation to the continuous-time solution, however with the cost of increased memory requirements and computing time. The default value of `dteuler` is $1/16$, which seems sufficiently fine for most cases, and perhaps too fine for some cases. When fitting quarterly data and species with fast growth it is important to have a small `dteuler`. The influence of `dteuler` on the results can be checked by using different values and comparing resulting model estimates. If `dteuler <- 1` the model essentially becomes a discrete-time model with one Euler step per year. There are two possible temporal discretisation schemes which can be set to either `eulertype = 'hard'` (default) or `eulertype = 'soft'`. If `eulertype = 'hard'` then time is discretised into intervals of length `dteuler`. Observations are then assigned to these intervals. For annual and quarterly data `dteuler = 1/16` is appropriate, however if fitting to monthly data `dteuler` should be changed to e.g. `1/24`. If `eulertype = 'soft'` (careful, this feature has not been thoroughly tested), then time is discretised into intervals of length `dteuler` and additional time points corresponding to the times of observation are added to the discretisation. This feature is particularly useful if observations (most likely index series) are observed at odd times during the year. The model then estimates values of biomass and fishing mortality at the exact time of the observation instead of assigning the observation to an interval. ## `msytype` - Stochastic and deterministic reference points As default the stochastic reference points are reported and used for calculation of relative levels of biomass and fishing mortality. It is, however, possible to use the deterministic reference points by setting `inp$msytype <- 'd'`. ## `do.sd.report` - Perform SD report calculations The sdreport step calculates the uncertainty of all quantities that are reported in addition to the model parameters. For long time series and with small `dteuler` this step may have high memory requirements and a substantial computing time. Thus, if one is only interested in the point estimates of the model parameters it is advisable to set `do.sd.report <- 0` to increase speed. ## `reportall` - Report all derived quantities If uncertainties of some quantities (such as reference points) are required, but uncertainty on state variables (biomass and fishing mortality) are not needed, then `reportall <- 0` can be used to increase speed. ## `optim.method` - Report all derived quantities Parameter estimation is per default performed using R's `nlminb()` optimiser. Alternatively it is possible to use `optim` by setting `inp$optim.method <- 'optim'`. # References