CropGrowth.jl

CropGrowth.jl

CropGrowth.jl is a Julia package for modelling crop growth curves using the generalised logistic function:

Quickstart

using Pkg
Pkg.add(CropGrowth)
using CropGrowth
df = simulate()
df_out = fitgrowthmodels(df, verbose=true)

Model

\[y(t) = {A + {{K-A} \over {C + (Qe^{-Bt})^{1/v}}}}\]

where:

$y(t)$: biomass at time $t$
$A$: lower asymptote (initial or minimum biomass)
$K$: positively affects the upper asymptote. This is the final or maximum biomass if:
- $C = 1.00$, since:
- $y_{max} = A + (K-A)/C^{1/v}$, then
- $y_{max} = A + K - A$, therefore:
- $y_{max} = K$
$C$: negatively affects the final or maximum biomass
$Q$: negatively affects initial or minimum biomass
$e$: Euler's number (~2.71828)
$B$: growth rate
$v$: asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)

To solve for $t$ at specific $y$:

\[t(y) = -{{1} \over {B}} \log {\left( { {{1} \over {Q}} \left( \left( {K - A} \over {y - A} \right)^v - C \right) } \right) }\]

Installation

Please install the latest version of Julia (we aim to update CropGrowth.jl for the latest Julia release):

curl -fsSL https://install.julialang.org | sh
type -a julia

Install the CropGrowth.jl library in Julia:

Stable version:

using Pkg
Pkg.add(CropGrowth)

Development version:

using Pkg
Pkg.add("https://github.com/jeffersonfparil/CropGrowth.jl")

Test the installation in Julia:

using CropGrowth
df = simulate()
df_out = fitgrowthmodels(df, verbose=true)

The main function, fitgrowthmodels fits generalized logistic growth models to the data provided in the input DataFrame and returns a DataFrame containing the fitted parameters, fit statistics, and time to reach specified fraction of the final value. The complete function signature is:

fitgrowthmodels(
  df::DataFrame;
  A = Dict(
      :init=>minimum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
      :lower=>0.0,
      :upper=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
  ),
  K = Dict(
      :init=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
      :lower=>0.0,
      :upper=>2*maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
  ),
  C = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
  Q = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
  B = Dict(:init=>1.0, :lower=>0.0, :upper=>10.0),
  v = Dict(:init=>1.0, :lower=>1e-5, :upper=>10.0),
  min_t::Int64 = 3,
  frac_of_final::Vector{Float64} = [0.5, 0.9],
  fit_statistic::String = "R²",
  maxiters::Int64 = 10_000,
  seed::Int64 = 42,
  show_plots::Bool = false,
  verbose::Bool = false,
)::Tuple{DataFrame, Vector{String}}

Arguments

df::DataFrame: Input data containing the required columns specified in REQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"] and at least one trait column.
A::Dict: Search space for the parameter A (lower asymptote). Contains :init, :lower, and :upper keys. Defaults to the minimum and maximum of the trait column with init=minimum.
K::Dict: Search space for the parameter K (upper asymptote). Contains :init, :lower, and :upper keys. Defaults to the minimum and 2×maximum of the trait column with init=maximum.
C::Dict: Search space for the parameter C. Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=1.0, upper=1.0.
Q::Dict: Search space for the parameter Q. Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=1.0, upper=1.0.
B::Dict: Search space for the parameter B (growth rate). Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=0.0, upper=10.0.
v::Dict: Search space for the parameter v (asymmetry parameter). Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=1e-5, upper=10.0.
min_t::Int64: Minimum number of time points required to fit the growth model for a specific combination of entry, site, replication, and growing period. Defaults to 3.
frac_of_final::Vector{Float64}: Percentages of the final value for which the time to reach these fraction will be calculated. Defaults to [0.5, 0.9].
fit_statistic::String: The fit statistic to be used for evaluating the model. Must be one of ["R²", "RMSE", "MSE", "MAE", "ρ"]. Defaults to "R²".
maxiters::Int64: Maximum number of iterations allowed for the optimization process. Defaults to 10_000.
seed::Int64: Random seed for reproducibility. Defaults to 42.
show_plots::Bool: Whether to show fitted growth curve plots. Defaults to false.
verbose::Bool: Whether to display progress and additional information during the fitting process. Defaults to false.

Returns

Tuple{DataFrame, Vector{String}}:

The first element is a DataFrame containing the fitted parameters (A, K, C, Q, B, v), fit statistics, value of the growth models at $t=0$ (y_t0), maximum value of the growth model (y_max), and time to reach specified fraction of the final value for each combination of entry, site, replication, and growing period.
The second element is a Vector{String} containing the combinations that were skipped due to insufficient data points.

Notes

The input DataFrame must contain the required columns specified in the global variable REQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], as well as at least one additional trait column.
If the DataFrame contains more than one trait column, only the first trait column will be used.
Combinations with fewer than min_t time points will be skipped.
The function uses a progress bar to indicate the fitting process if verbose=true.
The optimisation is performed using the BBO_adaptive_de_rand_1_bin_radiuslimited() algorithm (details of the optimisation algorithm).
The optimisation algorithm minimises the mean squared error between the observed data y and the generalised logistic model.

Examples

using CropGrowth, StatsBase
df = simulate(n_entries=5, n_sites=1, n_replications=1, n_growing_periods=1, n_time_points_per_growing_period=5, seed=123)
df_out_0, skipped_combinations_0 = fitgrowthmodels(df, show_plots=true)
df_out_C, skipped_combinations_C = fitgrowthmodels(df, C=Dict(:init=>1.0, :lower=>0.0, :upper=>100.0), show_plots=true)
df_out_Q, skipped_combinations_Q = fitgrowthmodels(df, Q=Dict(:init=>1.0, :lower=>0.0, :upper=>100.0), show_plots=true)
df_out_CQ, skipped_combinations_CQ = fitgrowthmodels(
    df,
    C=Dict(:init=>1.0, :lower=>0.0, :upper=>5.0),
    Q=Dict(:init=>1.0, :lower=>0.0, :upper=>5.0),
    show_plots=true
)
mean(df_out_0.R²)
mean(df_out_C.R²)
mean(df_out_Q.R²)
mean(df_out_CQ.R²)

Comparison with `nplr`

TLDR: CropGrowth.jl fits a more general logistic function than nplr.

The R package nplr: n-parameter logistic regressions fits a variant of the logistic model. It fits 5 parameters instead of the 6 parameters in CropGrowth.jl's formulation above. The model is defined as:

\[y(x) = { B + { {T - B} \over {\left [ 1 + 10^{b(x_{mid} - x)} \right ]^s} } }\]

where:

$B$: lower asymptote
$T$: upper asymptote
$b$: Hill slope (related to B or growth rate)
$x_{mid}$: x-coordinate at the inflexion point,
$s$ asymetry coefficient (related to v)

Fitting log10(dose)-response curve with nplr:

library(nplr)
library(txtplot)
path <- system.file("extdata", "pc3.txt", package="nplr")
pc3 <- read.delim(path)
txtplot::txtplot(log10(pc3$CONC), pc3$GIPROP) # nplr default does log10(x)
np1 <- nplr::nplr(x=pc3$CONC, y=pc3$GIPROP, useLog=TRUE)
y = pc3$GIPROP
y_hat = np1@yFit
print("Fitted parameters:")
print(paste0("K = ", np1@pars$bottom))
print(paste0("A = ", np1@pars$top))
print(paste0("xmid = ", np1@pars$xmid))
print(paste0("B_ish = ", np1@pars$scal))
print(paste0("v_ish = ", np1@pars$scal))
print("Fit statistics:")
print(paste0("mae = ", mean(abs(y - y_hat))))
print(paste0("mse = ", mean((y - y_hat)^2)))
print(paste0("rmse = ", sqrt(mean((y - y_hat)^2))))
print(paste0("cor = ", cor(y, y_hat)))
print(paste0("r2 = ", 1 - (var((y - y_hat)) / var(y))))
# [1] "Fitted parameters:"
# [1] "K = 0.000182956669824204"
# [1] "A = 0.996490622312222"
# [1] "xmid = -6.18254538738321"
# [1] "B_ish = -1.42800873142546"
# [1] "v_ish = -1.42800873142546"
# [1] "Fit statistics:"
# [1] "mae = 0.0186073781104083"
# [1] "mse = 0.000651821232462819"
# [1] "rmse = 0.0255307898910868"
# [1] "cor = 0.997250250732751"
# [1] "r2 = 0.994444954867448"

CropGrowth.jl achieves very similar but more generalised fit:

using DataFrames, CSV, CropGrowth
df = CSV.read(expanduser("~/.conda/envs/GenomicBreeding/lib/R/library/nplr/extdata/pc3.txt"), DataFrame)
growth_model = modelgrowth(
    t = log10.(df.CONC),
    y = df.GIPROP,
    maxiters = 100_000,
    seed = 42,
    verbose = true,
)
# Fitted parameters:
# A = 0.991559343580949
# K = 0.0380737475678395
# C = 1.0
# Q = 1.5219825671137292e-5
# B = 1.5916146320430173
# v = 0.15864685998073122
# y_t0 = 0.03816521536021922
# y_max = 0.038073747568793115
# Fit statistics:
# mse = 0.0006662392521457019
# rmse = 0.0258116108010659
# R² = 0.9943191952067459
# ρ = 0.9971555521616207
# mae = 0.020947684621310015

CropGrowth.GrowthModel — Type

mutable struct GrowthModel
    A::Float64
    K::Float64
    C::Float64
    Q::Float64
    B::Float64
    v::Float64
    y_t0::Float64
    y_max::Float64
    fit_statistics::Dict{String,Float64}

Holds parameters for the generalised logistic growth model:

\[y(t) = A + \frac{K-A}{C + (Qe^{-Bt})^{1/v}}\]

where:

$y(t)$: biomass at time $t$ (not part of the struct)
$A$: lower asymptote (initial or minimum biomass)
$K$: positively affects the upper asymptote (can be the final or maximum biomass if $C = 1.00$, since $y_{max} = A + (K-A)/C^(1/v)$, then $y_{max} = A + K - A$, therefore: $y_{max} = K$)
$C$: negatively affects the final or maximum biomass
$Q$: negatively affects initial or minimum biomass
$e$: Euler's number (~2.71828)
$B$: growth rate
$v$: asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)

Additional information are provided in the struct:

y_t0: value of the growth model at time $t = 0$
y_max: maximum value of the growth model ($y_{max} = A + (K-A)/C^{(1/v)}$)
fit_statistics: a dictionary containing fit statistics such as R², RMSE, MSE, MAE, and Pearson's correlation coefficient (ρ)

Constructor

GrowthModel(;
    A::Float64,
    K::Float64,
    C::Float64,
    Q::Float64,
    B::Float64,
    v::Float64,
    fit_statistics::Dict{String,Float64} = Dict(""=>NaN),
    ϵ::Float64 = 1e-12,
)

Notes:

The struct includes a constructor that automatically calculates y_max based on the provided parameters.
The fit_statistics field is optional and defaults to a dictionary with a single entry of an empty string key and NaN value.
A small epsilon value (ϵ) is added to the calculation of y_max to prevent division by zero errors.

Example

julia> growth_model = GrowthModel(A=0.0, K=10.0, C=1.0, Q=1.0, B=0.75, v=0.1);

julia> (growth_model.y_max - growth_model.K) < 1e-9
true

julia> (growth_model.y_max - 10.0) < 1e-9
true

source

CropGrowth.fitgrowthmodels — Method

fitgrowthmodels(
    df::DataFrame;
    A = Dict(
        :init=>minimum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
        :lower=>0.0,
        :upper=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
    ),
    K = Dict(
        :init=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
        :lower=>0.0,
        :upper=>2*maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
    ),
    C = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
    Q = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
    B = Dict(:init=>1.0, :lower=>0.0, :upper=>10.0),
    v = Dict(:init=>1.0, :lower=>1e-5, :upper=>10.0),
    min_t::Int64 = 3,
    frac_of_final::Vector{Float64} = [0.5, 0.9],
    fit_statistic::String = "R²",
    maxiters::Int64 = 10_000,
    seed::Int64 = 42,
    show_plots::Bool = false,
    verbose::Bool = false,
)::Tuple{DataFrame, Vector{String}}

Fits generalised logistic growth models to the data provided in the input DataFrame and returns a DataFrame containing the fitted parameters, fit statistics, and time to reach specified fractions of the final value.

Arguments

df::DataFrame: Input data containing the required columns specified in REQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"] and at least one trait column.
A::Dict: Search space for the parameter A (lower asymptote). Contains :init, :lower, and :upper keys. Defaults to the minimum and maximum of the trait column with init=minimum.
K::Dict: Search space for the parameter K (upper asymptote). Contains :init, :lower, and :upper keys. Defaults to the minimum and 2×maximum of the trait column with init=maximum.
C::Dict: Search space for the parameter C. Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=1.0, upper=1.0.
Q::Dict: Search space for the parameter Q. Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=1.0, upper=1.0.
B::Dict: Search space for the parameter B (growth rate). Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=0.0, upper=10.0.
v::Dict: Search space for the parameter v (asymmetry parameter). Contains :init, :lower, and :upper keys. Defaults to init=1.0, lower=1e-5, upper=10.0.
min_t::Int64: Minimum number of time points required to fit the growth model for a specific combination of entry, site, replication, and growing period. Defaults to 3.
frac_of_final::Vector{Float64}: Percentages of the final value for which the time to reach these fractions will be calculated. Defaults to [0.5, 0.9].
fit_statistic::String: The fit statistic to be used for evaluating the model. Must be one of ["R²", "RMSE", "MSE", "MAE", "ρ"]. Defaults to "R²".
maxiters::Int64: Maximum number of iterations allowed for the optimisation process. Defaults to 10_000.
seed::Int64: Random seed for reproducibility. Defaults to 42.
show_plots::Bool: Whether to show fitted growth curve plots. Defaults to false.
verbose::Bool: Whether to display progress and additional information during the fitting process. Defaults to false.

Returns

Tuple{DataFrame, Vector{String}}:
- The first element is a DataFrame containing the fitted parameters (A, K, C, Q, B, v), fit statistics, value of the growth models at $t=0$ (y_t0), maximum value of the growth model (y_max), time to reach specified fractions of the final value (time_to_*p), and number of time-points (number_of_time_points) for each combination of entry, site, replication, and growing period.
- The second element is a Vector{String} containing the combinations that were skipped due to insufficient data points.

Notes

The input DataFrame must contain the required columns specified in the global variable REQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], as well as at least one additional trait column.
If the DataFrame contains more than one trait column, only the first trait column will be used.
Combinations with fewer than min_t time points will be skipped.
The function uses a progress bar to indicate the fitting process if verbose=true.
The optimisation is performed using the BBO_adaptive_de_rand_1_bin_radiuslimited() algorithm (details of the optimisation algorithm).
The optimisation algorithm minimises the mean squared error between the observed data y and the generalised logistic model.
Parallel curve fitting:
- The per-combination fits are executed in parallel using Threads.@threads over the prepared list of filter combinations. To take advantage of this, start Julia with multiple threads (for example julia -t 4 or set JULIA_NUM_THREADS).
- Access to the shared fitted_parameters dictionary is protected with a ReentrantLock and @lock to ensure thread-safety when appending results. This avoids race conditions but introduces some serialization for the final writes.

Details

Fits a generalised logistic growth model to the data using the following equation:

\[y(t) = A + \frac{K-A}{C + (Qe^{-Bt})^{1/v}}\]

where:

$y(t)$: biomass at time $t$
$A$: lower asymptote (initial or minimum biomass)
$K$: positively affects the upper asymptote (can be the final or maximum biomass if $C = 1.00$, since $y_{max} = A + (K-A)/C^(1/v)$, then $y_{max} = A + K - A$, therefore: $y_{max} = K$)
$C$: negatively affects the final or maximum biomass
$Q$: negatively affects initial or minimum biomass
$e$: Euler's number (~2.71828)
$B$: growth rate
$v$: asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)

Additional information are provided in the output DataFrame:

y_t0 is the value of the growth model at time $t = 0$
y_max is the maximum value of the growth model ($y_{max} = A + (K-A)/C^(1/v)$)
One of the following fit statistic such as R² (default), RMSE, MSE, MAE, and Pearson's correlation coefficient (ρ)

Example

julia> df = simulate(n_entries=5, seed=42);

julia> df_out, skipped_combinations = fitgrowthmodels(df);

julia> length(unique(df.entries)) == length(unique(df_out.entries))
true

julia> length(unique(df.sites)) == length(unique(df_out.sites))
true

julia> length(unique(df.replications)) == length(unique(df_out.replications))
true

julia> length(unique(df.growing_periods)) == length(unique(df_out.growing_periods))
true

julia> all(x -> x ∈ names(df_out), ["A", "K", "C", "Q", "B", "v", "y_max", "R²", "time_to_50p", "time_to_90p"])
true

julia> (var(df_out.A) > 0.0) && (var(df_out.K) > 0.0) && (var(df_out.C) == 0.0) && (var(df_out.Q) == 0.0) && (var(df_out.B) > 0.0) && (var(df_out.v) > 0.0)
true

julia> (var(df_out.y_max) > 0.0) && (var(df_out."R²") > 0.0) && (var(df_out.time_to_50p) > 0.0) && (var(df_out.time_to_90p) > 0.0)
true

source

CropGrowth.fitstatistics — Method

fitstatistics(; y::Vector{Float64}, ŷ::Vector{Float64})::Dict{String, Float64}

Compute various fit statistics for a given set of observed and predicted values.

Arguments

y::Vector{Float64}: Observed data points.
ŷ::Vector{Float64}: Predicted data points.

Returns

A dictionary containing the following fit statistics:

"R²": Coefficient of determination, a measure of how well the predicted values explain the variance in the observed data.
"rmse": Root mean squared error, a measure of the average magnitude of the residuals.
"mse": Mean squared error, the average of the squared residuals.
"mae": Mean absolute error, the average of the absolute residuals.
"ρ": Pearson correlation coefficient between the observed and predicted values.

Notes

Ensure that the y and ŷ vectors are of the same length and correspond to the same observations.
The function assumes that both y and ŷ are non-empty vectors.

Example

julia> y1::Vector{Float64} = rand(10); y2::Vector{Float64} = rand(10);

julia> stats0 = fitstatistics(y=y1, ŷ=y1);

julia> stats1 = fitstatistics(y=y1, ŷ=y2);

julia> stats0["R²"] > stats1["R²"]
true

julia> stats0["rmse"] < stats1["rmse"]
true

julia> stats0["mse"] < stats1["mse"]
true

julia> stats0["mae"] < stats1["mae"]
true

julia> stats0["ρ"] > stats1["ρ"]
true

source

CropGrowth.generalisedlogistic — Method

generalisedlogistic(growth_model::GrowthModel; t::Vector{Float64})::Vector{Float64}

Computes the generalised logistic function for a given vector of time points t using the parameters defined in the GrowthModel.

Arguments

growth_model::GrowthModel: A struct containing the parameters for the generalised logistic function.
t::Vector{Float64}: A vector of time points at which to evaluate the function.

Returns

Vector{Float64}: A vector of values representing the generalised logistic function evaluated at each time point in t.

Returns

Vector{Float64}: A vector of values representing the generalised logistic function evaluated at each time point in t.

Example

julia> t = collect(0.0:1.0:10.0);

julia> growth_model = GrowthModel(A=0.0, K=10.0, C=1.0, Q=1.0, B=0.75, v=0.1);

julia> y = generalisedlogistic(growth_model, t=t);

julia> (y[1] == minimum(y)) && (y[end] == maximum(y))
true

source

CropGrowth.generalisedlogistic — Method

generalisedlogistic(t::Vector{Float64}; A=0.0, K=1.0, C=1.0, Q=1.0, B=1.0, v=1.0)::Vector{Float64}

Computes the generalised logistic function for a given vector of time points t.

Arguments

t::Vector{Float64}: A vector of time points at which to evaluate the function.

Keyword Arguments

A::Float64 (default: 0.0): lower asymptote (initial or minimum biomass)
K::Float64 (default: 1.0): upper asymptote (can be the final or maximum biomass if $C = 1.00$, since $y_{max} = A + (K-A)/C^(1/v)$, then $y_{max} = A + K - A$, therefore: $y_{max} = K$)
C::Float64 (default: 1.0): negatively affects the final or maximum biomass
Q::Float64 (default: 1.0): negatively affects initial or minimum biomass
B::Float64 (default: 1.0): growth rate
v::Float64 (default: 1.0): asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)

Returns

Vector{Float64}: A vector of values representing the generalised logistic function evaluated at each time point in t.

Example

julia> t = collect(0.0:1.0:10.0);

julia> y = generalisedlogistic(t; A=0.0, K=10.0, C=1.0, Q=1.0, B=0.75, v=0.1);

julia> (y[1] == minimum(y)) && (y[end] == maximum(y))
true

source

CropGrowth.modelgrowth — Method

modelgrowth(; 
    y::Vector{Float64}, 
    t::Vector{Float64}, 
    θ_search_space::Dict{String, Dict{Symbol, Float64}}, 
    maxiters::Int64=10_000, 
    seed::Int64=42, 
    verbose::Bool=false
)::GrowthModel

Fit a generalised logistic growth model to the given data y over time t.

Arguments

y::Vector{Float64}: The observed data points representing the growth values.
t::Vector{Float64}: The corresponding time points for the observed data.
θ_search_space::Dict{String, Dict{Symbol, Float64}}: A dictionary defining the search space for each parameter of the generalised logistic model. Each key corresponds to a parameter name ("A", "K", "C", "Q", "B", "v"), and the value is another dictionary with keys :init, :lower, and :upper specifying the initial value, lower bound, and upper bound for that parameter.
maxiters::Int64=10_000: The maximum number of iterations for the optimisation algorithm. Defaults to 10,000.
seed::Int64=42: The random seed for reproducibility. Defaults to 42.
verbose::Bool=false: If true, prints the fitted parameters, fit statistics, and displays a plot of the fitted model. Defaults to false.

Returns

GrowthModel: A structure containing the fitted parameters of the generalised logistic growth model and fit statistics.

Details

The function uses an optimisation algorithm to minimise the mean squared error between the observed data y and the generalised logistic model.
The model parameters are constrained within bounds specified in the θ_search_space dictionary.
The optimisation is performed using the BBO_adaptive_de_rand_1_bin_radiuslimited() algorithm (details of the optimisation algorithm).
The function also computes fit statistics, which are included in the returned GrowthModel structure.
If verbose is set to true, the function prints the fitted parameters, fit statistics, and displays a scatter plot of the observed data along with the fitted curve.

Notes

y_max may be unexpectedly high if C is fitted freely which may indicate a simple linear growth rather than a logistic growth pattern.
If y_max is desired to be equal to K, set C = 1.0 when defining the θ_search_space, i.e. In θ_search_space define "C" => Dict(:init=>1.0, :lower=>1.0, :upper=>1.0).

Example

julia> t = collect(1.0:10.0);

julia> growth_model_1 = modelgrowth(t=t, y=sort(vcat([0.0, 0.0, 1.0, 1.0], rand(6))));

julia> growth_model_2 = modelgrowth(t=t, y=rand(10));

julia> growth_model_1.fit_statistics["R²"] > growth_model_2.fit_statistics["R²"]
true

source

CropGrowth.readdelimited — Method

readdelimited(; fname::String, delim::Union{Char,String} = "	", trait_name::String = "biomass")::DataFrame

Reads a delimited file into a DataFrame and validates its structure.

Arguments

fname::String: The path to the file to be read.
delim::Union{Char,String}: The delimiter used in the file. Defaults to tab (" ").
trait_name::String: The name of the trait column to validate. Defaults to "biomass".

Returns

A DataFrame containing the data from the file with the following required columns:
- entries
- sites
- replications
- growing_periods
- time_points
- The column specified by trait_name

Throws

ArgumentError: If the file does not exist.
ArgumentError: If any of the REQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], including trait_name, are missing.

Notes

The function expects the file to contain specific REQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], defined in REQUIRED_COLUMNS, along with the column specified by trait_name.

Example

julia> df = simulate();

julia> CSV.write("test.tsv", df, delim="	");

julia> df_read = readdelimited(fname="test.tsv", delim="	", trait_name="biomass");

julia> isequal(df, df_read)
true

source

CropGrowth.simulate — Method

simulate(; 
    n_entries::Int64 = 100, 
    n_sites::Int64 = 2, 
    n_replications::Int64 = 3, 
    n_growing_periods::Int64 = 4, 
    n_time_points_per_growing_period::Int64 = 10, 
    seed::Int64 = 42
)::DataFrame

Simulates crop growth data and returns it as a DataFrame. The function generates synthetic data for multiple entries, sites, replications, growing periods, and time points, with random biomass values.

Keyword Arguments

n_entries::Int64: Number of crop entries to simulate (default: 100).
n_sites::Int64: Number of sites to simulate (default: 2).
n_replications::Int64: Number of replications per site (default: 3).
n_growing_periods::Int64: Number of growing periods per replication (default: 4).
n_time_points_per_growing_period::Int64: Number of time points per growing period (default: 10).
seed::Int64: Random seed for reproducibility (default: 42).

Returns

A DataFrame with the following columns:

entries: Crop entry identifiers.
sites: Site identifiers.
replications: Replication identifiers.
growing_periods: Growing period identifiers.
time_points: Time points within each growing period.
biomass: Simulated biomass values.

Example

julia> df = simulate(n_entries=5, n_sites=1, n_replications=1, n_growing_periods=1, n_time_points_per_growing_period=5, trait_name="some_trait_name", seed=123);

julia> size(df)
(25, 6)

julia> var(df.some_trait_name) > 0.0
true

source

CropGrowth.timetomaxperc — Method

timetomaxperc(growth_model::GrowthModel; p::Vector{Float64} = [0.5])::Vector{Float64}

Calculate the time required for a growth model to reach a proportion p of its carrying capacity K.

Arguments

growth_model::GrowthModel: An instance of the GrowthModel type containing the parameters of the growth model.
p::Vector{Float64}: A vector of proportions (default is [0.5]) representing the fraction of the carrying capacity ($y_{max}$) to compute the time for.

Returns

Vector{Float64}: A vector of times corresponding to each proportion in p.

Details

The function computes the maximum value of the growth model ($y_{max} = A + (K-A)/C^(1/v)$) based on its parameters.
For each proportion in $p$, the corresponding value of $y$ is calculated as $p * y_{max}$.
The time required to reach each $y$ is computed using the generalised logistic growth model formula, which involves logarithmic and complex arithmetic operations.
The result is the real part of the computed times.
The formula is:

\[t = -\frac{1}{B} \cdot \ln \left( \frac{\left( \left( \frac{K - A}{y - A} \right)^v - C \right)}{Q} \right)\]

Notes

The function assumes that the GrowthModel instance contains the parameters A, K, C, Q, B, and v.
The computation may involve complex numbers, but only the real part of the result is returned.

Example

julia> growth_model = modelgrowth(t=collect(1.0:10.0), y=sort(vcat([0.0, 0.0, 1.0, 1.0], rand(6))));

julia> p = [0.5, 0.95, 0.99];

julia> time_to_Kp = timetomaxperc(growth_model; p=p);

julia> ŷ = generalisedlogistic(growth_model, t=time_to_Kp);

julia> all( abs.(ŷ .- (p .* growth_model.y_max)) .< 1e-5 )
true

source

CropGrowth.writedelimited — Method

writedelimited(df::DataFrame; fname::String, delim::Union{Char,String} = "	", overwrite::Bool = false)::Nothing

Writes the contents of a DataFrame to a delimited text file.

Arguments

df::DataFrame: The data frame to be written to the file. Must contain the required columns:
- entries
- sites
- replications
- growing_periods
- time_points
- 1 additional column representing the trait data
fname::String: The name of the output file, including its path if necessary.
delim::Union{Char,String}: The delimiter to use in the output file. Defaults to a tab character (" ").
overwrite::Bool: Whether to overwrite the file if it already exists. Defaults to false.

Returns

Nothing: This function does not return a value.

Example

julia> df = simulate();

julia> fname = writedelimited(df, fname="test.tsv", delim="	", overwrite=true);

julia> isfile(fname)
true

julia> df_read = readdelimited(fname="test.tsv", delim="	", trait_name="biomass");

julia> isequal(df, df_read)
true

source