CropGrowth.jl
CropGrowth.jl is a Julia package for modelling crop growth curves using the generalised logistic function:
Quickstart
using Pkg
Pkg.add(CropGrowth)
using CropGrowth
df = simulate()
df_out = fitgrowthmodels(df, verbose=true)Model
\[y(t) = {A + {{K-A} \over {C + (Qe^{-Bt})^{1/v}}}}\]
where:
- $y(t)$: biomass at time $t$
- $A$: lower asymptote (initial or minimum biomass)
- $K$: positively affects the upper asymptote. This is the final or maximum biomass if:
- $C = 1.00$, since:
- $y_{max} = A + (K-A)/C^{1/v}$, then
- $y_{max} = A + K - A$, therefore:
- $y_{max} = K$
- $C$: negatively affects the final or maximum biomass
- $Q$: negatively affects initial or minimum biomass
- $e$: Euler's number (~2.71828)
- $B$: growth rate
- $v$: asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)
To solve for $t$ at specific $y$:
\[t(y) = -{{1} \over {B}} \log {\left( { {{1} \over {Q}} \left( \left( {K - A} \over {y - A} \right)^v - C \right) } \right) }\]
Installation
Please install the latest version of Julia (we aim to update CropGrowth.jl for the latest Julia release):
curl -fsSL https://install.julialang.org | sh
type -a juliaInstall the CropGrowth.jl library in Julia:
- Stable version:
using Pkg
Pkg.add(CropGrowth)- Development version:
using Pkg
Pkg.add("https://github.com/jeffersonfparil/CropGrowth.jl")Test the installation in Julia:
using CropGrowth
df = simulate()
df_out = fitgrowthmodels(df, verbose=true)Implementation details
The main function, fitgrowthmodels fits generalized logistic growth models to the data provided in the input DataFrame and returns a DataFrame containing the fitted parameters, fit statistics, and time to reach specified fraction of the final value. The complete function signature is:
fitgrowthmodels(
df::DataFrame;
A = Dict(
:init=>minimum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
:lower=>0.0,
:upper=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
),
K = Dict(
:init=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
:lower=>0.0,
:upper=>2*maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
),
C = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
Q = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
B = Dict(:init=>1.0, :lower=>0.0, :upper=>10.0),
v = Dict(:init=>1.0, :lower=>1e-5, :upper=>10.0),
min_t::Int64 = 3,
frac_of_final::Vector{Float64} = [0.5, 0.9],
fit_statistic::String = "R²",
maxiters::Int64 = 10_000,
seed::Int64 = 42,
show_plots::Bool = false,
verbose::Bool = false,
)::Tuple{DataFrame, Vector{String}}Arguments
df::DataFrame: Input data containing the required columns specified inREQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"]and at least one trait column.A::Dict: Search space for the parameterA(lower asymptote). Contains:init,:lower, and:upperkeys. Defaults to the minimum and maximum of the trait column withinit=minimum.K::Dict: Search space for the parameterK(upper asymptote). Contains:init,:lower, and:upperkeys. Defaults to the minimum and 2×maximum of the trait column withinit=maximum.C::Dict: Search space for the parameterC. Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=1.0,upper=1.0.Q::Dict: Search space for the parameterQ. Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=1.0,upper=1.0.B::Dict: Search space for the parameterB(growth rate). Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=0.0,upper=10.0.v::Dict: Search space for the parameterv(asymmetry parameter). Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=1e-5,upper=10.0.min_t::Int64: Minimum number of time points required to fit the growth model for a specific combination of entry, site, replication, and growing period. Defaults to3.frac_of_final::Vector{Float64}: Percentages of the final value for which the time to reach these fraction will be calculated. Defaults to[0.5, 0.9].fit_statistic::String: The fit statistic to be used for evaluating the model. Must be one of["R²", "RMSE", "MSE", "MAE", "ρ"]. Defaults to"R²".maxiters::Int64: Maximum number of iterations allowed for the optimization process. Defaults to10_000.seed::Int64: Random seed for reproducibility. Defaults to42.show_plots::Bool: Whether to show fitted growth curve plots. Defaults tofalse.verbose::Bool: Whether to display progress and additional information during the fitting process. Defaults tofalse.
Returns
Tuple{DataFrame, Vector{String}}:
- The first element is a
DataFramecontaining the fitted parameters (A,K,C,Q,B,v), fit statistics, value of the growth models at $t=0$ (y_t0), maximum value of the growth model (y_max), and time to reach specified fraction of the final value for each combination of entry, site, replication, and growing period. - The second element is a
Vector{String}containing the combinations that were skipped due to insufficient data points.
Notes
- The input
DataFramemust contain the required columns specified in the global variableREQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], as well as at least one additional trait column. - If the
DataFramecontains more than one trait column, only the first trait column will be used. - Combinations with fewer than
min_ttime points will be skipped. - The function uses a progress bar to indicate the fitting process if
verbose=true. - The optimisation is performed using the
BBO_adaptive_de_rand_1_bin_radiuslimited()algorithm (details of the optimisation algorithm). - The optimisation algorithm minimises the mean squared error between the observed data
yand the generalised logistic model.
Examples
using CropGrowth, StatsBase
df = simulate(n_entries=5, n_sites=1, n_replications=1, n_growing_periods=1, n_time_points_per_growing_period=5, seed=123)
df_out_0, skipped_combinations_0 = fitgrowthmodels(df, show_plots=true)
df_out_C, skipped_combinations_C = fitgrowthmodels(df, C=Dict(:init=>1.0, :lower=>0.0, :upper=>100.0), show_plots=true)
df_out_Q, skipped_combinations_Q = fitgrowthmodels(df, Q=Dict(:init=>1.0, :lower=>0.0, :upper=>100.0), show_plots=true)
df_out_CQ, skipped_combinations_CQ = fitgrowthmodels(
df,
C=Dict(:init=>1.0, :lower=>0.0, :upper=>5.0),
Q=Dict(:init=>1.0, :lower=>0.0, :upper=>5.0),
show_plots=true
)
mean(df_out_0.R²)
mean(df_out_C.R²)
mean(df_out_Q.R²)
mean(df_out_CQ.R²)Comparison with nplr
TLDR: CropGrowth.jl fits a more general logistic function than nplr.
The R package nplr: n-parameter logistic regressions fits a variant of the logistic model. It fits 5 parameters instead of the 6 parameters in CropGrowth.jl's formulation above. The model is defined as:
\[y(x) = { B + { {T - B} \over {\left [ 1 + 10^{b(x_{mid} - x)} \right ]^s} } }\]
where:
- $B$: lower asymptote
- $T$: upper asymptote
- $b$: Hill slope (related to
Bor growth rate) - $x_{mid}$: x-coordinate at the inflexion point,
- $s$ asymetry coefficient (related to
v)
Fitting log10(dose)-response curve with nplr:
library(nplr)
library(txtplot)
path <- system.file("extdata", "pc3.txt", package="nplr")
pc3 <- read.delim(path)
txtplot::txtplot(log10(pc3$CONC), pc3$GIPROP) # nplr default does log10(x)
np1 <- nplr::nplr(x=pc3$CONC, y=pc3$GIPROP, useLog=TRUE)
y = pc3$GIPROP
y_hat = np1@yFit
print("Fitted parameters:")
print(paste0("K = ", np1@pars$bottom))
print(paste0("A = ", np1@pars$top))
print(paste0("xmid = ", np1@pars$xmid))
print(paste0("B_ish = ", np1@pars$scal))
print(paste0("v_ish = ", np1@pars$scal))
print("Fit statistics:")
print(paste0("mae = ", mean(abs(y - y_hat))))
print(paste0("mse = ", mean((y - y_hat)^2)))
print(paste0("rmse = ", sqrt(mean((y - y_hat)^2))))
print(paste0("cor = ", cor(y, y_hat)))
print(paste0("r2 = ", 1 - (var((y - y_hat)) / var(y))))
# [1] "Fitted parameters:"
# [1] "K = 0.000182956669824204"
# [1] "A = 0.996490622312222"
# [1] "xmid = -6.18254538738321"
# [1] "B_ish = -1.42800873142546"
# [1] "v_ish = -1.42800873142546"
# [1] "Fit statistics:"
# [1] "mae = 0.0186073781104083"
# [1] "mse = 0.000651821232462819"
# [1] "rmse = 0.0255307898910868"
# [1] "cor = 0.997250250732751"
# [1] "r2 = 0.994444954867448"CropGrowth.jl achieves very similar but more generalised fit:
using DataFrames, CSV, CropGrowth
df = CSV.read(expanduser("~/.conda/envs/GenomicBreeding/lib/R/library/nplr/extdata/pc3.txt"), DataFrame)
growth_model = modelgrowth(
t = log10.(df.CONC),
y = df.GIPROP,
maxiters = 100_000,
seed = 42,
verbose = true,
)
# Fitted parameters:
# A = 0.991559343580949
# K = 0.0380737475678395
# C = 1.0
# Q = 1.5219825671137292e-5
# B = 1.5916146320430173
# v = 0.15864685998073122
# y_t0 = 0.03816521536021922
# y_max = 0.038073747568793115
# Fit statistics:
# mse = 0.0006662392521457019
# rmse = 0.0258116108010659
# R² = 0.9943191952067459
# ρ = 0.9971555521616207
# mae = 0.020947684621310015API
CropGrowth.GrowthModelCropGrowth.fitgrowthmodelsCropGrowth.fitstatisticsCropGrowth.generalisedlogisticCropGrowth.generalisedlogisticCropGrowth.modelgrowthCropGrowth.readdelimitedCropGrowth.simulateCropGrowth.timetomaxpercCropGrowth.writedelimited
CropGrowth.GrowthModel — Type
mutable struct GrowthModel
A::Float64
K::Float64
C::Float64
Q::Float64
B::Float64
v::Float64
y_t0::Float64
y_max::Float64
fit_statistics::Dict{String,Float64}Holds parameters for the generalised logistic growth model:
\[y(t) = A + \frac{K-A}{C + (Qe^{-Bt})^{1/v}}\]
where:
- $y(t)$: biomass at time $t$ (not part of the struct)
- $A$: lower asymptote (initial or minimum biomass)
- $K$: positively affects the upper asymptote (can be the final or maximum biomass if $C = 1.00$, since $y_{max} = A + (K-A)/C^(1/v)$, then $y_{max} = A + K - A$, therefore: $y_{max} = K$)
- $C$: negatively affects the final or maximum biomass
- $Q$: negatively affects initial or minimum biomass
- $e$: Euler's number (~2.71828)
- $B$: growth rate
- $v$: asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)
Additional information are provided in the struct:
y_t0: value of the growth model at time $t = 0$y_max: maximum value of the growth model ($y_{max} = A + (K-A)/C^(1/v)$)fit_statistics: a dictionary containing fit statistics such as R², RMSE, MSE, MAE, and Pearson's correlation coefficient (ρ)
Constructor
GrowthModel(;
A::Float64,
K::Float64,
C::Float64,
Q::Float64,
B::Float64,
v::Float64,
fit_statistics::Dict{String,Float64} = Dict(""=>NaN),
ϵ::Float64 = 1e-12,
)Notes:
- The struct includes a constructor that automatically calculates
y_maxbased on the provided parameters. - The
fit_statisticsfield is optional and defaults to a dictionary with a single entry of an empty string key andNaNvalue. - A small epsilon value (
ϵ) is added to the calculation ofy_maxto prevent division by zero errors.
Example
julia> growth_model = GrowthModel(A=0.0, K=10.0, C=1.0, Q=1.0, B=0.75, v=0.1);
julia> (growth_model.y_max - growth_model.K) < 1e-9
true
julia> (growth_model.y_max - 10.0) < 1e-9
trueCropGrowth.fitgrowthmodels — Method
fitgrowthmodels(
df::DataFrame;
A = Dict(
:init=>minimum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
:lower=>0.0,
:upper=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
),
K = Dict(
:init=>maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
:lower=>0.0,
:upper=>2*maximum(select(df, Not(REQUIRED_COLUMNS))[:, 1]),
),
C = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
Q = Dict(:init=>1.0, :lower=>1.0, :upper=>1.0),
B = Dict(:init=>1.0, :lower=>0.0, :upper=>10.0),
v = Dict(:init=>1.0, :lower=>1e-5, :upper=>10.0),
min_t::Int64 = 3,
frac_of_final::Vector{Float64} = [0.5, 0.9],
fit_statistic::String = "R²",
maxiters::Int64 = 10_000,
seed::Int64 = 42,
show_plots::Bool = false,
verbose::Bool = false,
)::Tuple{DataFrame, Vector{String}}Fits generalised logistic growth models to the data provided in the input DataFrame and returns a DataFrame containing the fitted parameters, fit statistics, and time to reach specified fractions of the final value.
Arguments
df::DataFrame: Input data containing the required columns specified inREQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"]and at least one trait column.A::Dict: Search space for the parameterA(lower asymptote). Contains:init,:lower, and:upperkeys. Defaults to the minimum and maximum of the trait column withinit=minimum.K::Dict: Search space for the parameterK(upper asymptote). Contains:init,:lower, and:upperkeys. Defaults to the minimum and 2×maximum of the trait column withinit=maximum.C::Dict: Search space for the parameterC. Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=1.0,upper=1.0.Q::Dict: Search space for the parameterQ. Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=1.0,upper=1.0.B::Dict: Search space for the parameterB(growth rate). Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=0.0,upper=10.0.v::Dict: Search space for the parameterv(asymmetry parameter). Contains:init,:lower, and:upperkeys. Defaults toinit=1.0,lower=1e-5,upper=10.0.min_t::Int64: Minimum number of time points required to fit the growth model for a specific combination of entry, site, replication, and growing period. Defaults to3.frac_of_final::Vector{Float64}: Percentages of the final value for which the time to reach these fractions will be calculated. Defaults to[0.5, 0.9].fit_statistic::String: The fit statistic to be used for evaluating the model. Must be one of["R²", "RMSE", "MSE", "MAE", "ρ"]. Defaults to"R²".maxiters::Int64: Maximum number of iterations allowed for the optimisation process. Defaults to10_000.seed::Int64: Random seed for reproducibility. Defaults to42.show_plots::Bool: Whether to show fitted growth curve plots. Defaults tofalse.verbose::Bool: Whether to display progress and additional information during the fitting process. Defaults tofalse.
Returns
Tuple{DataFrame, Vector{String}}:- The first element is a
DataFramecontaining the fitted parameters (A,K,C,Q,B,v), fit statistics, value of the growth models at $t=0$ (y_t0), maximum value of the growth model (y_max), time to reach specified fractions of the final value (time_to_*p), and number of time-points (number_of_time_points) for each combination of entry, site, replication, and growing period. - The second element is a
Vector{String}containing the combinations that were skipped due to insufficient data points.
- The first element is a
Notes
- The input
DataFramemust contain the required columns specified in the global variableREQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], as well as at least one additional trait column. - If the
DataFramecontains more than one trait column, only the first trait column will be used. - Combinations with fewer than
min_ttime points will be skipped. - The function uses a progress bar to indicate the fitting process if
verbose=true. - The optimisation is performed using the
BBO_adaptive_de_rand_1_bin_radiuslimited()algorithm (details of the optimisation algorithm). - The optimisation algorithm minimises the mean squared error between the observed data
yand the generalised logistic model. - Parallel curve fitting:
- The per-combination fits are executed in parallel using
Threads.@threadsover the prepared list of filter combinations. To take advantage of this, start Julia with multiple threads (for examplejulia -t 4or setJULIA_NUM_THREADS). - Access to the shared
fitted_parametersdictionary is protected with aReentrantLockand@lockto ensure thread-safety when appending results. This avoids race conditions but introduces some serialization for the final writes.
- The per-combination fits are executed in parallel using
Details
Fits a generalised logistic growth model to the data using the following equation:
\[y(t) = A + \frac{K-A}{C + (Qe^{-Bt})^{1/v}}\]
where:
- $y(t)$: biomass at time $t$
- $A$: lower asymptote (initial or minimum biomass)
- $K$: positively affects the upper asymptote (can be the final or maximum biomass if $C = 1.00$, since $y_{max} = A + (K-A)/C^(1/v)$, then $y_{max} = A + K - A$, therefore: $y_{max} = K$)
- $C$: negatively affects the final or maximum biomass
- $Q$: negatively affects initial or minimum biomass
- $e$: Euler's number (~2.71828)
- $B$: growth rate
- $v$: asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)
Additional information are provided in the output DataFrame:
y_t0is the value of the growth model at time $t = 0$y_maxis the maximum value of the growth model ($y_{max} = A + (K-A)/C^(1/v)$)- One of the following fit statistic such as R² (default), RMSE, MSE, MAE, and Pearson's correlation coefficient (ρ)
Example
julia> df = simulate(n_entries=5, seed=42);
julia> df_out, skipped_combinations = fitgrowthmodels(df);
julia> length(unique(df.entries)) == length(unique(df_out.entries))
true
julia> length(unique(df.sites)) == length(unique(df_out.sites))
true
julia> length(unique(df.replications)) == length(unique(df_out.replications))
true
julia> length(unique(df.growing_periods)) == length(unique(df_out.growing_periods))
true
julia> all(x -> x ∈ names(df_out), ["A", "K", "C", "Q", "B", "v", "y_max", "R²", "time_to_50p", "time_to_90p"])
true
julia> (var(df_out.A) > 0.0) && (var(df_out.K) > 0.0) && (var(df_out.C) == 0.0) && (var(df_out.Q) == 0.0) && (var(df_out.B) > 0.0) && (var(df_out.v) > 0.0)
true
julia> (var(df_out.y_max) > 0.0) && (var(df_out."R²") > 0.0) && (var(df_out.time_to_50p) > 0.0) && (var(df_out.time_to_90p) > 0.0)
trueCropGrowth.fitstatistics — Method
fitstatistics(; y::Vector{Float64}, ŷ::Vector{Float64})::Dict{String, Float64}Compute various fit statistics for a given set of observed and predicted values.
Arguments
y::Vector{Float64}: Observed data points.ŷ::Vector{Float64}: Predicted data points.
Returns
A dictionary containing the following fit statistics:
"R²": Coefficient of determination, a measure of how well the predicted values explain the variance in the observed data."rmse": Root mean squared error, a measure of the average magnitude of the residuals."mse": Mean squared error, the average of the squared residuals."mae": Mean absolute error, the average of the absolute residuals."ρ": Pearson correlation coefficient between the observed and predicted values.
Notes
- Ensure that the
yandŷvectors are of the same length and correspond to the same observations. - The function assumes that both
yandŷare non-empty vectors.
Example
julia> y1::Vector{Float64} = rand(10); y2::Vector{Float64} = rand(10);
julia> stats0 = fitstatistics(y=y1, ŷ=y1);
julia> stats1 = fitstatistics(y=y1, ŷ=y2);
julia> stats0["R²"] > stats1["R²"]
true
julia> stats0["rmse"] < stats1["rmse"]
true
julia> stats0["mse"] < stats1["mse"]
true
julia> stats0["mae"] < stats1["mae"]
true
julia> stats0["ρ"] > stats1["ρ"]
trueCropGrowth.generalisedlogistic — Method
generalisedlogistic(growth_model::GrowthModel; t::Vector{Float64})::Vector{Float64}Computes the generalised logistic function for a given vector of time points t using the parameters defined in the GrowthModel.
Arguments
growth_model::GrowthModel: A struct containing the parameters for the generalised logistic function.t::Vector{Float64}: A vector of time points at which to evaluate the function.
Returns
Vector{Float64}: A vector of values representing the generalised logistic function evaluated at each time point int.
Returns
Vector{Float64}: A vector of values representing the generalised logistic function evaluated at each time point int.
Example
julia> t = collect(0.0:1.0:10.0);
julia> growth_model = GrowthModel(A=0.0, K=10.0, C=1.0, Q=1.0, B=0.75, v=0.1);
julia> y = generalisedlogistic(growth_model, t=t);
julia> (y[1] == minimum(y)) && (y[end] == maximum(y))
trueCropGrowth.generalisedlogistic — Method
generalisedlogistic(t::Vector{Float64}; A=0.0, K=1.0, C=1.0, Q=1.0, B=1.0, v=1.0)::Vector{Float64}Computes the generalised logistic function for a given vector of time points t.
Arguments
t::Vector{Float64}: A vector of time points at which to evaluate the function.
Keyword Arguments
A::Float64(default:0.0): lower asymptote (initial or minimum biomass)K::Float64(default:1.0): upper asymptote (can be the final or maximum biomass if $C = 1.00$, since $y_{max} = A + (K-A)/C^(1/v)$, then $y_{max} = A + K - A$, therefore: $y_{max} = K$)C::Float64(default:1.0): negatively affects the final or maximum biomassQ::Float64(default:1.0): negatively affects initial or minimum biomassB::Float64(default:1.0): growth ratev::Float64(default:1.0): asymmetry parameter ($v ≥ 0$; small values: fast growth early; large values: fast growth later)
Returns
Vector{Float64}: A vector of values representing the generalised logistic function evaluated at each time point int.
Example
julia> t = collect(0.0:1.0:10.0);
julia> y = generalisedlogistic(t; A=0.0, K=10.0, C=1.0, Q=1.0, B=0.75, v=0.1);
julia> (y[1] == minimum(y)) && (y[end] == maximum(y))
trueCropGrowth.modelgrowth — Method
modelgrowth(;
y::Vector{Float64},
t::Vector{Float64},
θ_search_space::Dict{String, Dict{Symbol, Float64}},
maxiters::Int64=10_000,
seed::Int64=42,
verbose::Bool=false
)::GrowthModelFit a generalised logistic growth model to the given data y over time t.
Arguments
y::Vector{Float64}: The observed data points representing the growth values.t::Vector{Float64}: The corresponding time points for the observed data.θ_search_space::Dict{String, Dict{Symbol, Float64}}: A dictionary defining the search space for each parameter of the generalised logistic model. Each key corresponds to a parameter name ("A","K","C","Q","B","v"), and the value is another dictionary with keys:init,:lower, and:upperspecifying the initial value, lower bound, and upper bound for that parameter.maxiters::Int64=10_000: The maximum number of iterations for the optimisation algorithm. Defaults to 10,000.seed::Int64=42: The random seed for reproducibility. Defaults to 42.verbose::Bool=false: Iftrue, prints the fitted parameters, fit statistics, and displays a plot of the fitted model. Defaults tofalse.
Returns
GrowthModel: A structure containing the fitted parameters of the generalised logistic growth model and fit statistics.
Details
- The function uses an optimisation algorithm to minimise the mean squared error between the observed data
yand the generalised logistic model. - The model parameters are constrained within bounds specified in the
θ_search_spacedictionary. - The optimisation is performed using the
BBO_adaptive_de_rand_1_bin_radiuslimited()algorithm (details of the optimisation algorithm). - The function also computes fit statistics, which are included in the returned
GrowthModelstructure. - If
verboseis set totrue, the function prints the fitted parameters, fit statistics, and displays a scatter plot of the observed data along with the fitted curve.
Notes
y_maxmay be unexpectedly high ifCis fitted freely which may indicate a simple linear growth rather than a logistic growth pattern.- If
y_maxis desired to be equal toK, setC = 1.0when defining theθ_search_space, i.e. Inθ_search_spacedefine"C" => Dict(:init=>1.0, :lower=>1.0, :upper=>1.0).
Example
julia> t = collect(1.0:10.0);
julia> growth_model_1 = modelgrowth(t=t, y=sort(vcat([0.0, 0.0, 1.0, 1.0], rand(6))));
julia> growth_model_2 = modelgrowth(t=t, y=rand(10));
julia> growth_model_1.fit_statistics["R²"] > growth_model_2.fit_statistics["R²"]
trueCropGrowth.readdelimited — Method
readdelimited(; fname::String, delim::Union{Char,String} = " ", trait_name::String = "biomass")::DataFrameReads a delimited file into a DataFrame and validates its structure.
Arguments
fname::String: The path to the file to be read.delim::Union{Char,String}: The delimiter used in the file. Defaults to tab (" ").trait_name::String: The name of the trait column to validate. Defaults to"biomass".
Returns
- A
DataFramecontaining the data from the file with the following required columns:entriessitesreplicationsgrowing_periodstime_points- The column specified by
trait_name
Throws
ArgumentError: If the file does not exist.ArgumentError: If any of theREQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], includingtrait_name, are missing.
Notes
- The function expects the file to contain specific
REQUIRED_COLUMNS = ["entries", "sites", "replications", "growing_periods", "time_points"], defined inREQUIRED_COLUMNS, along with the column specified bytrait_name.
Example
julia> df = simulate();
julia> CSV.write("test.tsv", df, delim=" ");
julia> df_read = readdelimited(fname="test.tsv", delim=" ", trait_name="biomass");
julia> isequal(df, df_read)
trueCropGrowth.simulate — Method
simulate(;
n_entries::Int64 = 100,
n_sites::Int64 = 2,
n_replications::Int64 = 3,
n_growing_periods::Int64 = 4,
n_time_points_per_growing_period::Int64 = 10,
seed::Int64 = 42
)::DataFrameSimulates crop growth data and returns it as a DataFrame. The function generates synthetic data for multiple entries, sites, replications, growing periods, and time points, with random biomass values.
Keyword Arguments
n_entries::Int64: Number of crop entries to simulate (default: 100).n_sites::Int64: Number of sites to simulate (default: 2).n_replications::Int64: Number of replications per site (default: 3).n_growing_periods::Int64: Number of growing periods per replication (default: 4).n_time_points_per_growing_period::Int64: Number of time points per growing period (default: 10).seed::Int64: Random seed for reproducibility (default: 42).
Returns
A DataFrame with the following columns:
entries: Crop entry identifiers.sites: Site identifiers.replications: Replication identifiers.growing_periods: Growing period identifiers.time_points: Time points within each growing period.biomass: Simulated biomass values.
Example
julia> df = simulate(n_entries=5, n_sites=1, n_replications=1, n_growing_periods=1, n_time_points_per_growing_period=5, trait_name="some_trait_name", seed=123);
julia> size(df)
(25, 6)
julia> var(df.some_trait_name) > 0.0
trueCropGrowth.timetomaxperc — Method
timetomaxperc(growth_model::GrowthModel; p::Vector{Float64} = [0.5])::Vector{Float64}Calculate the time required for a growth model to reach a proportion p of its carrying capacity K.
Arguments
growth_model::GrowthModel: An instance of theGrowthModeltype containing the parameters of the growth model.p::Vector{Float64}: A vector of proportions (default is[0.5]) representing the fraction of the carrying capacity ($y_{max}$) to compute the time for.
Returns
Vector{Float64}: A vector of times corresponding to each proportion inp.
Details
- The function computes the maximum value of the growth model ($y_{max} = A + (K-A)/C^(1/v)$) based on its parameters.
- For each proportion in $p$, the corresponding value of $y$ is calculated as $p * y_{max}$.
- The time required to reach each $y$ is computed using the generalised logistic growth model formula, which involves logarithmic and complex arithmetic operations.
- The result is the real part of the computed times.
- The formula is:
\[t = -\frac{1}{B} \cdot \ln \left( \frac{\left( \left( \frac{K - A}{y - A} \right)^v - C \right)}{Q} \right)\]
Notes
- The function assumes that the
GrowthModelinstance contains the parametersA,K,C,Q,B, andv. - The computation may involve complex numbers, but only the real part of the result is returned.
Example
julia> growth_model = modelgrowth(t=collect(1.0:10.0), y=sort(vcat([0.0, 0.0, 1.0, 1.0], rand(6))));
julia> p = [0.5, 0.95, 0.99];
julia> time_to_Kp = timetomaxperc(growth_model; p=p);
julia> ŷ = generalisedlogistic(growth_model, t=time_to_Kp);
julia> all( abs.(ŷ .- (p .* growth_model.y_max)) .< 1e-5 )
trueCropGrowth.writedelimited — Method
writedelimited(df::DataFrame; fname::String, delim::Union{Char,String} = " ", overwrite::Bool = false)::NothingWrites the contents of a DataFrame to a delimited text file.
Arguments
df::DataFrame: The data frame to be written to the file. Must contain the required columns:entriessitesreplicationsgrowing_periodstime_points- 1 additional column representing the trait data
fname::String: The name of the output file, including its path if necessary.delim::Union{Char,String}: The delimiter to use in the output file. Defaults to a tab character (" ").overwrite::Bool: Whether to overwrite the file if it already exists. Defaults tofalse.
Returns
Nothing: This function does not return a value.
Example
julia> df = simulate();
julia> fname = writedelimited(df, fname="test.tsv", delim=" ", overwrite=true);
julia> isfile(fname)
true
julia> df_read = readdelimited(fname="test.tsv", delim=" ", trait_name="biomass");
julia> isequal(df, df_read)
true