This page provide the R code and supplemental information regarding the publication Rapid tree carbon recovery in Amazon managed forests published in Current Biology on 21 September 2015 editor’s link
The code below enables to replicate the analysis, produce all graphs and summary tables. A supplementary analysis is also carried out and provided at the end.
Summary data are freely available here Raw data are available upon request: contact@tmfo.org
Ten (10) sites spread across the Amazon basin and the Guiana Shield were selected based on the following criteria: (i) located in tropical forests with a total area inventoried ≥ 1 ha; (ii) mean annual rainfall ≥ 1000 mm (Fig. S1); (iii) consistent and detailed information about logging treatments (e.g. number of stems harvested and correspondent biomass removal) and logging impacts (e.g. logging damages assessment); (iv) at least one pre-logging and (v) at least two post-logging censuses. As sites were generally established by different organizations, there is no standardized protocol for data collection among sites, but all sites comply with generally agreed standards [1]. A general description of the sites can be found in [2]. In all plots, trees ≥ 20 cm DBH (diameter at breast height) had their girth measured at 130 cm or above buttresses/deformations, and were tagged and identified to the lowest taxonomical level.
To avoid bias due to discrepancies in data quality (e.g. difference in botanical identification or tree species wood density information), a standardized protocol was applied to each site. At first, botanical identification was checked to match the Global Wood Density Database (GWDD) classification [3]. Tree species present in GWDD were assigned correspondent dry wood density (WD, gr.cm-3). When only the genus was present, genus-average WD was assigned and for unidentified species and species not present in the GWDD, plot-average WD was attributed. In the absence of tree height measurements, tree above-ground biomass (AGB) was estimated using the generic allometric model developed by Chave et al. [4] and including WD, DBH and a synthetic climatic index (E).
Above-ground carbon density (ACS) was obtained by multiplying tree biomass by 0.47 [5]. ACS stock of each plot was further computed as the sum of ACS of live trees DBH ≥ 20 cm divided by the plot surface and expressed in Mg C ha-1.
The same definition of logging intensity was applied at all sites. Due to varying interval length (1 to 4 years) between pre- and post-logging censuses and application of silvicultural treatments (i.e. poisoning, girdling, understorey clearing) at three sites (Paracou, Tapajos and la Chonta), we estimated the minimum carbon stock (ACSmin, Fig. S2) attained at last within 4 years after logging and computed the difference with initial carbon stock (ACS0). This initial ACS drop off, referred to as ACSloss, is due to both timber harvest and mortality of damaged trees (that can affect up to 46% of remaining trees[6]). As residual mortality peaks within the first years preceding logging [7, 8], this approach allows most of logging-induced mortality to be accounted.
We found no evidence of strong deviation from linearity of ACS recovery (Figure S3); therefore, we estimated the annualized ACS recovery rates (Mg C ha-1 yr-1) per plot using linear models among all post-logging censuses spreading between tmin and tfinal. Recovery time (trec in years) refers to the estimated time needed to recover initial ACS stock, and is given by the product of initial ACS loss divided by average recovery rate.
Several explanatory variables were calculated at each site: (1) average pre-logging ACS stock (ACS0 in Mg C ha-1); (2) Basal Area-weighted wood density (or community wood density, WDBA in g.cm-3); (3) stem density (ha-1); (4) average annual rainfall (mm yr-1) that arose from local weather stations; (5) rainfall seasonality (annual standard deviation) were extracted at each site using WorldClim data (Hijmans et al., 2005) using highest resolution (30 arc-seconds or ~1 km). Due to lack of information at all sites, soil properties were extracted from the Harmonized World Soil raster at a resolution of 30 arc-seconds (9). Information on top soil (0-30 cm) quality was extracted at each site: texture, drainage, available water content (range), clay, silt and sand content (%), cation-exchange capacity (CEC, cmol/kg) and bulk density (kg/dm3).
To test for possible circularity between the synthetic climatic index (E) used to compute ACS and the climatic explanatory variables, all analysis were recomputed with another generic allometric model [9], based on local WD and DBH only. All pattern and variables significance remained unchanged (data not shown).
To ensure that observed biomass recovery was mainly related to logging and avoid bias due to stochastic natural mortality (e.g. the 2005 drought and fires), we selected only plots (79 out of 90) with positive recovery rates (e.g. that gain biomass/carbon over the monitored period), as a detailed checking revealed that those 11 plots suffered from wildfires and droughts. As our sample plots and sites vary in both total area and length of time monitored for, the contribution of each site was weighted by the monitoring effort (number of censuses x plot size), as recommended by [14]. Hence, sites with longer and larger monitoring (more prone to capture and depict forest recovery) are given more weight. To avoid artificial inflation of the variance of random effects, the sum of weights was set to 1. Table S1 provides information on initial and final ACS, ACS loss, recovery rate and recovery time for each plot (N=90).
Our main point was to understand generic drivers that led recovery time and recovery rate among all sites. We developed a linear mixed model (LMM, package lme4 [15]) in which recovery time and rate were tested over the different biometric response variables defined above. To account for the site effect, we introduced a random site effect. Indeed, most sites are constituted of several contiguous plots in which silviculture treatments (e.g. logging, girdling or understorey clearing) of varying intensities were applied. Such experimental design ensures a relative homogeneity in environmental conditions and forest structure, but might also induce pseudo-replication. Pseudo-replication occurs when multiple samples from a single treatment unit are analyzed, as if they were independent replicates and embed to distinguished the effect due to treatment from other sources of variation [16]. To avoid this bias, a “site-effect” was introduced in the LMM and pre-logging forest structures were accounted for as explanatory variables in the analysis.
The best models are found through conducting an exhaustive screening and ranking using Bayesian Information Criterion (BIC) (package lmerTest [17]). Instead of picking a single “best” model, we averaged the fits of a number of “good” models (model averaging) based on Bayesian Information Criterion (BIC) weights, thereby stressing prediction over precision [18]. Very good fits were effectively found at each site (Figure S3). To reduce residual heteroscedasticity, recovery time along with two explanatory variables (ACS logged and number of trees harvested) was log-transformed.
All analyses were carried out with R language and environment [19].
# Creating the table summarizing predictors per plot
# Table to store results
RES <- data.frame(site=rep(NA,98),plot=NA,trait=NA,plot.size=NA,AGB.t0=NA,AGB.t1=NA,AGB.min=NA,AGB.final=NA,Tmin=NA,Tfinal=NA,N.census=NA,rain=NA,wsgBA.t0=NA,treeHarvested.t1=NA,volumeHarvested.t1=NA,AGB.logged.t1=NA,AGB.logged.t1_per=NA,AGB.lost=NA,AGB.lost_per=NA,AGB.lost.residual=NA,AGB.lost.residual_per=NA,rec.rate=NA,rec.time=NA) # table of results
a=0 # value of the counter (required below)
SITE <- c("Ecosilva", "La Chonta","Paragominas", "Tabocal", "Tortue", "Itacoatiara","Paracou", "Tapajos", "Jari", "ChicoBocao")
# Loop over site and plot
for (i in 1:length(unique(SITE))) {
temp <- TmFO[TmFO$site==SITE[i] & TmFO$time.since.logging>0,] # census after logging; one keep only the plot that were not affected by fire and are not control
temp1 <- TmFO[TmFO$site==SITE[i] & TmFO$time.since.logging<=0,] # census before logging
PLOT2 <- levels(factor(temp$plot.code))
for (j in 1:length(PLOT2)) {
a = a+1
temp2 <- temp[temp$plot.code==PLOT2[j],]
temp2.1 <- temp1[temp1$plot.code==PLOT2[j],]
AGBaft <- tapply(temp2$AGBt.20,temp2$time.since.logging,mean) # biomass per census AFTER logging
T1 <- as.numeric(names(AGBaft[1]))
T2 <- as.numeric(names(AGBaft[2]))
AGBmin <- AGBaft[which(as.numeric(names(AGBaft))-T1 <= 4)] # AGB within a 4 years window after logging
AGBmin2 <- AGBaft[which(as.numeric(names(AGBaft))-T1 < 7)] # AGB within a 7 years window after logging
RES[a,1] <- unique(as.character(temp2$site)) # return the site
RES[a,2] <- PLOT2[j] # return the plot ID
RES[a,3] <- unique(as.character(temp2$trait)) # return the treatment
RES[a,4] <- unique(temp2$plot.size) # return the plot size (ha)
AGBini <- mean(temp2.1$AGBt.20)
RES[a,5] <- AGBini # return the initial AGB (t0)
RES[a,6] <- AGBaft[1] # AGB remaining directly after logging
RES[a,7] <- min(AGBmin) # minimum AGB observed
RES[a,8] <- AGBaft[length(AGBaft)] # final AGB value
Tmin <- as.numeric(names(AGBmin[which(AGBmin==min(AGBmin))]))
Tmin2 <- as.numeric(names(AGBaft[which(AGBaft==min(AGBaft))]))
RES[a,9] <- Tmin # time min AGB value after logging
Tfinal <- as.numeric(names(AGBaft)[length(AGBaft)])
RES[a,10] <- Tfinal # time final census
RES[a,11] <- length(AGBaft) # number of censuses (to define plots weight)
RES[a,12] <- unique(as.numeric(temp2$rainfall)) # return the rainfall
RES[a,13] <- mean(temp2.1$wsgBA) # initial wsg before logging
RES[a,14] <- mean(temp2$trees.harvested.ha) # density of trees logged
RES[a,15] <- mean(temp2$volum.harvested,na.rm=T) # density of trees logged
RES[a,16] <- ifelse(is.na(mean(temp2.1$AGBt.20)),NA,(AGBaft[1] - AGBini)) # AGB logged (!negative values!) (t1)
RES[a,17] <- ifelse(is.na(mean(temp2.1$AGBt.20)),NA,(AGBaft[1] - AGBini )*100/AGBini) # AGB logged (t1) expressed as percentage of initial biomass (t0)
RES[a,18] <- ifelse(is.na(mean(temp2.1$AGBt.20)),NA,min(AGBmin) - AGBini) # maximum AGB lost after logging (!negative values!)
RES[a,19] <- ifelse(is.na(mean(temp2.1$AGBt.20)),NA,(min(AGBmin) - AGBini)*100/AGBini) #maximum AGB lost after logging as % of initial biomass (t0)
RES[a,20] <- min(AGBmin) - AGBaft[1] # residual mortality after logging
RES[a,21] <- (min(AGBmin) - AGBaft[1])*100/AGBaft[1] # residual mortality after logging (%)
agb <- temp2[temp2$time.since.logging%in%c(Tmin:Tfinal),]$AGBt.20-min(AGBmin)
time <- temp2[temp2$time.since.logging%in%c(Tmin:Tfinal),]$time.since.logging-Tmin
agb2 <- temp2[temp2$time.since.logging%in%c(Tmin2:Tfinal),]$AGBt.20-min(AGBaft)
time2 <- temp2[temp2$time.since.logging%in%c(Tmin2:Tfinal),]$time.since.logging-Tmin2
RES[a,22] <- coef(lm(agb ~ time-1)) # average biomass recovery rate (Mg.ha-1.yr-1)
RES[a,23] <- (-RES[a,18])/coef(lm(agb ~ time-1)) # time to recover biomass logged
}
}
# Add soil and rainfall variables
soil$site <- as.character(as.factor(soil$site))
RES <- merge(RES,soil[,c("site","bio12", "bio15", "T_SAND", "T_SILT", "T_CLAY","T_BULK_DENSITY", "T_CEC_SOIL", "soil", "texture", "drainage","awc","long","lat")],by="site",all.x=T)
RES$awc <- as.factor(RES$awc)
names(RES) <- c("site", "plot", "trait", "plot.size", "AGB.t0", "AGB.t1", "AGB.min", "AGB.final", "Tmin", "Tfinal", "N.census", "rain", "wsgBA.t0", "treeHarvested.t1", "volumeHarvested.t1", "AGB.logged.t1", "AGB.logged.t1_per", "AGB.lost", "AGB.lost_per", "AGB.lost.residual", "AGB.lost.residual_per", "rec.rate", "rec.time","rain.WC","seasonality", "sand", "silt", "clay","bulk.dens", "CEC", "soil","texture", "drainage", "awc","long","lat")
# Data selection: keep only plot that experienced harvesting, have lower post-logging AGB stocks and positive recovery rates
dat <- subset(RES, treeHarvested.t1 > 0 & AGB.logged.t1 < 0 & rec.rate>0)
# Log transformation of some variables
dat[,c("AGB.logged.t1", "AGB.logged.t1_per", "AGB.lost", "AGB.lost_per")] <- -dat[,c("AGB.logged.t1", "AGB.logged.t1_per", "AGB.lost", "AGB.lost_per")]
dat[,(ncol(dat)+1):(ncol(dat)+8)] <- apply(dat[,c("treeHarvested.t1", "AGB.logged.t1", "AGB.t0","AGB.logged.t1_per", "AGB.lost", "AGB.lost_per","rec.rate","rec.time")],2,function(x) log(x+1))
names(dat)[(ncol(dat)-7):ncol(dat)] <- c("tree.harv_log","AGB.logged_log","AGB.t0_log","AGB.logged.per_log","AGB.lost_log","AGB.lost.per_log","rec.rate_log","rec.time_log")
# Defining plot relative weight
wght<-dat$plot.size*dat$Tfinal # weight of the plots
dat$wght <- wght/sum(wght)
# Load required packages
require(glmulti)
require(lme4)
## Warning: package 'lme4' was built under R version 3.2.5
## Warning: package 'Matrix' was built under R version 3.2.5
require(MuMIn)
## Warning: package 'MuMIn' was built under R version 3.2.5
require(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.5
require(textreg)
## Warning: package 'textreg' was built under R version 3.2.5
## TASK: Genetic algorithm in the candidate set.
## Initialization...
## Algorithm started...
## Improvements in best and average IC have bebingo en below the specified goals.
## Algorithm is declared to have converged.
## Completed.
# Alternative models with deltaBIC <= 5
options(na.action = "na.fail")
d <- dredge(lmer(rec.time_log ~ AGB.lost.per_log + AGB.t0_log + rain + seasonality + clay + bulk.dens + CEC + (1 | site)-1, data=dat,weights=wght,REML=F),extra=alist(AIC),rank="BIC") #
dd <- subset(d, delta <= 5)
print(dd)
## Global model call: lmer(formula = rec.time_log ~ AGB.lost.per_log + AGB.t0_log +
## rain + seasonality + clay + bulk.dens + CEC + (1 | site) -
## 1, data = dat, REML = F, weights = wght)
## ---
## Model selection table
## AGB.lst.per_log AGB.t0_log blk.dns CEC cly ran
## 2 1.1060
## 6 0.9978 0.2537
## 66 1.0440
## 4 1.0330 0.05181
## 18 1.1230 -0.001449
## 34 1.1130 -9.552e-06
## 10 1.1040 0.0003363
## ssn AIC df logLik BIC delta weight
## 2 241.8 3 -117.897 248.9 0.00 0.557
## 6 243.1 4 -117.526 252.5 3.63 0.091
## 66 0.003202 243.1 4 -117.564 252.6 3.71 0.087
## 4 243.5 4 -117.771 253.0 4.12 0.071
## 18 243.6 4 -117.819 253.1 4.21 0.068
## 34 243.8 4 -117.895 253.3 4.37 0.063
## 10 243.8 4 -117.895 253.3 4.37 0.063
## Models ranked by BIC(x)
## Random terms (all models):
## '1 | site'
# Best model summary
BM1<- get.models(dd, 1)[[1]]
summary(BM1)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: rec.time_log ~ AGB.lost.per_log + (1 | site) - 1
## Data: dat
## Weights: wght
##
## AIC BIC logLik deviance df.resid
## 241.8 248.9 -117.9 235.8 76
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.9538 -0.5010 -0.1256 0.0744 5.4303
##
## Random effects:
## Groups Name Variance Std.Dev.
## site (Intercept) 0.000000 0.00000
## Residual 0.004898 0.06999
## Number of obs: 79, groups: site, 10
##
## Fixed effects:
## Estimate Std. Error t value
## AGB.lost.per_log 1.10644 0.02203 50.23
# Marginal (Rm) and conditional (Rc) R-squarred
r.squaredGLMM(BM1)
## R2m R2c
## 0.9949779 0.9949779
## observed versus fitted values by site
tempo <- data.frame(x=dat$rec.time,y=exp(predict(BM1)),z=dat$site,col=COL[dat$ID])
par(mar=c(4,8,1,1),las=1)
qplot(x, y, data=tempo,colour=z,fill=col) + facet_wrap(~z,nrow=2)+ theme_bw() + geom_abline(intercept = 0, slope = 1) + labs(x = "Observed recovery time (year)", y = "Fitted recovery time (year) ") + scale_colour_manual(values=COL) + theme(legend.position="none",strip.text.x = element_text(face="bold", size=12),title=element_text(face="bold",size=16),axis.title.x=element_text(vjust=-0.6),axis.title.y=element_text(vjust=1.2)) + geom_point(size=4,stroke=1)
## TASK: Genetic algorithm in the candidate set.
## Initialization...
## Algorithm started...
## Improvements in best and average IC have bebingo en below the specified goals.
## Algorithm is declared to have converged.
## Completed.
We ran a second analysis including logging techniques (conventional (CL, factor=1) and reduced impact (RIL, factor=2) logging), as a binary variable with an interaction with ACS loss. We found that logging techniques had a significant effect and improved predictions of trec (BIC = 244.26 vs. 248.9, see table below). However, we highly doubt the validity of this result, as conventional (CL) logging was applied at only 2 sites (Paragominas and Tabocal), representing only 7.7% of all plots used in our study. Moreover, both techniques were implemented at Paragominas only, with marked difference in post-logging dynamics [13]. Due to its size (24.5 ha), this site has a strong leverage in our analysis, leading to conclusions that have little ecological meaning and robustness.
While an increasing number of studies reveals the benefit of RIL techniques for preserving vital environmental services [10–12], we believe that our dataset is not robust enough to efficiently test for such an effect. Our sites were implemented over the past 30 years, while the concept of RIL techniques emerged in the 90’s. However, we do not believe that such a simple dichotomy might reflect the differences in logging techniques, intensity and damages found among our sites. For this reason, we have adopted a broad definition of ‘ACS loss’ that account for both tree harvested and injured/killed and form a gradient of intensity sensus largo, at which RIL forms the lower bound. We think that this approach reflects better the diversity of logging types encountered in our dataset.
# 1. Variable selection for recovery time
# Alternative models with deltaBIC <= 5
options(na.action = "na.fail")
d <- dredge(lmer(rec.time_log ~ AGB.lost.per_log + AGB.lost.per_log:as.factor(trait) + AGB.t0_log + rain + seasonality + clay + bulk.dens + CEC + (1 | site)-1, data=dat,weights=wght,REML=F),extra=alist(AIC),rank="BIC") #
dd <- subset(d, delta <= 5)
print(dd)
## Global model call: lmer(formula = rec.time_log ~ AGB.lost.per_log + AGB.lost.per_log:as.factor(trait) +
## AGB.t0_log + rain + seasonality + clay + bulk.dens + CEC +
## (1 | site) - 1, data = dat, REML = F, weights = wght)
## ---
## Model selection table
## AGB.lst.per_log AGB.t0_log blk.dns CEC cly ran
## 134 1.0970 -0.7173
## 146 1.0460 -0.02002
## 130 0.8646
## 194 1.0760
## 132 1.1070 -0.2032
## 162 1.0520 -0.0003562
## 138 0.9761 -0.020580
## 150 1.1300 -0.4563 -0.01244
## 210 1.1180 -0.01412
## 148 1.1160 -0.1058 -0.01351
## 136 1.0630 0.1864 -1.2970
## 202 1.1160 -0.012890
## 166 1.0890 -0.9646 0.0001679
## 142 1.1020 -0.6331 -0.005798
## 198 1.1020 -0.6218
## 154 1.0450 0.007415 -0.02432
## 178 1.0690 -0.01696 -0.0000950
## 226 1.1010 -0.0001521
## 196 1.1110 -0.1124
## ssn AGB.lst.per_log:as.fct(trt) AIC df logLik BIC delta
## 134 + 210.9 6 -99.449 225.1 0.00
## 146 + 211.0 6 -99.512 225.2 0.13
## 130 + 213.6 5 -101.775 225.4 0.28
## 194 -0.014210 + 211.7 6 -99.860 225.9 0.82
## 132 + 211.7 6 -99.865 225.9 0.83
## 162 + 212.5 6 -100.257 226.7 1.62
## 138 + 213.2 6 -100.618 227.5 2.34
## 150 + 211.7 7 -98.841 228.3 3.15
## 210 -0.008059 + 212.1 7 -99.039 228.7 3.55
## 148 + 212.4 7 -99.210 229.0 3.89
## 136 + 212.7 7 -99.342 229.3 4.16
## 202 -0.011970 + 212.7 7 -99.370 229.3 4.21
## 166 + 212.7 7 -99.370 229.3 4.21
## 142 + 212.7 7 -99.374 229.3 4.22
## 198 -0.002358 + 212.9 7 -99.435 229.5 4.34
## 154 + 212.9 7 -99.445 229.5 4.36
## 178 + 212.9 7 -99.450 229.5 4.37
## 226 -0.010450 + 213.4 7 -99.699 230.0 4.87
## 196 -0.007467 + 213.4 7 -99.701 230.0 4.87
## weight
## 134 0.156
## 146 0.147
## 130 0.135
## 194 0.103
## 132 0.103
## 162 0.070
## 138 0.048
## 150 0.032
## 210 0.026
## 148 0.022
## 136 0.020
## 202 0.019
## 166 0.019
## 142 0.019
## 198 0.018
## 154 0.018
## 178 0.018
## 226 0.014
## 196 0.014
## Models ranked by BIC(x)
## Random terms (all models):
## '1 | site'
# Best model summary
BM3<- get.models(dd, 1)[[1]]
summary(BM3)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula:
## rec.time_log ~ AGB.lost.per_log + bulk.dens + (1 | site) + AGB.lost.per_log:as.factor(trait) -
## 1
## Data: dat
## Weights: wght
##
## AIC BIC logLik deviance df.resid
## 210.9 225.1 -99.4 198.9 73
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2879 -0.6226 0.0708 0.3322 5.0987
##
## Random effects:
## Groups Name Variance Std.Dev.
## site (Intercept) 0.350890 0.59236
## Residual 0.002393 0.04892
## Number of obs: 79, groups: site, 10
##
## Fixed effects:
## Estimate Std. Error t value
## AGB.lost.per_log 1.09726 0.12528 8.758
## bulk.dens -0.71729 0.32758 -2.190
## AGB.lost.per_log:as.factor(trait)CL 0.59172 0.07388 8.010
## AGB.lost.per_log:as.factor(trait)RIL 0.27274 0.04433 6.153
##
## Correlation of Fixed Effects:
## AGB.l._ blk.dn AGB.._:.()C
## bulk.dens -0.849
## AGB.._:.()C -0.056 -0.142
## AGB.._:.()R 0.358 -0.479 0.404
## fit warnings:
## fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
# Marginal (Rm) and conditional (Rc) R-squarred
r.squaredGLMM(BM3)
## R2m R2c
## 0.8229308 0.9988005