Quantitative NIR spectroscopy for determination of degree of polymerisation of historical paper
Chemometrics and Intelligent Laboratory Systems

This paper discusses the development of a near infrared (NIR) spectroscopic method coupled with multivariate analysis to characterise historical paper. Specifically, partial least squares (PLS) regression was used to predict one of the most important properties of paper as a condition indicator: degree of polymerisation (DP). Supported by a set of model cellulose samples, the NIR-PLS method for DP prediction was validated and the modelling approach that led to the best prediction of DP of paper was established. The coefficient of variation of the NIR-PLS models was found to be approximately 8% and 20% of the DP of model cellulose and historical paper, respectively. The variance of the reference DP, the variance of the predicted DP, and the model bias were identified as the main sources of the total expected generalisation error of prediction. For both model cellulose and historical paper, the variance of the DP predicted by the NIR-PLS models contributed the most to the total error of prediction. This suggests that improving the instrumentation and the operating procedure is essential to improve model performance. Furthermore, the effect of water content of the samples on model performance was investigated. The model for historical paper was found to be robust to relative humidity fluctuations between 30% and 70%, indicating the applicability of the model for collection surveys in a range of environments.


Introduction
The determination of degree of polymerisation (DP) is of great significance in assessing the condition of polymeric materials in cultural heritage [1]. It is one of the most important molecular properties that correlate with the mechanical strength of polymers [2]. However, DP is difficult to measure directly, especially for papers of historical importance. The techniques typically used in chemistry, such as membrane osmometry, size exclusion chromatography, viscometry, and mass spectrometry, can be time consuming, inaccurate for the DP ranges involved, or require specialised instrumentation and skills [3]. Moreover, given that significant value, including aesthetic, scientific, social, and economic, is always associated with heritage objects, the substantial sampling required for destructive analytical methods is rarely an option. In heritage science, a comparably accurate non-destructive method for DP determination is highly desirable.
Abbreviations: CoV, Coefficient of variation; DP, Degree of polymerisation; MSEP, Mean square error of prediction; NRMSE, Normalised root mean square error; PLS, Partial least squares; RMSECV, Root mean square error of cross validation; RMSEP, Root mean square error of prediction; RMSE, Root mean square error.

Quantitative near-infrared (NIR) spectroscopy provides a non-destructive alternative to chemical analysis. Given the complexity of NIR spectra, multivariate analysis is often used to provide a correlation-based quantitative interpretation. In multivariate analysis, the spectral responses to chemical and physical properties of a sample set are modelled based on the measurement of small absorbance changes occurring at multiple wavelengths. Among several linear multivariate methods, partial least squares (PLS) regression has been the most important one for quantitative NIR analyses [4]. PLS constructs factors that capture spectral variability as well as correlating with the reference data, and is usually effective in achieving high accuracy of predictions [5]. In recent years, the NIR-PLS method has made progress in the analysis of complex multicomponent mixtures, where the accuracy is comparable to or even better than that of conventional wet chemistry methods [6][7][8][9], which is especially promising for heritage materials. A number of authors have attempted the application of NIR-PLS to analyse DP and DP-related properties, such as molecular weight (Mw) and viscosity of polymers, for a range of historical and model cellulosic materials. The performances of the models are summarised in Table 1. Normalised root mean square error (NRMSE) for each model was calculated as:

$$\mathrm{NRMSE} = \mathrm{RMSE} \cdot (y_{\max} - y_{\min})^{-1}, \tag{1.1}$$

where RMSE is the root mean square error, and y_max and y_min are the maximum and minimum reference values of the property of interest.
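As a minimal sketch of Equation (1.1), the following computes RMSE and NRMSE for a small set of reference and predicted values. The function names and the numbers are hypothetical and serve only to illustrate the normalisation by the range of the reference values.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def nrmse(reference, predicted):
    """RMSE divided by the range of the reference values, as in Equation (1.1)."""
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference values must span a non-zero range")
    return rmse(reference, predicted) / span

# Hypothetical reference and predicted DP values, for illustration only.
reference_dp = [500.0, 1000.0, 1500.0, 2500.0]
predicted_dp = [600.0, 900.0, 1600.0, 2400.0]
print(round(nrmse(reference_dp, predicted_dp), 3))  # prints 0.05
```

Because the normalisation uses the range of the reference values, NRMSE values are only comparable between studies whose sample sets cover their DP ranges in a similar way.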
With the available data shown in Table 1, a clear line can hardly be drawn between the model performance for historical samples and model samples, although historical samples generally exhibit far more complexity in both chemical and physical characteristics. It is noticeable that the results obtained using model paper in ground, powder and pulp form differ significantly from those obtained using model paper sheets. This indicates that the performance of NIR-PLS models can be affected significantly by physical properties. It reflects the fact that NIR spectra represent a combination of molecular vibrations, optical properties of the instrument, and the instrument-sample interactions. Material complexity can, in theory, complicate the development of successful NIR-PLS models.
For historical paper, the model performances vary widely. This can be partially explained by the wide range of raw materials and manufacturing processes of the investigated materials, covering Western paper, Chinese paper and Islamic paper. The large variability in the data reported specifically for Western paper is worth noting [11,15,16]. On the one hand, exceptional models were developed that outperform all the reported models for both historical and model cellulosic polymers [15]. On the other hand, difficulties in developing acceptable models were reported for similar materials [16]. Despite this divergence, an instrument using NIR-PLS models to predict DP of paper has been developed and implemented in practice with a relatively high RMSE [11]. These contrasting results for historical Western paper may reflect the complexity of developing NIR-PLS models for historical materials. Since it is difficult to gain insights based on the limited published data, further research is needed to shed light on the contradictions.
Lack of a clear cause-effect relationship between DP and NIR measurements also complicates model development. Successful quantitative NIR analyses usually require an underlying cause-effect relationship between the analytes and spectral data. In the literature, most successful models based on NIR-PLS have been developed for compositional analysis [6][7][8][9]. These models are mostly based on the Beer-Lambert Law, where changes in absorbance are proportional to changes in the concentration of a chemical component. In contrast, DP is not a property that can be clearly correlated to the concentration of a vibrating bond type.
This imposes an additional challenge to modelling of DP using NIR-PLS.
The issues of material complexity and the implicit DP-NIR relationship may be partially overcome by large data sets; however, the number of historical samples available for the development of NIR-PLS models is limited due to resource constraints. In this case, to improve model performance, an in-depth understanding of the prediction errors becomes essential. However, discussions of error analysis are rarely found in the literature.
To bridge the gaps in the literature and to lay a solid foundation for future research, this paper addresses the identified challenges and difficulties in the development of NIR-PLS models for historical Western paper. Experiments were designed to investigate the approaches for model development and evaluation. Through a comparison of model paper and historical paper, a plausible underlying relationship between DP and NIR response was explored, sources of prediction errors were analysed, and the robustness of the models to environmental fluctuations was assessed. These analyses not only deepen the understanding of the model development and performance, but also ensure the applicability of the NIR-PLS method to collections in practice.

Sample sets and reference DP
The sample sets used for multivariate modelling of DP using NIR spectroscopy are summarised in Table 2. A set of model papers was prepared for a controlled feasibility study. Samples from the same sheet of Whatman filter paper No. 1 were degraded in a VWR VENTI-Line® oven (Radnor, US) at 90 °C for up to 5 months. All the samples were hung freely during the degradation and no extra humidity was added to the environment inside the oven. Intrinsic viscosity ([η]) of each sample was determined based on BS ISO 5351 [19] to calculate DP using the Mark-Houwink-Sakurada equation [20], where [η] is the intrinsic viscosity value in ml·g⁻¹. Given the homogeneity of the model samples [21], a single determination of DP was carried out. Duplicate determinations of DP were carried out on five randomly selected degraded samples for uncertainty estimation. The uncertainty of DP determination was DP 55, assessed using the pooled standard deviation. The uncertainty was also expressed as the coefficient of variation (CoV), which was 1.41%, estimated as the average of the CoVs of the five pairs of measurements.

Table 1
Summary of the reported data in the literature on the use of NIR spectroscopy to analyse DP, Mw, and viscosity of cellulosic polymers by PLS regression. LV represents latent variables of the models. RMSE represents root mean square error. Normalised RMSE (NRMSE) is represented using RMSE of prediction (RMSEP) where available, otherwise using RMSE of calibration (RMSEC) or RMSE. R² represents the coefficient of determination of the regression used to obtain the RMSEP or RMSEC (if RMSEP is not available).
The same method of DP determination was used for the set of historical paper by a different group of researchers. This sample set contained mostly European rag papers from various sources, spanning the 14th to the 20th century. Each sample was measured twice, in adjacent areas. The uncertainty of the mean of two DP determinations was assessed using the pooled standard deviation of pairs of DP determinations divided by √2, which gave DP 29. The average CoV was found to be 2.33%.
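The uncertainty estimates described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the pooled standard deviation is computed from duplicate pairs (each pair contributing one degree of freedom), the uncertainty of the mean of two determinations is obtained by dividing by √2, and the CoV is averaged over the pairs. The duplicate values and function names are hypothetical.

```python
import math

def pooled_sd_from_pairs(pairs):
    """Pooled standard deviation of duplicate determinations.

    For a pair (a, b) the sample variance is (a - b)**2 / 2 with one degree
    of freedom, so the pooled variance is the mean of the pair variances.
    """
    if not pairs:
        raise ValueError("at least one pair is required")
    pair_variances = [(a - b) ** 2 / 2.0 for a, b in pairs]
    return math.sqrt(sum(pair_variances) / len(pair_variances))

def mean_cov_from_pairs(pairs):
    """Average coefficient of variation (SD / mean) over the pairs."""
    covs = []
    for a, b in pairs:
        sd = abs(a - b) / math.sqrt(2.0)
        covs.append(sd / ((a + b) / 2.0))
    return sum(covs) / len(covs)

# Hypothetical duplicate DP determinations, not data from the study.
duplicates = [(1000.0, 1020.0), (1500.0, 1460.0), (800.0, 810.0)]
s_pooled = pooled_sd_from_pairs(duplicates)
# Uncertainty of the mean of two determinations.
s_mean_of_two = s_pooled / math.sqrt(2.0)
cov_percent = 100.0 * mean_cov_from_pairs(duplicates)
print(round(s_pooled, 1), round(s_mean_of_two, 1), round(cov_percent, 2))
```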

NIR spectroscopy
Spectral responses were collected from the samples in both sample sets using a portable UV-VIS-NIR LabSpec 5000 Spectrometer (Analytical Spectral Devices, USA). The spectrometer is equipped with three separate detectors: a 512-element silicon photo-diode array detector for the spectral region 350-1000 nm (spectral resolution: 3 nm) and two TE-cooled extended range InGaAs photo-diode arrays for spectral regions 1000-1830 nm and 1830-2500 nm (spectral resolution: 10 nm). It was operated in diffuse reflection mode and was calibrated against a Spectralon® Calibrated Multi-Component Wavelength Calibration Standard (WCS-MC-010, Labsphere, US).
The NIR spectroscopy was performed in a well-controlled laboratory environment (23 ± 0.5 °C and 50 ± 5% RH). A Florilon™ Standard (EFWS-99-02c, Avian Technologies LLC, US) was used for instrument calibration before each set of measurements was taken. Using a Florilon™ Standard as the sample background during spectra acquisition, spectral measurements of the samples were taken in circular areas of 1 mm in diameter using a fibre-optic probe in close contact with the samples at an angle of 0°. Full range spectra of 200 scans were obtained from three spots on a single sheet of each sample and the averages were stored and used for further analyses.
Since a Florilon™ was used as the background, it was assumed that little transmission occurred during spectra collection. Although the spectrometer was used in diffuse reflectance mode, the spectra collected were considered to represent both the surface properties and the bulk properties due to the penetration of NIR radiation through the paper sheets. Fig. 1 shows the spectra of the model paper samples (Fig. 1 (a)) and the historical paper samples (Fig. 1 (b)). For model paper, major variability in reflectance was observed in the visible range and relative homogeneity was observed in the NIR range. For historical paper, variability in reflectance was observed across the whole wavelength range, with major differences in spectral shapes observed in the visible range. This is an indication of the large variation in material composition and physical properties of the historical samples. A few historical samples showed reflectance greater than 1, which is likely caused by the fluorescence of the additive materials (fillers, sizing materials, etc.) used in the production process [22].

Multivariate analysis
Given that the NIR region was of main interest, the spectral range from 1000 nm to 2500 nm was used for statistical analysis for both sample sets. The range 2400-2500 nm was further removed due to low signal-to-noise ratios. After the truncations, the spectra were treated with the 1st derivative algorithm developed by Savitzky and Golay (SG) [23], for outlier detection only. Spectra showing distinctive visual features were identified as potential outliers and the corresponding samples were examined for confirmation. No outliers were found in the model paper set, whereas nine were identified in the historical paper set and excluded from further analyses. These nine samples displayed unusual material characteristics and included four thick and severely degraded rag papers, one mouldy thick rag paper, one unusually thin rag paper, one exceptionally thick rag paper, one thermal paper and one contemporary paper. Each sample set was then randomly split into two subsets: 2/3 for cross-validated training and 1/3 for independent testing. For the training sets, the spectra were pre-treated using SG derivation and standard normal variate (SNV) [24] in sequence before partial least squares (PLS) regression was carried out to model the relationship between NIR reflectance and DP. Fig. 2 presents the reflectance spectra of the model paper samples (Fig. 2 (a)) and the historical paper samples (Fig. 2 (b)) over 1000-2400 nm, pre-processed by 2nd order SG derivation with a window width of 49 and 51, respectively, and SNV. The PLS regression models were developed, optimised, and selected based on ten-fold cross-validation on the training sets. The number of PLS factors was determined by choosing the number that gave the first local minimum of the root mean square error of cross-validation (RMSECV). The spectral ranges were estimated based on the ranges reported in the literature for model paper [16] and the overtones of -OH stretching [25] and C=O [26].
The ranges were optimised by trial and error based on the performance in cross-validation.
The same spectral treatments and PLS model coefficients obtained from the training sets were applied to the test sets. The performance of the selected models was evaluated by the root mean square error of prediction (RMSEP) of the test sets, which was taken as the expected generalisation error in this research. NRMSE of prediction (NRMSEP) was calculated based on Equation (1.1). In cases where a transformation was applied to the response variable DP, RMSECV and R² of cross-validation (R²_CV) were calculated and evaluated in the transformed scale, whereas the RMSEP and R² of prediction (R²_P) were in the original scale. All the analyses were carried out in MATLAB® 2017a with Statistics and Machine Learning Toolbox™; code can be shared upon request.
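Two steps of the workflow described in this section can be sketched in a few lines: the SNV transform of a single spectrum, and the rule of picking the number of PLS factors at the first local minimum of RMSECV. This is a standard-library illustration, not the authors' MATLAB code. It assumes the SG derivative has already been applied by a library routine, uses the sample standard deviation (n − 1) for SNV, and the RMSECV curve shown is hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit SD."""
    n = len(spectrum)
    if n < 2:
        raise ValueError("a spectrum needs at least two points")
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    if sd == 0:
        raise ValueError("spectrum has zero variance")
    return [(x - mean) / sd for x in spectrum]

def first_local_minimum(rmsecv):
    """Return the 1-based number of factors at the first local minimum.

    rmsecv[i] is the cross-validation error obtained with i + 1 factors.
    If the curve never increases, the last candidate is returned.
    """
    for i in range(len(rmsecv) - 1):
        if rmsecv[i] < rmsecv[i + 1]:
            return i + 1
    return len(rmsecv)

# Hypothetical RMSECV values for 1 to 5 PLS factors.
rmsecv_curve = [0.20, 0.15, 0.12, 0.13, 0.11]
print(first_local_minimum(rmsecv_curve))  # prints 3
print(snv([1.0, 2.0, 3.0]))               # prints [-1.0, 0.0, 1.0]
```

Choosing the first local minimum rather than the global minimum favours a smaller number of factors, which is one common way to limit overfitting when the training set is small.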

Model development
In principle, the cause-effect relationship ensures the true predictive validity of quantitative NIR spectroscopy as an analytical method. For the quantification of analytes, this relationship is usually based on vibrations of chemical bonds. Given that DP itself is not directly represented by a concentration of chemical bonds, the relationship between DP and chemical bonds in cellulose was explored. Under the assumption that the content of oxidised and transformed groups in cellulose fragments is negligible, the amount of three characteristic bonds in cellulose in relation to DP can be estimated as the following:
where n_g is the concentration of β-1,4 glycosidic bonds between the monomers (mol·g⁻¹), n_r is the concentration of reducing end groups (mol·g⁻¹), n_h is the concentration of -OH of cellulose (mol·g⁻¹), N is the concentration of monomers (mol·g⁻¹), and DP is the average number of monomers in a cellulose chain. In these relationships, the concentrations of bonds are all reciprocally related to DP. This reciprocal relationship still holds when oxidation takes place in the degradation process. Therefore, it was hypothesised that a correlation is more likely to be found between the reciprocal of DP and the measured spectroscopic response. This hypothesis that the reciprocal transformation leads to better model performance was tested using the model paper set. In addition to DP and DP⁻¹, ln(DP) was also used as an option for the response variable because the logarithm is a common transformation to correct nonlinearity. Three different PLS calibration models were built and the results are presented in Table 3. PLS scores, which are linear combinations of the mean-centred reflectance at each wavelength multiplied by the coefficients for each PLS factor, were calculated. In all three cases, the scores of the first PLS factor explained the majority of the variance in the spectral response.
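As an illustrative sketch of how such reciprocal relations can arise (the exact published expressions may differ in form or counting convention), consider a simple counting argument. It assumes an unbranched chain of DP anhydroglucose units with DP − 1 glycosidic bonds, one reducing end group, and three hydroxyls per unit plus one additional hydroxyl at each chain end. With N/DP chains per gram, this gives:

```latex
\begin{align}
n_g &= \frac{N}{DP}\,(DP - 1) = N\left(1 - DP^{-1}\right), \\
n_r &= \frac{N}{DP} = N \cdot DP^{-1}, \\
n_h &= \frac{N}{DP}\,(3\,DP + 2) = N\left(3 + 2\,DP^{-1}\right).
\end{align}
```

Under these assumptions each concentration is a linear function of DP⁻¹, which is consistent with the hypothesis that the spectral response correlates more directly with the reciprocal of DP than with DP itself.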
The relationship between the score of the first PLS factor and different transformations of DP of the training set was plotted across the model paper samples in Fig. 3. The score of the first PLS factor clearly shows a curved dependency on DP (Fig. 3 (a)). However, this curvature was straightened by a reciprocal transformation of DP (Fig. 3 (b)), which suggests that a reciprocally transformed fit better approximates the linear relationship between DP and PLS factors.
Furthermore, for comparison, predictions obtained from all the models were converted to DP to obtain RMSEPs in the original scale, i.e. DP. As shown in Table 3, a logarithmic transformation of DP decreased the RMSEP by 76% and 44% compared to DP and DP⁻¹, respectively. Therefore, ln(DP⁻¹) was considered the most appropriate transformation of DP for model calibration; because ln(DP⁻¹) = −ln(DP) differs from ln(DP) only in sign, it was simplified to ln(DP).
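The conversion of predictions back to the original scale can be sketched as follows. This is a minimal illustration, assuming a model that outputs ln(DP); the function name and the numbers are hypothetical.

```python
import math

def rmsep_original_scale(reference_dp, predicted_ln_dp):
    """RMSEP in DP units for predictions made on the ln(DP) scale."""
    if len(reference_dp) != len(predicted_ln_dp) or not reference_dp:
        raise ValueError("inputs must be non-empty and of equal length")
    # Back-transform each prediction from ln(DP) to DP before computing errors.
    predicted_dp = [math.exp(v) for v in predicted_ln_dp]
    squared = [(r - p) ** 2 for r, p in zip(reference_dp, predicted_dp)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical test-set values: reference DP and model output on the ln scale.
reference = [1000.0, 2000.0]
predicted_ln = [math.log(1100.0), math.log(1900.0)]
error_dp = rmsep_original_scale(reference, predicted_ln)
print(round(error_dp, 6))

# ln(DP^-1) and ln(DP) differ only in sign, so either can serve as the response.
same_up_to_sign = math.isclose(math.log(1.0 / 1500.0), -math.log(1500.0))
```

Reporting RMSEP in DP units, as done here, keeps the errors of models built on different response transformations directly comparable.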
Using a logarithmic transformation of DP, PLS regression analysis was carried out for the data sets of model paper and historical paper. Table 4 summarises the pre-processing algorithms, the spectral range, the number of PLS factors, and the performance of the models that were found to have the best predictive capacity based on cross-validation on the training sets. No outliers were excluded during model calibration and validation. Wide window widths were used for the pre-processing by SG derivatives, suggesting that the spectra might be noisy. RMSECV values are on the logarithmic scale, whereas RMSEP is in DP for comparison with viscometry and similar research. The model paper set provided a controlled study of NIR spectroscopy as an analytical method for DP prediction. Fig. 4 presents the PLS regression results for the model paper samples. Although these samples were chemically degraded, DP was considered the dominant varying property. The changes in these samples are likely to be mainly related to the scission of cellulose chains, with little effect originating from naturally occurring paper components and degradation products, which could potentially complicate the NIR modelling and increase the prediction error. As shown in Fig. 4, the good linearity between the reference values and the modelled values in both training and test sets demonstrates that a linear relationship between the spectral response and DP could be well established by PLS regression. The RMSEP of DP for the model paper set was DP 112 (~8%) across the DP range 468-2462, consistent with the performance of the training set. This can be considered a baseline for the expected generalisation error for DP prediction of cellulosic materials.
Although wavelength assignment of NIR spectra is complicated due to many broad and overlapping bands corresponding to overtones and combinations of fundamental vibrations in the mid- and far-IR region [4], the PLS regression analysis for model paper gave an idea of the most critical spectral range for DP prediction. As shown in Table 4 for model paper, the spectral range used for the PLS model is roughly from 1130 nm to 1620 nm, including bands of the 1st overtone assigned to -OH stretching (1428-1591 nm) [25] and the 3rd overtone assigned to aldehyde and ketone (1436-1478 nm) and conjugated aldehyde (1461-1495 nm) [26]. This provides additional evidence that the success of the NIR-PLS method is likely to be based on the concentration of glycosidic bonds, reducing end groups, and -OH groups in cellulose.
The historical paper set required more complicated calibration models with a wider spectral range and more PLS factors than the model paper set (Table 4). The results of the PLS regression analysis for the historical paper samples are presented in Fig. 5. Good linearity was also obtained between the reference DP and the modelled DP; however, larger errors were obtained compared to model paper. This is likely a reflection of the variability of historical paper in terms of the inhomogeneity of physical and chemical properties, including surface texture, thickness, fibres, sizing, and the degradation products accumulated over the centuries. Consistent performance of the training set and the test set was observed across the DP range of 427-4071.

Error evaluation
Errors in the spectral and the reference data as well as the bias of the statistical model affect the accuracy and precision of the predictions by the NIR-PLS method [5]. To gain insights into the errors of DP prediction by the NIR-PLS method, RMSEP was used to represent the expected generalisation error. Mean square error of prediction (MSEP) was derived based on the bias-variance decomposition [27]:

$$\mathrm{MSEP} = \sigma^2 + \mathrm{Bias}^2[f(x)] + \mathrm{Var}[f(x)],$$

where MSEP is the square of RMSEP, σ² is the sample variance in reference DP, Bias²[f(x)] represents the residual model error between the best fitting PLS approximation and the true model for DP, and Var[f(x)] is the variance in the predicted DP of the test sample set.
Fig. 4. Correlations between the modelled DP by PLS regression and reference DP measured chemically for (a) training and (b) test data sets of model paper samples consisting of Whatman filter paper No. 1. Parameters presented in Table 4 were used for the PLS regression analysis. Note that data of the training set are presented on a logarithmic scale whereas data of the test set are presented in the original scale, i.e. DP.

The pooled standard deviation of the reference DP specified in Methodology was used to represent the σ for model paper and historical paper samples. Given that each spectral measurement represents the average of three spectra, the variance of the predicted DP was estimated as one third of the pooled variance of 20 duplicates randomly selected from the test sample sets. Measurements were taken from two different spots on each sample, which gave a standard deviation of DP 98 for model paper and DP 129 for historical paper. The bias of the PLS regression model from the true model was difficult to quantify directly. Therefore, the percentage contribution of bias² was inferred by subtracting the contributions of the variance of reference DP and predicted DP from 100%. Table 5 summarises the contributions of the variance of reference DP, the variance of the predicted DP, and the squared bias of the model. For both model paper and historical paper, the variance of predicted DP was found to be the largest contributor to the total variance estimated by MSEP. This is consistent with the report by Lu et al. [5] that the performance of models for natural materials was sensitive to random noise in the spectra. Various sources can cause the variance in prediction, including the instrumentation, the chemical and physical inhomogeneity of the samples, and the measuring procedures. Given that the percentage contribution of Var[f(x)] for model paper samples (77%) was found to be higher than for historical paper samples (49%), and model paper samples were inherently more homogeneous in chemical and physical properties than the historical samples, it is likely that the instrumentation and the measuring procedures contribute the most to the variances. Therefore, optimising these two factors can be crucial to the minimisation of the prediction errors.
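The percentage contributions described above can be sketched as a short calculation. This is an illustration under stated assumptions, not the authors' procedure. It uses the values quoted in the text for historical paper (RMSEP of DP 185, reference uncertainty of DP 29, prediction standard deviation of DP 129) and assumes that the quoted prediction standard deviation is the square root of Var[f(x)]. The function name is hypothetical.

```python
def msep_contributions(rmsep, sd_reference, sd_predicted):
    """Percentage shares of MSEP, with squared bias taken as the remainder."""
    msep = rmsep ** 2
    reference_share = 100.0 * sd_reference ** 2 / msep
    prediction_share = 100.0 * sd_predicted ** 2 / msep
    # Bias cannot be measured directly, so it is inferred by difference.
    bias_share = 100.0 - reference_share - prediction_share
    return {
        "reference_variance": reference_share,
        "prediction_variance": prediction_share,
        "squared_bias": bias_share,
    }

# Values quoted in the text for historical paper (assumed interpretation).
shares = msep_contributions(rmsep=185.0, sd_reference=29.0, sd_predicted=129.0)
for name, value in shares.items():
    print(name, round(value, 1))
```

Under these assumptions the prediction variance accounts for about 49% of MSEP, in line with the figure quoted for historical paper. Because the bias term is a remainder, any underestimate of the reference variance appears directly as inflated bias.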
For model paper, ~90% of the total variance came from the samples and ~10% from the squared bias of the PLS regression model. This indicates that the NIR-PLS method is capable of modelling DP of paper with good accuracy and precision. For historical paper, however, the estimated squared bias of the model was evidently high, at ~50%. Since historical paper is much more complex in chemical and physical properties than model paper, it is likely that the NIR-PLS method was largely affected by these complexities. However, it is worth noting that this estimation was based on the estimation of σ². Unexpectedly, σ² of the historical paper samples was evidently smaller than that of the model paper samples. This could be the result of underestimation due to adjacent sampling for DP measurements by viscometry.
It was difficult to quantify the possible underestimation of σ² of historical paper based on available data. However, according to recent research, the CoV of historical rag paper samples was estimated to be 8.6% on average [21]. Given that more than 90% of the reference samples were historical rag paper, it was hypothesised that applying a correction factor of 3.78 to the CoV of the historical paper sample set could lead to more reasonable estimations. Under this hypothesis, σ² was estimated to contribute 33% of the total error and the squared bias of the model contributed 18%. This estimation appears consistent with the estimation for model paper, where the variance of predicted DP contributed the most to the model error and the model bias contributed the least. However, further research is needed to verify the validity of this approach.

Effect of moisture content
Water content is one of the most important factors that cause uncertainty in quantitative NIR spectroscopy [28]. This is because the NIR spectral range covers multiple O-H vibrations of H₂O, including a strong absorption at 1930 nm, and the O-H bond is particularly active and intense due to the large mass difference between the atoms [28]. However, the effect of moisture content of samples on quantitative NIR prediction of DP has not been studied. Since the NIR models are usually developed using samples conditioned in controlled laboratory environments, typically 23 ± 0.5 °C and 45% ± 5% RH [29], it is important to understand the effect of moisture content for applications of the models in practice, where the moisture content of the samples may vary greatly.
Given the complexity of the NIR spectra and the chemical composition of historical paper, the sensitivity of DP prediction to changing moisture content is difficult to derive from direct interpretation of the spectra. Therefore, an experiment was designed to assess the effect of moisture content on DP predicted by NIR-PLS models. Triplicate samples of model paper and historical paper were conditioned at different moisture contents by equilibrating the samples at the same temperature (23 ± 0.5 °C) but varying RH (20%-80%) in a climate chamber. For each condition, triplicate NIR spectra were collected from three spots on each sample. The average of the measurements was used to calculate the predicted DP of each sample using the PLS regression models specified in Table 4. The average mass of each sample before and after each set of NIR measurements was used to represent the mass of the sample during the measurements. Relative moisture content was expressed as the percentage mass change in the sample compared to the mass at 23 °C and 50% RH, at which the NIR-PLS model was developed. Fig. 6 presents the scatter plots of the predicted DP in relation to the relative moisture content for model paper (Fig. 6 (a)) and historical paper (Fig. 6 (b)). Substantial change in the predicted DP of model paper was observed when the moisture content of the samples fluctuated between −3% and +5%. This observation suggests that the NIR-PLS model for DP of model paper is not robust to varying moisture contents, although the spectral range including the most intense -OH vibration of H₂O was excluded from the model. In this case, samples with a range of moisture contents need to be used for the development of the NIR-PLS models to achieve improved robustness.

Fig. 5 (caption fragment). ... Table 4 were used for the PLS regression analysis. Note that data of the training set are presented on a logarithmic scale whereas data of the test set are presented in the original scale, i.e. DP.

Table 5
Summary of the percentage contributions of the three sources to MSEP. "Historical paper'" represents the results where the CoV of historical paper was corrected by a factor of 4 for the possible underestimation of variance in reference DP.
However, a closer look at the dependency of predicted DP of model paper on the relative moisture content reveals that the relationship can be approximated exponentially across the range investigated. Transformed to a logarithmic scale with base 10, this dependency can be approximated with a good linear fit over the range ±3% (Residual Sum of Squares = 0.02, R² = 0.99). Therefore, in the case of model paper, the NIR-PLS model developed at a single moisture content can be extrapolated for predictions at other moisture contents. This extrapolation is valid for a conditioning environment of 30%-70% RH at 23 °C, estimated by the equation developed by PaltaKari and Karlsson [30].
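A log-linear fit of this kind can be sketched with ordinary least squares on log10 of the predicted DP. The data, the slope, and the correction helper below are hypothetical and only illustrate the idea; they are not the fitted values from the study.

```python
import math

def linear_fit(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot

def correct_to_reference(predicted_dp, moisture_change, slope):
    """Map a DP predicted at a moisture offset back to the 0% reference state."""
    return predicted_dp / 10 ** (slope * moisture_change)

# Hypothetical data following an exact exponential trend.
moisture = [-3.0, -1.5, 0.0, 1.5, 3.0]  # relative moisture content in %
predicted = [1000.0 * 10 ** (0.05 * m) for m in moisture]

intercept, slope, r2 = linear_fit(moisture, [math.log10(p) for p in predicted])
corrected = correct_to_reference(predicted[-1], moisture[-1], slope)
print(round(intercept, 3), round(slope, 3), round(r2, 3), round(corrected, 1))
```

With real measurements the fit would not be exact, and the correction should only be used within the moisture range over which the linear fit was established.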
In contrast, when the moisture content of samples fluctuated within ±3% (30%-70% RH at 23 °C), the variation of the predicted DP of historical paper was found to be within the range of model uncertainty (Fig. 6 (b)). This observation suggests that the NIR-PLS model for historical paper is reasonably robust to changes in the moisture content of the samples, so no correction may be necessary when applying the NIR-PLS model to samples with different moisture contents. The reason for this observed stability in prediction can be complex and is likely associated with the complexity of historical paper. Part of the complexity may come from the natural ageing processes, during which the void structures in cellulose are changed and the structural resistance to the disruption caused by water absorption is increased [31]. It is worth noting that there is a tendency towards increasing uncertainty as the change in moisture content increases, which may limit the accuracy of the predictions at extreme moisture contents.

Conclusions
To clarify the inconsistency in the literature and lay a foundation for future research, this paper systematically investigated the development of a NIR spectroscopic method coupled with PLS regression to non-destructively predict the DP of historical paper. The feasibility of the NIR-PLS method was studied using model paper samples composed of almost pure cellulose. Using log-transformed DP as the response variable, satisfactory NIR-PLS models were established for model paper and historical paper. RMSEP was found to be DP 112 (~8%) for model paper and DP 185 (~20%) for historical paper. The larger error for historical paper is likely to be a result of the inhomogeneity of the samples, which is caused by differences in chemical and physical properties arising from the original materials, manufacturing processes, and accumulation of degradation products over time.
RMSEP was taken as the expected generalisation error of the NIR-PLS models for error analysis. The variance of the reference DP, the variance of the predicted DP, and the bias of the NIR-PLS model from the true model for DP prediction were identified as the three sources of error. For both model paper and historical paper, the variance of the predicted DP contributed the most: 77% and 49% for model paper and historical paper, respectively. This suggests that the performance of the current NIR-PLS models is mainly limited by the repeatability of the NIR measurements, which may be improved by enhanced precision and accuracy of the instrument and the operational procedures. The NIR-PLS method was found adequate to model DP of model paper but was likely to be affected by the complexity of the historical paper.
To assess the practicality of the NIR-PLS models, the effect of moisture content of the samples on DP prediction was investigated. With moisture content fluctuating within ±3%, substantial variations in the predicted DP were observed for model paper, whereas the variations were contained within the model uncertainty for historical paper. This suggests that the NIR-PLS models developed using model samples tend to be overly ideal for real conditions and should be avoided in real applications. Since the NIR-PLS model for historical paper was found robust to environmental conditions equivalent to 30%-70% RH at 23 °C, the quantitative NIR spectroscopic method can be applied to historical paper with confidence in a range of environments to obtain plausible predictions of DP.
Fig. 6. The dependency of DP predicted by NIR spectroscopy on the moisture content of (a) model paper and (b) historical paper. The empty circle represents the reference DP values measured by viscometry and the error bars represent the standard deviation of the triplicate DP values predicted using the NIR-PLS model. For the model paper samples, the error bars are invisible compared to the magnitude of the change in DP caused by the change of moisture content.

Author statement
Y. L.: conceptualisation, formal analysis, investigation, methodology, software, validation, writing - original draft preparation. T. F.: conceptualisation, methodology, supervision, writing - review & editing. M. S.: conceptualisation, methodology, resources, supervision, writing - review & editing.

Declaration of competing interest
The authors declare that they have no competing financial interests or personal relationships that have influenced the research presented in this paper.