Global Ocean Tide Models: Assessment and Use within a Surface Model of Lowest Astronomical Tide

The UK Hydrographic Office (UKHO)-sponsored Vertical Offshore Reference Frames (VORF) project aims to develop tidal level transformation models that are referenced to the GRS80 ellipsoid and thus compatible with GNSS positioning; in particular, heighting. Benefits include increasing the efficiency of hydrographic surveying, providing a stable consistent reference frame and enabling integration with land data in the coastal zone. Seven contemporary global ocean tide models are used to derive Lowest Astronomical Tide (LAT) surfaces which are each assessed by comparison with LAT values from the 7,389-strong UKHO tide gauge database, with the results correlated with distance from land. The proportion of truly offshore and pelagic gauges is relatively limited; however, the transition zone whereby the global ocean tide models commence to deteriorate in accuracy is evident at approximately 30km from the coast. The DTU10 model was selected as the strongest candidate overall. Subsequently, a thin plate spline method is used with the tide gauge dataset to enhance the DTU10 LAT surface in the coastal zone, creating a high resolution global LAT surface with respect to mean sea level. It is seen by cross-validation that the method may be used to predict LAT in near-shore locations with a standard error of 0.23 m.


Introduction
The International Federation of Surveyors (FIG) recommends the development of vertical reference surfaces for hydrography (FIG 2006) that are referenced within the International Terrestrial Reference Frame (ITRF; Altamimi et al. 2011) as heights above the GRS80 ellipsoid. This has several benefits, in particular decreasing the cost and increasing the efficiency of hydrographic surveying, enabling the integration of data over the land/sea interface, and observations to long-term harbor installations (UKHO 2011). This data set may be used in two ways: as truth data to assess global ocean tide models and as source data for the LAT surface computation. Following the assessment of global ocean tide models, one candidate will be used to generate a global surface of LAT that will be combined with the tide gauge data to produce a global, high resolution, surface model by an optimized TPS interpolation function.
The accuracy inherent in ocean tide modeling has been well documented (Andersen et al. 1995;Shum et al. 1997;Ray et al. 2009;King and Padman 2005). The quality of the modeling has progressed significantly over the years, so much that the focus is shifting from the deep ocean to the modeling of shallow water and near-coastal tides; Ray et al. (2011) considered the current state of affairs concerning tidal predictions in shelf and coastal waters. Cheng and Andersen (2010) compared the principal constituents (M2, S2, K1, O1) from contemporary global ocean tide models against four tide gauge datasets (725 gauges in total) with an emphasis on shallow water coastal regions. Necessitated by the specific practical applications foreseen for the VORF-Global project, this study differs from previous work assessing tide models in that the focus is on the assessment of the ability to retrieve tidal extrema. The influence of water depth on ocean tides (and hence ocean tide models) is significant, shallow water tides being generally larger and more complex than deep water (Parker et al. 1999). However, as discussed, the project goal incorporates a stipulation of validity beyond the 12 nm bound; consequently the relationship of particular interest in this study is the correlation of the accuracy of global tide models with distance from land. Hence, the behavior of the global tide models is to be assessed according to a suitable distance metric, essentially how far offshore a point is. The value of providing reasonable bounds on the accuracy of global tide models is of importance, certainly if their use for navigational purposes is to be possible.

Global Ocean Tide Models
This study includes the following seven global ocean tide models, the set of which is a fair representation of the state of current progress in empirical ocean tide models: • Center for Space Research version 4.0 (CSR) global ocean tide model (Eanes and Bettadpur 1995 (Lyard et al. 2006). • Goddard Ocean Tide 4.7 (GOT4.7) global ocean tide model (Ray 1999).
Of the seven tide models listed above, the FES, DTU and EOT models have a spatial resolution of 1/8 • , the TPXO models 1/4 • and the GOT and CSR models have a resolution of 1/2 • . The FES2004 tidal atlas is computed from the tidal hydrodynamic equations and has assimilated data from 671 tide gauges and the Topex/Poseidon (T/P) and ERS satellite altimetry missions. The DTU10 model is based upon the FES2004 model and has assimilated altimetry data for a 17-year period from the T/P, Jason-1 and Jason-2 satellites. Similarly, the EOT11a model is based upon the FES2004 model and was developed by the analysis of multimission altimeter data (specifically T/P, ERS-2, Envisat, Jason-1 and Jason-2). GOT4.7 is the latest update of the GOT99 model; it has included observed data from the T/P and Jason-1 missions and also uses ERS and GFO data in shallow water and polar regions. The CSR version 4 model is less recent and thus incorporates smaller quantities of observed data (from the T/P mission), but it is included partly for completeness and partly as a benchmark to mark the progress of tide models; the CSR version 3 model was selected as one of the two models for the reprocessing of T/P data (Shum et al. 1997) and version 4 is also the model used in the VORF-UK project (Turner et al. 2010). The TPXO-Atlas model is an update of the TPXO7.2 model; it has an identical spatial resolution and defaults to TPXO7.2 in deep water; however, in coastal areas high resolution regional solutions have been interpolated onto the coarser 7.2 grid.
A limitation of global ocean tide models is the availability of accurate bathymetry data sets, for example, Ray et al. (2011) highlights significant discrepancies of up to 100 m between bathymetric datasets in a test area, with an inverse tide solution being much improved by the use of a regional bathymetry dataset as opposed to a global dataset typically used for global tide models; generally it may be stated that more recent global tide models feature more accurate depth data compared with older versions.
Lowest astronomical tide values were computed from the tide models by the simple method of, for a given position, running the model over a 20-year period to generate predictions encompassing the full lunar nodal cycle and then extracting the lowest tide height with respect to MSL. A temporal step length of 15 minutes was used in all cases. For validation purposes a comparative test was performed in which a temporal step length of 5 minutes was used as the benchmark; the relative differences were at the sub-millimeter level when using a 15 minute step length.

Tide Gauges
The UKHO Admiralty Tide Tables (ATT) feature tidal predictions at more than 7,000 locations worldwide (UKHO 2011). As well as predictions, tidal levels such as MSL, LAT, Mean Low Water Springs (MLWS) and Mean of Lower Low Waters (MLLW) with respect to local CD are similarly published. These levels are calculated empirically following the collection and subsequent harmonic analysis of observed tidal data from a recording tide gauge. The length of the observation period will influence the results that are obtained from the analysis of those observations, and thence have an influence on the calculated tidal levels. The longer the period of observation, the greater will be the reliability of the results. Data for the 7,000 locations mentioned above will be based on the analysis of observations over a wide variety of time spans ranging from a few days to 18.6 years.
For a harmonic analysis, the minimum observation span considered is hourly height readings over one lunar month (about 29.5 days). If more data are available, separate analyses over additional lunar months are used and can be meaned by vector averaging if there is consistency in the results (i.e., the phase angle and amplitude) of the major harmonic constituents from each "series" analyzed. In this way, a set of 37 harmonic constituents are identified, including 6 inferred from relationships with the major constituents. Longer-term seasonal effects in such records are not evident but can be inferred from nearby locations where longer records do exist.
If a lunar year (or indeed several years) of tidal observations is available (hourly heights over a period of 378 days), the analysis is again split into a grouping method of approximately 30-day periods as a number of continuous series. Once again, the results of each series are checked for consistency. In this way, a total number of 18.6 years observations can be added to the analysis, resulting in a set of 141 harmonic constituents. Longer-term seasonal effects are also able to be accounted for in this analysis and can inform the likely seasonal effects for nearby locations, which are based on shorter observation periods. Another lesser used technique involves a non-harmonic, "time and height differences method," where the daily high and low waters recorded at a secondary port are compared with those occurring simultaneously at a suitable standard (reference) port. This has been used in the past where no sufficiently long record of observation existed and in locations where a full tidal curve has not been recorded (such as drying areas where only high waters are available). Evidently the data obtained in this way is limited but can still be of use to mariners for planning purposes.
The tidal levels are subject to change following the collection and analysis of any new observations at the location concerned; they are akin to tidal predictions in that respect. Therefore, LAT is simply a predicted level obtained by selecting the lowest low water value that occurs over a period of 18.6 years of tidal predictions.
Historically, CD is the fundamental vertical datum to which charted depths and predicted tidal heights are referred, and as previously indicated it is not yet the case that it equates everywhere to LAT. Methods used to establish CD at a particular location are many and varied, and nations that conduct hydrographic surveying will have their own well-established methods of determining it. In many parts of the world, the methodology used results in a chosen level that may be substantially different from LAT; for example, CD in the United States is the level of MLLW; in Brazil it is approximately MLWS. These variations in datum need to be accounted for when making comparisons.
Given a particular tide gauge i, to allow direct comparison with LAT values derived from ocean tide models there must be consistency in the respective datums. Global ocean tide models are referred to their internal MSL; consequently, the following relationship is used to acquire values of LAT with respect to mean sea level at tide gauge points: where LAT i MSL is the height of LAT with respect to mean sea level, and LAT i CD and MSL i CD are the heights for Chart Datum of LAT and MSL, respectively. MSL i CD for each location in the UKHO data set is derived simply by obtaining an average of all the available readings over the period of observation. Ideally this would be a long period, preferably 18.6 years (or longer, in line with the recommendations of the IHO (2011)); however, as described above there are many locations where such a long record of tidal observations are unavailable and so the value obtained is not as reliable. Indication is given in the Admiralty Tide Tables of the monthly (seasonal) variations in MSL likely to be expected throughout the course of a year at each location. The variations do not necessarily repeat themselves exactly from year to year; hence the values given may be found to differ from observed values by about ±0.1 m, even where they have been based on several years' observations. In consequence, where the maximum positive and negative variation of MSL is less than about ±0.1 m it is considered to be negligible. In practice, MSL data are largely based on a relatively small number of observations, and the seasonal changes for many locations have been interpolated using nearby stations possessing longer records of observation, in a similar manner as described above for the harmonic analysis of periods of less than one lunar year. Variations in MSL over short time periods may be considerably greater than the values provided in the Admiralty Tide Tables, and the level may remain as much as ±0.3 m with respect to the average for as long as about a month (Iliffe et al. 2007b).
There are 7,389 tide gauges in the UKHO database. Through historical records 32 tide gauges were identified as being impounded (i.e., local sea bed morphology such as a sill prevents the full tidal ebb) and thus removed from further analyses. Also, 382 tide gauges were identified as on rivers or inland waterways and similarly removed; this was accomplished by examination of the position of each with respect to a generalized coastline, coupled with the use of satellite imagery in uncertain cases. Further, 3 tide gauges were identified as having anomalous tidal level heights vis-à-vis proximate peers (greater than 50 cm difference in LAT MSL ); subsequent investigations highlighted that the provenance of the data was unclear and therefore they were removed from all analyses. An additional 894 tide gauges had insufficient observed data to allow the computation of both LAT and MSL levels. After editing, the subset of UKHO tide gauges which were valid for use in this study numbered 6,078.

Assessment of Global Ocean Tide Models
For the assessment stage, LAT values are derived from each tide model at all tide gauge locations; the majority of tide gauges are located in the coastal zone, typically at the land/sea boundary. Therefore, it is often the case that a tide model may be undefined as a result of model resolution and the land mask used. Ocean tide models have varied spatial resolution; the software that accompanies them perform interpolation methods (typically bi-linear) to acquire predictions at positions not coincident with the model nodes-in all cases in this study the respective ocean tide model software packages are used for this interpolation. The DTU model is, for all intents, continuous and hence provides tide predictions at all tide gauge locations; of the remaining models the GOT model (which has a relatively low resolution of 0.5 • ) returns predictions at significantly fewer locations. Consequently, to ensure an equitable comparison between ocean tide models, only those tide gauge locations at which all tide models returned results are included.
The tide gauge dataset includes information from a significant variety of tide gauges, be they offshore or coastal. Due to the limitations of satellite altimetry and the inadequacies of applying relatively coarse global modeling to shallow water tide prediction (Ray et al. 2011), it is unrealistic for global ocean tide models to be expected to completely represent near-shore tidal behavior. Therefore to distinguish between the coastal and offshore zones the tide models may be assessed by comparisons with the tide gauge data and then analyzed as a function of offshore location. The error at a particular tide gauge is given by: where the RMS comparison between an individual tide model and the tide gauges is given by: A metric denoting how far offshore a data point lies can be a somewhat arbitrary concept and is evidently affected by numerous factors such as the definition of what exactly constitutes land and the local coastal morphology. This is compounded by the requirement to apply the metric on an automated global basis. The first step to defining an offshore metric is the acquisition of a suitable coastline model; the Generic Mapping Tools (Wessel and Smith 1991) features a processed version of the World Vector Shoreline (Soluri and Woodson 1990), which was converted to an ARCGIS shapefile polygon. Subsequently this coastline was processed in the ARCGIS software to compute a reference buffer polygon defining the zone 22 km (12 nm) offshore. Following data cleaning and the removal of spurious polygons, this construct has been used to calculate a distance offshore value for each tide gauge; a value of 0 km indicates that the gauge is located on this buffer (i.e., nominally 12 nm from land), a positive value indicates that the gauge is located toward the coast and a negative value indicates the location is outside the 22 km buffer polygon (i.e., further offshore). This method and convention may be somewhat counter-intuitive; however, the advantage over simply calculating a distance from land is that the distances produced are relative to a smoothed buffer which will have mitigated the impact of complex coastline features (e.g., with this method a data point located 2 km offshore in a narrow inlet is not directly comparable against a data point located 2 km offshore toward the open ocean). It is stressed that, under this metric, it is valid for a point to be located more than 22 km inshore from this buffer polygon if the local coastal morphology so allows. Figure 2 depicts the RMS agreement between the global tide model predictions and the tide gauge data as a function of distance offshore; the results are presented as the cumulative RMS, whereby as land is approached more tide gauges are included in the assessment. Table 1 summarizes the comparisons between the models and gauges under two scenarios: (a) all those sites located either offshore or in the coastal/near-coastal zone (i.e., all tide gauge data located further than 25 km inshore from the reference buffer polygon were removed from the analysis, the locations of those included are shown in Figure 3) and (b) for those sites located offshore only (i.e., outside the 22 km reference buffer).
Evident from the results depicted in Figure 2 are the significant strides achieved in global tide models since the CSR4.0 model, which overall agrees far less with the tide  gauges than its more contemporary peers. A relatively small increase in number of gauges causes a significant increase in the RMS values in the environs of the reference buffer (i.e., 22 km from land), indicating substantial disagreements between the gauges and modeling. In comparison to the inshore area the number of offshore tide gauges contributing to the statistics is low; however, there is a clear indication in the curve of the progressive degradation of tide modeling as land is approached (commencing in the environs of the 30 km bound). While all models except CSR are of a similar high standard, the DTU10 model was selected as the candidate model for the subsequent LAT surface generation due to  Table 1). its higher resolution, strong assessment statistics in all tests and global coverage (the DTU10 model has a forgiving land mask thus returning predictions in some areas where other models do not, such as the Black Sea). The GOT and TPXO-ATLAS models were considered to be close second. The differences between the tide gauge LAT values and the DTU10 LAT values are depicted in Figure 3; these are the 2,569 locations which relate to scenario A in Table 1, therefore they are the same subset of gauges which are represented in the cumulative RMS in Figure 2 up to the limit of +25 km toward land from the reference buffer. There is a certain geographical bias toward Northwest Europe; however, it is not overt and it does cover an area which features shallow water and highly variable tidal regimes hence making it an ideal test bed. Significant discrepancies do occur between the DTU10 model LAT predictions and the tide gauge LAT predictions; typically these are located in coastal areas where the ocean tide model is not predicting a sufficiently high range. This is not unexpected; altimetry in the coastal zone coupled with the complexities of shallow water tides is a challenging field (see, e.g., Ray et al. 2011) and is the prime motivation behind the following section attempting a global surface of LAT by an optimized combination of the two data sources.

Global Surface of Lowest Astronomical Tide
Given a global set of LAT values sourced from tide gauge observations and global tide models, a thin plate spline (TPS) function (Duchon 1977) may be used to combine the data sources to create an enhanced surface of lowest astronomical tide with respect to mean sea level. The particular implementation as used here is described in detail in Turner et al. (2010); in brief, the ocean tide model DTU10 is used to create a global LAT surface at 1/8 • grid spacing. A default mask is applied rejecting all DTU10 data nodes located within 14 km of land. Further, a manual assessment is undertaken whereby the mask is extended further offshore to remove DTU10 nodes in areas with significant disagreement with in situ tide gauges, with a difference criterion of 0.50 m triggering further investigation (this is consistent with Deng et al. 2002, which considers altimetry data as potentially suspect up to 22 km from the coast). Typically this was necessary in areas which featured high tidal ranges and constricted coastal morphology such as the Bay of Fundy, the Severn Estuary and the Gulf of Khambhat. Subsequently two methods are used to generate the LAT surface: beyond 40 km from land the DTU10 nodes are the sole data source and bilinear interpolation is used, while within the 40 km bound the closest 30 valid tide gauge values and the closest 20 valid DTU10 tide model data points are used as control points in a TPS solution. An example is seen in Figure 4, where the source control points used to predict a value of LAT in the Bay of Fundy are shown; also evident in this plot is the adjusted 14 km buffer zone which rejects DTU10-sourced LAT values further into the Bay. Included within the TPS formulation is the concept of "sea distance," which aims to mitigate the influence of tide gauge data points located beyond any intervening coastal morphology. It is noted that for the TPS method the control point positions are transformed to a Transverse Mercator projection defined locally for each interpolation point; this is necessary to ensure relative positions and distances are correct which would not be the case using spherical coordinates.
A TPS is an interpolation technique whereby a mathematical function is fitted to a number of (irregularly spaced) control points in a manner which simulates the behavior of a metal plate. The exact fit of the TPS to each control point is controlled by a regularization parameter λ i . As all λ i tend to infinity the fit to the control points is loose and the plate is essentially a best fit plane. As all λ i tend toward zero the fit to the control points is exact. For the control points used within the tidal level interpolation, λ i is determined by: where λ 0 is the default regularization parameter; r i straight and r i sea are the distances in kilometers from the interpolation point to the data point in a straight line and by sea, respectively; and β is the scale factor determining the extent of the influence of data points located behind headlands. By definition, the distance by sea must be identical to or greater than the straight distance; the effect of Eq. (4) is such that if a control data point i is, for example, behind a headland then the resulting TPS surface solution will be less constrained to this point. It must be noted that a further effect of Eq. (4) is that the values of λ i are location dependent, thus individual TPS solutions must be generated for each position being studied. The parameters λ 0 and β may be determined by a rigorous tuning process whereby a cross-validation method is implemented in which, under a range of parameterizations, LAT values are predicted at each tide gauge data point in turn using nearby control data (i.e., the nearest 20 DTU10 points and nearest 30 valid tide gauge points). Since the tide gauge data point has been temporarily removed from the input data, this procedure may act as an independent test of the method accuracy. The cross-validation method was implemented using a total of 5,509 tide gauges; this is the subset of the 6,078 valid gauges which are located within the previously described 14km buffer zone. A range of tuning parameterizations for both λ 0 and β were considered; each possible combination was tested and therefore computational limitations dictated the extent to which finer tuning could be undertaken. A plot of the RMS agreements under a representative sample of the tuning parameterizations is given in Figure 5. The lowest RMS agreement between the TPS predictions and the validating data was 0.23 m resulting from the optimal values of λ 0 (5295.0) and β (1.0). This test infers that the tuned TPS method gives a global near-shore uncertainty value of 0.23 m in those areas with no available in-situ tide observations. Low values of the regularization parameter λ 0 evidently result in an overconstrained solution with poorer agreement with the in-situ LAT values; also evident from the figure is the importance of determining β, thus optimally accounting for the influence of intervening land in the modeling, with tighter constraints on occluded data points resulting in similarly poorer fits as those for too loose constraints. Table 2 lists the regional (see Figure 1 for region bounds) breakdown of the cross-validation results under the optimized parameterization; of particular note is the Mediterranean where lower tidal ranges ensure the prediction error is relatively low, and also the SE Pacific region where the prediction error is relatively large. This is caused by a number of severe outliers (difference greater than 0.7 m) on the Southern Coast of Chile and the Pacific coast of Colombia. Corresponding to these tuned values, an independent assessment of the method correlated with the distance offshore metric is compared with the unenhanced DTU model in Figure 6; this directly exhibits the improvement in the prediction of the tidal level LAT in the coastal zone using an enhanced combination of tide gauge and DTU10 values. To enable a direct assessment of the TPS prediction method, only those gauges used in the cross-validation method are included in the generation of the plot. This results in a truncation whereby there is no data further than 30 km offshore. However, if all comparisons at gauge locations were included, the TPS and DTU lines would merge beyond this limit. Note that this plot is truncated for the sake of image clarity at the 50 km toward land bound.
Following the tuning of the TPS weighting function, the full tide gauge database and DTU LAT predictions were used to generate an enhanced global LAT surface with 1 arcminute resolution (Figure 7), where in the coastal zone the TPS method was used and in the offshore zone the DTU10 points were used with bi-linear interpolation. As a measure of precision (the TPS method does not enforce the surface to rigidly fit the control points) the RMS agreement between this enhanced surface and the tide gauge LAT values is 0.096 m (6,078 tide gauges in total, comprising all valid tide gauges located in the offshore and coastal zones). A regional breakdown of the final surface fit is given in Table 2, with the smallest RMS being in the Mediterranean (0.04 m) and the highest being in NE Atlantic (0.12 m).

Conclusions
The United Kingdom Hydrographic Office database of tide gauge records has been used as the benchmark to assess the accuracy of tidal extremes derived from seven global ocean tide models. The plotting of the results filtered by distance of the tide gauges from the coast highlights the progressive increases in global tide model accuracy, in both deep ocean and coastal zones, from the older CSR 4.0 model to the more contemporary models such as TPXO-ATLAS and DTU10. Implied from Figure 2 is an approximate bound of 30 km from land, beyond which the ocean tide models are seen to hold constantly with tide gauge data points. The limitation of this is twofold: offshore gauges are located with a relative geographical bias toward the Northwest European shelf, and the tide gauge LAT values themselves have an unmodeled uncertainty. The selection of the DTU10 model as the tide model of choice concurs with the results of Cheng and Andersen (2010), who state that the DTU10 model fits measurements from tide stations better than its peers.
Within the coastal zone a thin plate spline interpolation procedure predicted tidal levels at tide gauge locations using proximate DTU10 and tide gauge LAT values as control points; the removal of data points in turn provides a means of acquiring an independent assessment of global 1-sigma uncertainty of tidal levels in the coastal zone where there have been no in-situ measurements. This RMS agreement of 0.23 m from 5,509 tide gauges is consistent with the value of 0.20 m acquired in a previous study focused solely on the United Kingdom and Ireland (Turner et al. 2010).
A global high resolution surface of LAT has been determined by an optimized TPS interpolation procedure using a global coastline, the DTU10 tide model and the UKHO tide gauge database. In conjunction with a mean sea surface model this provides a continuous global surface of LAT defined within ITRF, providing the ability to transform geo-located vertical data between offshore reference levels, given appropriate testing and assessment. The provision of this continuous and seamless LAT surface is an important step forward in providing navigable surfaces to the mariner on a global scale. It has the potential to better inform the vertical treatment of hydrographic data in areas where access to or deployment of recording tide gauges is difficult or even impossible. As is necessary for SOLAS (safety of life at sea) application, a stringent test schedule is planned before a wide release of the VORF-Global transformation model would become a prospect.