Data and evidence challenges facing place-based policing

Purpose – The purpose of this paper is to use an evaluation of a micro-place-based hot-spot policing implementation to highlight the potential issues raised by data quality standards in the recording and measurement of crime data and police officer movements. Design/methodology/approach – The study focusses on an area of London (UK) which used a predictive algorithm to designate micro-place patrol zones for each police shift over a two-month period. Police officer movements are measured using GPS data from officer-worn radios. Descriptive statistics regarding the crime data commonly used to evaluate this type of implementation are presented, and simple analyses are presented to examine the effects of officer patrol duration (dosage) on crime in micro-place hot-spots. Findings – The results suggest that patrols of 10-20 minutes in a given police shift have a significant impact on reducing crime; however, patrols of less than about 10 minutes and more than about 20 minutes are ineffective at deterring crime. Research limitations/implications – Due to the sparseness of officer GPS data, their paths have to be interpolated which could introduce error to the estimated patrol dosages. Similarly, errors and uncertainty in recorded crime data could have substantial impact on the designation of micro-place interventions and evaluations of their effectiveness. Originality/value – This study is one of the first to use officer GPS data to estimate patrol dosage and places particular emphasis on the issue of data quality when evaluating micro-place interventions.


Introduction
It has been known for nearly two centuries (Quetelet, 1842) that the risk of crime is not uniform, but spatially clustered (for a recent review, see Johnson, 2010). Since the police have limited resources, if their objective is to maximise their deterrent effect, rational resource allocation means deploying resources to those areas of greatest risk. The principal aim of such interventions is to change the cost-benefit calculus an offender engages in when deciding whether to offend or not (Cornish and Clarke, 2003). The fact that intervention takes place at the location where crimes are most likely is important. Although more distal or general deterrence effects (such as tougher sentences) may have some impact, "the pleasures of crime are immediate and so carry greater weight than the delayed costs of crime in the would-be offender's calculus" (Paternoster, 2010).
Over the last half century, a growing body of research has shown that police interventions targeted to where high levels of crime have historically been recorded (crime hot-spots) can be effective at reducing crime (Braga et al., 2014). Hot-spot policing strategies have recently begun to focus on increasingly smaller areas (only 50-250 metres across), where the risk of future criminal events is estimated based not only on how many crimes have happened in the past, but also how recently they occurred (e.g. Johnson et al., 2007). Several studies (e.g. Ariel et al., 2016; Williams and Coupe, 2017) have also explored the relationship between the amount of time that police officers spend patrolling such hot-spots (often referred to as the police "dosage") and the volume of crime that occurs within them.
Such evaluations are welcome and necessary if hot-spot patrolling strategies aimed at deterring crime are to be improved. However, potential data error and uncertainty warrant greater consideration for such focussed interventions, as they have the potential to significantly alter the conclusions drawn, and the policy decisions that follow. Specifically, there are three factors of data precision that need to be addressed: spatial attributes of crimes, temporal attributes of crimes, and police patrol tracking. The precision of the crime data will have an impact on which areas will be designated as hot-spots, how well those hot-spots predict actual risk of crime, and evaluations of crime reduction strategies in hot-spots. To robustly evaluate the impact of police presence on crime, precise and accurate patrol data are necessary to understand where and when dosage was actually applied. Hot-spot evaluations do not typically include a detailed assessment of the uncertainty associated with the data on which they are based. This paper begins by reviewing the hot-spot policing literature, before presenting an argument for why data uncertainty is a more pressing issue now than ever before. For the purpose of illustration, data from a study conducted in London (UK) are then presented, and the effect of dosage on crime occurrence examined. Through this analysis, the quality of the underlying data is examined and the assumptions made outlined with substantial emphasis placed on their potential impacts on the results. In recognition of the trend towards more focussed hot-spot policing strategies and the greater sophistication with which they are starting to be evaluated, the effects of these challenges are discussed.

Police patrol and hot-spot policing
The deterrence effect of police patrolling has a strong theoretic basis in the works of several eighteenth-century scholars. Beccaria (1764) posited that certainty of punishment has a greater deterrent effect than severity of punishment. Few crime control mechanisms can be targeted (in space and time) with such agility as police patrols, and such activity sits at the heart of most models of routine policing. Early evaluations of such tactics, however, produced mixed results. The Kansas City Preventive Patrol Experiment (Kelling et al., 1974) was one of the first experiments to examine the effect of police patrol on crime, and suggested that increasing police vehicle-based patrolling had no impact on reported crime. However, Kelling's study has subsequently been criticised on several grounds. First, the effects of police patrols were tested over a large (32-square mile) region, rather than being focussed on high crime areas, potentially diluting the effects of intervention. Second, serious statistical bias was discovered in the original research which may have affected the conclusions drawn (Sherman and Weisburd, 1995). Kelling's study nonetheless had a profound effect on the perceived efficacy of police patrol for the following two decades (Sherman and Weisburd, 1995).
As analytic techniques have evolved to focus on more granular spatial units, the body of evidence has come to consistently suggest that, when focussed on high risk or "hot-spot" areas, police presence has a significant deterrent effect on crime. In their systematic review, Braga et al. (2014) identified 19 primary evaluations of hot-spots policing interventions that met their inclusion criteria. Statistical meta-analysis of the data indicated that, overall, these interventions produced a significant reduction in crime of 18.4 per cent and that crime was not displaced to nearby locations (see also Bowers et al., 2011).
Traditionally, hot-spots have been defined as small areas with consistently high rates of crime. In 2015, Weisburd coined the law of crime concentration to highlight the consistency with which crime has been found to concentrate in such areas (Weisburd, 2015). As well as defining and testing at higher spatial resolution, recent efforts (e.g. Johnson et al., 2007) have also incorporated temporal dynamics into the identification of hot-spots, as locations at which a crime has occurred are at increased risk of being victimised again soon afterwards (Haberman and Ratcliffe, 2012). These repeat victimisations are typically the work of the same offender, who will also often return and select targets in the near vicinity of the previous offence, a process known as near-repeat victimisation (e.g. Johnson et al., 2009). Bowers et al. (2004) used these principles to create a crime mapping method, ProMap, that incorporated both the spatial and temporal attributes of offences.
Evaluation of such "prospective" methods has found they more accurately predict where crime will happen than traditional, "retrospective" techniques (Johnson et al., 2009). A quasi-experimental evaluation of a prospective hot-spot policing initiative based on these principles, undertaken in Manchester (UK), estimated that 338 domestic burglaries were prevented over a 12-month period (Fielding and Jones, 2012). A more sophisticated system, PredPol, which is based on the same principles, has been deployed in Kent (UK) and Los Angeles (USA). An evaluation of the system indicated that it was 1.4-2.2 times more accurate than predictions produced by dedicated crime analysts, and a randomised control trial found that hot-spots patrolling informed by PredPol predictions was associated with significant (7.4 per cent) reductions in crime (Mohler et al., 2015).

Uncertainty in crime data
As the precision with which police patrols are targeted increases, spatial and temporal inaccuracies (and omissions) in crime data will have a greater impact on the veracity of analyses conducted. Hart and Zandbergen (2012) expressed concerns about geocoding: the process of converting address data (such as building, street name, and postcode) into a set of spatial coordinates. Generally, the process involves taking an address to be geocoded (e.g. a crime location) and assigning to it the coordinates of the "best match" among a data set of all known addresses. Depending on the details of the address, this could be a specific building, the centre point of a road, or the centre point of an area (e.g. if only a postcode is known). Often, however, these outputs are treated as point locations in subsequent processing, regardless of how they were derived. Moreover, a further complication is that crime events may not occur at a precise addressable location, but (say) on a street.
Previous studies have examined the spatial (in)accuracy of crime data. Robbery data for an area in Birmingham (UK) was found to be correctly geocoded in only 31 per cent of cases, with the average robbery incorrectly geocoded by 193 metres (Harrell, 2014). Research by Johnson et al. (2006) found that in two UK police force areas, address data were incomplete for 40-70 per cent of vehicle-related offences. Such issues are not confined to crimes that occur on the street. A study examining residential burglaries in an area of New Jersey (USA) found that, when using the geocoding process built into Google Earth, crimes were inaccurately geocoded by an average of 27 feet in urban environments and 77 feet in suburban environments (Mazeika and Summerton, 2017). This was significantly more accurate than using the US Census "TIGER" street geocoding system, which produced an average error of 122 and 322 feet, respectively, highlighting how different geocoding techniques can substantially affect the spatial accuracy of geocoding.
The recording of the precise timing of a crime event can also present a challenge since there may be uncertainty over when it actually occurred. Methods for accounting for such uncertainty, whilst rare, do exist in the literature. Ratcliffe (2000), for example, proposed a method whereby the timing of an (aoristic) crime is expressed as a probability density, spread uniformly across the potential range of times it could have occurred. Ashby and Bowers (2013) considered different estimation techniques to examine how using the earliest, latest, or average times (that the crime could have occurred according to the victim) impact on aggregated crime analyses. Using data from CCTV footage to establish the actual time of (bike) thefts at a train station, they found that using either the aoristic approach described by Ratcliffe, or an equivalent approach, closely estimated the distribution of known times. Boldt and Borg (2016) found similar results for residential burglary in a study in which known times were determined by either burglar alarms recording the exact time they were triggered, reports from victims who were at home at the time of the offence, or third-party witnesses.
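To make the aoristic idea concrete, the following is a minimal sketch of how one crime's unit weight could be spread uniformly over the window in which it may have occurred. The function name `aoristic_weights` and the choice of hourly bins are assumptions made for illustration; this is not code from Ratcliffe (2000) or from the present study.

```python
from datetime import datetime, timedelta


def aoristic_weights(earliest, latest):
    """Spread one crime's unit weight uniformly over [earliest, latest].

    Returns a dict mapping hour-of-day (0-23) to the share of the
    window that falls in that hour. The shares sum to 1.
    """
    if latest < earliest:
        raise ValueError("latest must not precede earliest")
    if latest == earliest:
        # The time is known exactly, so all weight goes to one hour.
        return {earliest.hour: 1.0}

    total = (latest - earliest).total_seconds()
    weights = {}
    cursor = earliest
    while cursor < latest:
        # End of the current clock hour, capped at the end of the window.
        next_hour = cursor.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
        segment_end = min(next_hour, latest)
        share = (segment_end - cursor).total_seconds() / total
        weights[cursor.hour] = weights.get(cursor.hour, 0.0) + share
        cursor = segment_end
    return weights
```

Summing these weights over all crimes gives an hourly profile in which a crime with a wide window contributes a little to many hours, while a precisely timed crime contributes its full weight to one.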

Measuring patrol dosage
Significant gaps also exist regarding the measurement of police patrol activity. In this case, an important distinction should be made between planned and actual activity. A failure to deliver what was planned in full or in part is a frequent problem in crime prevention practice (see Knutsson and Clarke, 2006), and means that activity realised in practice may differ substantially from that intended. This has the potential to undermine studies which adopt an "intention-to-treat" evaluation model, whereby implementation activity is assumed but not measured, and highlights the importance of directly measuring policing dosage.
In the Minneapolis police experiment (Sherman and Weisburd, 1995), trained observers were sent to active hot-spots to measure the length of time officers spent in each, and the criminal and disorderly behaviour observed in and around them. The dosage of police patrol at each hot-spot varied significantly and cars "stayed at one spot for as long as an hour or more, or for only a few minutes" (Sherman and Weisburd, 1995). Koper (1995) examined the association between varying levels of patrol dosage and crime. According to Koper, the amount of time officers spent in hot-spots was critical. Fewer than 10 minutes of presence produced no more of a deterrent effect than officers driving through a hot-spot without stopping. Longer intervals had a more beneficial effect, but Koper (cautiously, as the result was not statistically reliable) suggested that there were diminishing returns after 15 minutes. This dose-effect association has come to be known as the "Koper Curve", and 15 minutes is a staple guideline for patrol time allocation for many police organisations (Perry, 2013).
Since Koper's study, few have looked at how the amount of police patrol dosage impacts crime. One of the difficulties with such research concerns measurement. As patrols are directed to increasingly small micro-areas, it follows that the measurement of patrols needs to be increasingly precise. Before the introduction of GPS-enabled radios, police patrols were monitored in two ways: officers reporting via radio whenever they entered or left a patrol zone (e.g. Telep et al., 2014), or by independent observers (e.g. Sherman and Weisburd, 1995). Both methods require significant resources and the resulting records are vulnerable to inaccuracies. For instance, officers may not always remain in the hot-spots for the duration of the time logged. This can occur either because they are not aware they have strayed beyond its boundary (Sorg et al., 2014) or because officers, "either through boredom or a perception that they were displacing crime to nearby streets would stray for a time if they were aware of areas of interest just beyond the foot patrol area" (Ratcliffe et al., 2011). These issues are exacerbated for dynamic hot-spot patrols (e.g. ProMap) for which the hot-spot locations themselves change frequently.
The recent proliferation of GPS technology allows for passive data collection of officer movements with much greater accuracy than was previously possible. One of the first studies to use GPS data (Ariel et al., 2016) found police patrols to have a significant impact upon crime and disorder. However, several details associated with the research are important to note. First, significant operational challenges were encountered in delivering dosage, with the average patrol lasting "8 minutes, and lasting not more than 10-15 minutes" (Ariel et al., 2016). Furthermore, in the analysis conducted, patrol dosage was measured in the aggregate over the entire study period, meaning that the variation in patrol dosage per day or shift was not considered. Williams and Coupe (2017) also measured police patrol dosage using GPS data. Specifically, the study was concerned with whether more frequent but shorter periods of patrol dosage (9 periods of 5 minutes each) had a greater or lesser impact on crime than less frequent but longer (3 periods of 15 minutes) patrols. Sherman's (1990) "crackdown, back off" theory hypothesises that the deterrent effect generated by police patrols "decays" once there is no police presence. Williams and Coupe hypothesised that more frequent patrols "might arguably allow less time for what Sherman calls 'deterrence decay' to kick in, so that there would be less crime". However, their findings suggest that the longer, less frequent patrols were more effective at preventing crime. Unfortunately, Williams and Coupe did not report on the accuracy of their (GPS) data or the sensitivity of their findings to the kinds of problems explored in this paper, meaning that their findings may be open to errors of inference.
To elaborate, GPS data come with some drawbacks. Chief among these is the fact that signals do not account for every step in an officer's path. In their study, Ariel et al. (2016) were able to use 1-minute refresh rates. However, discussions with three UK police services suggest that operational ping rates are generally every two to five minutes largely due to data collection costs and radio battery life considerations. Williams and Coupe (2017) did not report the time between GPS pings in their study. Given the delays between the recording of foot-patrol locations (even if this is only one minute), to establish the paths taken between GPS pings requires interpolation. If employed as a micro-level measure of dosage, this can introduce errors into patrol evaluations (which will increase with the latency between GPS pings).
The studies discussed so far, and the illustration provided later in this paper, are concerned with foot-based police patrols. Vehicle-based police patrols are less likely to be impacted by the GPS issues discussed for two key reasons. First, automated vehicle locator GPS pings usually occur much more frequently; either every 10-15 seconds or every few hundred metres of travel (e.g. see Weisburd et al., 2015). Second, vehicles are confined to the road network and, as such, their potential paths between pings are much more restricted and thus easier to interpolate accurately when compared to foot-based officers who have no such restriction. For this reason, we focus our attention on foot-based patrol dosage and remain mindful that the issues discussed may occur, albeit in a more limited way, for vehicle-based patrols.
The remainder of this paper takes a hot-spot policing operation implemented in the UK to illustrate the above issues, placing particular emphasis on the accuracy and uncertainty of the crime and officer movement data used in an evaluation of the operation.

The present study
The present research uses data from a trial conducted by the Metropolitan Police Service (MPS) in the London borough of Southwark (UK), where they used a prospective hot-spot technique to direct foot patrols. For each police shift (of which there were three per day) as part of their tasking brief, police officers were given maps on which a series of 250 × 250 metre "prospective boxes" were identified for police patrols.
The aim of the exercise is to illustrate the issues discussed above by working through an example. To do this, we examine the estimated impact on crime of variation in police patrol dosage, placing particular emphasis on the strengths and limitations of the data in terms of data accuracy and usability. Before describing the methods and analyses, the next section outlines the data used in greater detail and provides descriptive statistics regarding their distribution. Data for intended patrol locations, officer movements, and police-recorded crimes were provided by the MPS for the period 1 October 2014 to 30 November 2014.

Intended patrol locations
Intended patrol locations were identified using a proprietary predictive algorithm developed by a third-party partner working with the MPS based on the principles proposed by Bowers et al. (2004). The algorithm produces a risk score (essentially a time-weighted kernel density estimate) for each 250 × 250 metre cell of a grid that covered the study area. For each shift, the cells most at risk were provided to patrol officers with the intention that they would patrol those areas when possible. These prospective boxes were generated for seven crime types (burglary, theft from a person, theft from a motor vehicle, theft of a motor vehicle, criminal damage, robbery, and violence with injury (VWI)) based on the priorities of The (London) Mayor's Office for Policing and Crime (MOPAC). The number of boxes generated for each shift and each crime type varied from 3 to 10, and a total of 5,697 boxes were identified over the study period. However, officers had some discretion as to which crime types to prioritise and, consequently, exactly which boxes they sought to patrol during any given shift is unknown.
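Because the algorithm used in the trial is proprietary, its exact form is not known to us. The sketch below only illustrates the general idea of a time-weighted kernel density score: each past crime contributes more to a cell the nearer and the more recent it is. The Gaussian spatial kernel, the exponential time decay, and the parameter values `bandwidth_m` and `decay_days` are all assumptions chosen for illustration, as are the function names.

```python
import math


def risk_score(cell_centre, crimes, now_days, bandwidth_m=400.0, decay_days=14.0):
    """Illustrative time-weighted kernel density score for one grid cell.

    cell_centre: (x, y) in metres.
    crimes: iterable of (x, y, t_days) for past crimes.
    Kernel shapes and both parameters are assumptions, not the values
    used by the proprietary algorithm in the study.
    """
    cx, cy = cell_centre
    score = 0.0
    for x, y, t in crimes:
        age = now_days - t
        if age < 0:
            continue  # ignore crimes dated after the scoring time
        dist = math.hypot(x - cx, y - cy)
        spatial = math.exp(-0.5 * (dist / bandwidth_m) ** 2)  # Gaussian in space
        temporal = math.exp(-age / decay_days)                # exponential in time
        score += spatial * temporal
    return score


def top_cells(cell_centres, crimes, now_days, k):
    """Return the k cell centres with the highest illustrative score."""
    ranked = sorted(
        cell_centres,
        key=lambda c: risk_score(c, crimes, now_days),
        reverse=True,
    )
    return ranked[:k]
```

Under these assumptions, the cells handed to officers for a shift would correspond to `top_cells(...)` evaluated at the start of that shift, with `k` between 3 and 10 per crime type.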
The locations of the boxes changed frequently. Over the two-month period, 388 unique locations were designated as prospective boxes. The least common boxes were identified only once, while the most common were identified 155 times. On average, a location was a prospective box during 21.65 police shifts.

Officer location data
Officer movement data were collected in the form of GPS "pings" from body-worn radios. Pings were sent from the radios whenever an officer initiated a call or every five minutes. They have a circular error probability of 5 metres; that is, for 50 per cent of pings, the true location is within 5 metres of the reported location (95 per cent within 10 metres). The data include the officer's call-sign, the time (to the nearest second), and their location (specified at a resolution of one metre). A total of 239,115 officer pings were recorded in Southwark during the study period.
To estimate how much time each officer spent at each location, it was necessary to interpolate their location between pings. A variety of approaches could be taken to do this; here, a "join-the-dots" method was used, for which the "assumed path" was taken as the direct line between two sequential pings. Where the assumed path intersected multiple grid cells, the officer was assumed to be walking at constant speed between the two pings, and their entry and exit times for each cell were calculated accordingly. The amount of time they spent in each cell was then estimated.
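The "join-the-dots" calculation can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the code used in the study: pings are taken to be `(t_seconds, x_metres, y_metres)` tuples in a projected coordinate system, grid cells are indexed by integer `(column, row)` pairs, and the function names are hypothetical.

```python
import math
from collections import defaultdict

CELL = 250.0  # grid cell size in metres, as in the study


def crossings(a, b, cell=CELL):
    """Fractions along a -> b (one coordinate) at which a grid line is crossed."""
    if a == b:
        return []
    lo, hi = min(a, b), max(a, b)
    out = []
    k = math.floor(lo / cell) + 1  # first grid line strictly beyond lo
    while k * cell < hi:
        out.append((k * cell - a) / (b - a))
        k += 1
    return out


def time_per_cell(ping0, ping1, cell=CELL):
    """Seconds spent in each grid cell on the straight line between two pings.

    Assumes constant speed along the direct line between the pings.
    """
    (t0, x0, y0), (t1, x1, y1) = ping0, ping1
    # Cut the segment wherever it crosses a vertical or horizontal grid line.
    cuts = sorted({0.0, 1.0, *crossings(x0, x1, cell), *crossings(y0, y1, cell)})
    seconds = defaultdict(float)
    for f_in, f_out in zip(cuts, cuts[1:]):
        # The midpoint of each piece identifies the cell it lies in.
        f_mid = (f_in + f_out) / 2.0
        xm = x0 + f_mid * (x1 - x0)
        ym = y0 + f_mid * (y1 - y0)
        key = (math.floor(xm / cell), math.floor(ym / cell))
        seconds[key] += (f_out - f_in) * (t1 - t0)
    return dict(seconds)
```

For example, an officer who pings at (200, 100) and, five minutes later, at (300, 100) crosses the grid line at x = 250 halfway along the assumed path, so each of the two cells is credited with 150 seconds.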
A number of further processing steps were also implemented to address possible sources of error. Where the time between consecutive pings was greater than 15 minutes, this section of the path was discarded from the analysis due to concerns that the officer's actual path may be substantially different from the assumed path. Similarly, where the speed at which the officer appeared to be moving was greater than 2 metres per second, the assumed path was discarded as it was assumed the officer was not on foot.
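The two exclusion rules described above can be expressed as a simple filter over consecutive pings. The sketch below assumes one officer's pings are given as time-sorted `(t_seconds, x_metres, y_metres)` tuples; the function name is hypothetical, and the handling of non-increasing timestamps is an added assumption not described in the study.

```python
import math

MAX_GAP_S = 15 * 60  # discard segments with more than 15 minutes between pings
MAX_SPEED = 2.0      # discard segments faster than 2 m/s (assumed not on foot)


def usable_segments(pings):
    """Return the (ping0, ping1) pairs that pass both exclusion rules.

    pings: list of (t_seconds, x_m, y_m) for one officer, sorted by time.
    """
    kept = []
    for p0, p1 in zip(pings, pings[1:]):
        dt = p1[0] - p0[0]
        # Assumption: duplicate or out-of-order timestamps are skipped.
        if dt <= 0 or dt > MAX_GAP_S:
            continue
        speed = math.hypot(p1[1] - p0[1], p1[2] - p0[2]) / dt
        if speed > MAX_SPEED:
            continue
        kept.append((p0, p1))
    return kept
```

Both thresholds are analytic choices, so reporting how results change when they are varied would be one way to assess the sensitivity of the dosage estimates.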
If more than one officer was in a prospective box at the same time, dosage was calculated as the union of their times in that cell. For example, if two officers were in the same prospective box at the same time for N minutes, the estimated dosage would be N minutes, not 2N. This approach was used as previous research suggests that the presence of more officers does not necessarily lead to greater deterrence (Kleck and Barnes, 2010).
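Taking the union of overlapping presence intervals is a standard interval-merging step. A minimal sketch, assuming each officer's presence in a box is represented as a `(start, end)` pair in minutes and using a hypothetical function name, is:

```python
def union_minutes(intervals):
    """Total length of the union of (start, end) intervals.

    Overlapping intervals are counted once, so two officers present
    together for N minutes contribute N minutes, not 2N.
    """
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if end <= start:
            continue  # skip empty or malformed intervals
        if current_end is None or start > current_end:
            # Close the previous merged block and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps (or touches) the current block, so extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total
```

With this definition, two officers present over the same ten minutes yield a dosage of ten minutes, whereas two officers present one after the other for five minutes each also yield ten minutes.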

Police-recorded crime
Police-recorded crime data were provided for all seven MOPAC priority crime types, of which a total of 2,459 incidents occurred during the study period. Data provided detailed the crime type, offence location, and the earliest and latest dates and times at which the offence could have occurred.
The temporal uncertainty of three crime types (residential burglary, robbery, and VWI) is shown in Figure 1. These crime types were chosen as they represent known priorities for the police during the study period and demonstrate the differences between person-targeted crimes (robbery and VWI) and property-targeted crimes (burglary). The time at which person-targeted crimes occurred is generally known quite precisely; on the other hand, the exact timing of a substantial proportion of burglaries is uncertain.

Measuring crimes within prospective boxes
The location of each crime is uniquely defined within the data set; however, the precision with which each location is given is not consistent. The most precise geocoding level is defined by address point data, and 72.1 per cent of MOPAC7 crimes within our data set fell within this category. A further 22.5 per cent of crimes were only coded at the postcode level, which is less precise, having a resolution of up to several hundred metres. The remaining 5.4 per cent of crimes were either coded based on other location information (such as road junctions or train stations), to street level, or their geocoding precision was unknown. Again, the level of precision varies by crime type, with 96.0 per cent of residential burglaries being coded to an exact address, but for VWI and robbery, the figures were 75.8 and 51.9 per cent, respectively. While the proportion of person-targeted crimes that are geocoded to the postcode level appears high, the authors have no reason to believe this is abnormal for these crime types; indeed, this highlights the challenge of trying to analyse crime data at high spatial resolutions.
These uncertainties associated with the timing and location of offences must be taken into consideration when interpreting the results of any study that focusses on such fine temporal and spatial resolutions, including this one. To avoid attrition in the data, in the analysis that follows all data were analysed, and the data set therefore contains 2,459 crimes. Of these, 108 were recorded as occurring within "live" prospective boxes.

Illustrative analytic strategy
Using the data described above, we attempted to estimate the amount of patrol dosage that was applied to each "box" and how much was necessary to effectively deter crime. Our unit of analysis was the "shift-box"; that is, each instance of a grid cell being identified as a prospective box in each shift during the study period. Where a location was identified for multiple crime types during the same shift, this was treated as only one shift-box to avoid double counting. The total number of shifts for the study period was 183. However, prospective box locations for five shifts were not archived by the police due to a technical error. As such, the data for these shifts are excluded from all analyses. Of the 5,697 shift-boxes identified for deployment, 3,678 boxes (64.6 per cent) received an estimated dosage of zero. This is interesting, but perhaps unsurprising, given the quantity of boxes identified over such a short time frame. It thus appears that the resourcing required to cover the number of boxes identified at the frequency they were produced was too great. This also illustrates an important shortcoming of the "intention-to-treat" approach to evaluation discussed above, which would wrongly assume that all boxes received intervention. As illustrated in Figure 2, of those boxes that did receive dosage, few received more than one hour of estimated dosage in any given shift. Furthermore, the presence of many low (but non-zero) values in Figure 2 suggests that many boxes are likely to have received only inadvertent dosage from officers during the course of other duties.
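The construction of shift-boxes, and the share that received no estimated dosage, can be sketched as follows. The record layout `(shift_id, cell_id, crime_type)` and the function names are assumptions made for illustration; the study's own data structures are not described at this level of detail.

```python
def shift_boxes(designations):
    """Collapse designations to unique (shift_id, cell_id) pairs.

    designations: iterable of (shift_id, cell_id, crime_type). A cell
    designated for several crime types in one shift counts only once.
    """
    return {(shift_id, cell_id) for shift_id, cell_id, _ in designations}


def share_untreated(boxes, dosage_minutes):
    """Proportion of shift-boxes with an estimated dosage of zero.

    boxes: set of (shift_id, cell_id).
    dosage_minutes: dict mapping (shift_id, cell_id) to estimated minutes;
    shift-boxes absent from the dict are treated as having zero dosage.
    """
    untreated = sum(1 for box in boxes if dosage_minutes.get(box, 0.0) == 0.0)
    return untreated / len(boxes)
```

Applied to the figures reported here, 3,678 untreated shift-boxes out of 5,697 gives 3,678 / 5,697 ≈ 0.646, the 64.6 per cent quoted above.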
Since the recorded crime data are subject to temporal uncertainty (i.e. "aoristic", as discussed in Section 2.2), an analytic choice must be made with respect to how the time of each incident is estimated. The impact of this choice is partly mitigated here since the analyses that follow are conducted at the shift level. In our analyses, we used the earliest recorded time to minimise the possibility that a crime was estimated to have occurred after a patrol when in fact it happened before. This approach guarantees that patrols will not be falsely recorded as having occurred before a crime, but does increase the risk that a crime is erroneously estimated as occurring before a patrol. This trade-off illustrates a further issue with evaluations of this kind and the data on which they are based.
For the purposes of this study, a randomised controlled trial was not possible, and hence a quasi-experimental approach was adopted. The approach taken here was to compare the count of crime in each box with a suitable control. Defining the set of control boxes was challenging, given the fact that the risk of crime is dynamic, varying in both space and time. For this reason, we estimated the effect of intervention by comparing the count of crime in a "live box" with the count of crime in that same location at an earlier time; in this case, one week earlier. The option of selecting the same box at an even earlier point in time (e.g. two or three weeks before) was not possible here due to the limited period for which data were available. Our design has the advantage of controlling for factors that vary spatially and over the course of the week, but it does not control for other factors. Chief among these is that a control box may itself have been a live box that received some patrolling. While removing such occurrences was considered and tested, this would have led to considerable attrition in the data, which was already limited given the number of crime events examined (n = 108). Furthermore, omitting such boxes would, in effect, remove any persistent hot-spots from the analysis and introduce another systematic issue. This highlights a further challenge to evaluations of this kind. Alternative approaches, along with their strengths and weaknesses, are considered in the discussion section. Figure 3 shows the cumulative count of crime observed in live boxes as a function of the dosage delivered in them. This enables us to see (for example) the total number of crimes that occurred in those live boxes for which up to 10 minutes of patrol dosage (in this case, 73 crimes) was delivered, and so on. The curve for the live boxes can then be compared with that for the (matched) control boxes. The difference between the two is also shown.
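The cumulative comparison just described can be sketched in a few lines. This is an illustration under assumptions, not the study's code: each shift-box is represented as `(dosage_minutes, live_crimes, control_crimes)`, where the control count is the count for the same location one week earlier, and the difference is taken as control minus live (the sign convention is our assumption). The function name is hypothetical.

```python
def cumulative_crime_by_dosage(records, thresholds):
    """Cumulative live and control crime counts up to each dosage threshold.

    records: iterable of (dosage_minutes, live_crimes, control_crimes),
             one entry per shift-box.
    thresholds: dosage values (minutes) at which to evaluate the curves.
    Returns a list of (threshold, live_cum, control_cum, control - live).
    """
    records = list(records)  # allow several passes over the data
    rows = []
    for limit in thresholds:
        live = sum(l for d, l, _ in records if d <= limit)
        control = sum(c for d, _, c in records if d <= limit)
        rows.append((limit, live, control, control - live))
    return rows
```

If dosage had no effect, the final column would stay roughly constant as the threshold rises; a step change over some range of dosage would indicate that boxes receiving that amount of patrol saw fewer crimes than their controls.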
The fact that the prospective boxes that received no (estimated) dosage during the live period had fewer crimes than during their corresponding control periods seems counter-intuitive. Given that the live boxes are anticipated to be at an elevated risk of crime during the "live" interval, the natural assumption would be that they would experience more crime in the absence of any intervention. There are, however, reasons why this expectation may be unrealistic.

Results
One explanation is regression to the mean; it is possible that there would be no sustained elevation in risk and the crime rate would naturally subside without any police action. A related explanation is that a selection effect is at play. That is, given that they could not visit every box, officers may have used their local knowledge and avoided patrolling those boxes that (they perceived) were unlikely to actually be at an elevated risk (those for which regression to the mean was likely). It is important to note that this kind of selection effect could equally occur in a randomised control trial and that this would go undetected for an evaluation that used an intention-to-treat design.
If the quantity of patrol dosage had no impact on the ensuing crime rate, the difference between live and control curves would be expected to remain approximately constant. As shown in Figure 3, this difference clearly increases between about 10 and 20 minutes of dosage, suggesting that boxes that received these amounts of dosage experienced substantially less crime than their control counterparts. This suggests that patrol dosage has a non-linear impact on crime and that a threshold minimum dosage is required for there to be a deterrent effect, as has been reported elsewhere (Koper, 1995). Again, this is important and would be missed in an evaluation that employed an intention-to-treat design. Given the issues of data uncertainty discussed, the relatively small number of crimes in prospective boxes (108), limitations associated with the evaluation design, and the aims of the current paper, we resist presenting detailed statistical analyses here. As such, the result is presented with the caveat that, whilst patrol dosages of approximately 10-20 minutes appear to have a substantial impact on crime rates, this has not been tested for statistical significance.

Discussion
Given a growing interest in hot-spots policing, prospective hot-spot techniques (Perry, 2013), and the availability of GPS data to evaluate such interventions, the aim of the current paper was to discuss some of the concerns surrounding the quality of GPS and police-recorded crime data to inform the design of future studies. To make the ideas concrete, these issues were illustrated using data from a real-world example. For the reasons discussed above, our aim was not to present an evaluation of the example used. Nevertheless, it would seem remiss not to at least comment on the findings, however speculative they might be. Overall, the results presented chime with those of other studies (Ariel et al., 2016; Koper, 1995) and suggest that police foot patrols need to exceed a threshold of about 10 minutes to produce their intended effects and that after approximately 20 minutes, they have little further impact.
In terms of implementation practicalities, we found that many of the designated patrol boxes did not receive any dosage at all, something that an "intention-to-treat" study design would fail to uncover. As well as illustrating the need to explicitly measure dosage, this draws attention to the need for all implementations to carefully consider resource limitations.
We now turn to the implications for future research. While GPS data offer a more precise picture of where foot-patrol officers are at a particular time, ping rates are currently relatively infrequent and hence methods of interpolation are necessary to estimate officer paths. Using Euclidean paths will incur some inaccuracy, particularly in cities where potential paths are highly confined by the environment. Improving these estimates warrants greater attention in future work. The literature on wayfinding (e.g. Golledge, 1999) may provide a useful foundation for such research. Evaluators need to be transparent about the quality of the data used and methods of interpolation employed. They should also report the sensitivity of their results to different methods of interpolation, or variation in the parameters used to derive the estimates. As a minimum, they should report the ping rate for GPS data.
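To make the interpolation step and the recommended reporting concrete, the following is a minimal Python sketch; it is not the procedure used in this study. It assumes that pings have already been projected to planar coordinates in metres, that a patrol box is an axis-aligned rectangle, and that an officer travels in a straight (Euclidean) line at constant speed between consecutive pings. The names `Ping`, `Box`, `estimate_dosage`, `step` and `max_gap` are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Ping:
    t: float  # seconds since start of shift
    x: float  # projected easting, metres
    y: float  # projected northing, metres


@dataclass(frozen=True)
class Box:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


def median_ping_interval(pings):
    """Median gap in seconds between consecutive pings (the ping rate to report)."""
    return median(b.t - a.t for a, b in zip(pings, pings[1:]))


def estimate_dosage(pings, box, step=1.0, max_gap=600.0):
    """Estimated seconds inside `box` under straight-line interpolation.

    Each pair of consecutive pings is split into sub-intervals of roughly
    `step` seconds; a sub-interval counts if its midpoint lies in the box.
    Pairs separated by more than `max_gap` seconds are skipped, on the
    assumption that the path is too uncertain to interpolate.
    """
    total = 0.0
    for a, b in zip(pings, pings[1:]):
        gap = b.t - a.t
        if gap <= 0 or gap > max_gap:
            continue
        n = max(1, round(gap / step))
        dt = gap / n
        for i in range(n):
            f = (i + 0.5) / n  # fractional position of the sub-interval midpoint
            x = a.x + f * (b.x - a.x)
            y = a.y + f * (b.y - a.y)
            if box.contains(x, y):
                total += dt
    return total


# Toy example: an officer walking east at 1 m/s, pinged every 100 seconds.
pings = [Ping(0, 0, 0), Ping(100, 100, 0), Ping(200, 200, 0)]
box = Box(xmin=50, ymin=-10, xmax=150, ymax=10)

print("median ping interval (s):", median_ping_interval(pings))
for gap_limit in (50.0, 600.0):
    print(f"max_gap={gap_limit:>5}: dosage = {estimate_dosage(pings, box, max_gap=gap_limit)} s")
```

Re-running the estimate with different values of `max_gap` or `step` gives a crude sensitivity check of the kind recommended above. A network-constrained interpolation along streets would need a different path model from the straight line assumed here.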
The use of GPS data also has drawbacks when compared to traditional methods such as the use of police logs or independent observers. For instance, whilst logs and observations can help measure what the officer is doing whilst at a location, GPS data cannot. To reduce uncertainty regarding officer activity, several approaches might be taken in future studies. For instance, officer dispatch logs might be cross-referenced against the GPS data to provide some information regarding officer activities. Furthermore, and looking to the future, the increasing ubiquity of internet-enabled devices (including police body-worn video and other wearable devices) might provide opportunities to capture activity passively. In defence of the general approach taken here, there is a difference between what officers are doing and how they are perceived by those who might observe them.

PIJPSM 41,3
To conclude, research suggests that hot-spots policing interventions work (see National Academies of Sciences, Engineering, and Medicine, 2017). However, evaluations have rarely examined the relationship between patrol dosage and its impact on crime. Understanding this association is important if the police are to make the best use of the resources they have available. The findings of the current study are only speculative, but they are at least consistent with the handful of studies that have looked at this issue in the past. They also highlight how intention-to-treat designs are likely to be inadequate for assessing the effectiveness of such interventions, as they assume perfect implementation rather than measuring the actual amount of resource allocated. To establish a more reliable evidence base, we will need more evaluations that make use of the kinds of data described here. Care will need to be taken when analysing those data, and we encourage evaluators to be transparent about the limitations of their data, the methods they use to interpolate from GPS data, and the experimental design employed.