Monday, 23 April 2018

The problem with Average Length of Stay


Back in the 1970s, before patient-based data sets were available electronically, a limited range of aggregate data was collated from paper forms collected by hospitals. There were no separate provider organisations in those days. Hospitals were organised into 'Units' within District Health Authorities. Data was collected and analysed within Districts as well as submitted nationally to the NHS through a series of 'central returns'. Some detailed analysis was possible through the collection of samples, but producing this was onerous.

With ingenuity, some performance indicators were devised from the routine aggregate data. These included 'turnover interval' and 'average length of stay'. Both could be calculated from routine aggregate figures such as occupied bed days, available bed days and number of discharges. They allowed comparisons to be made between different hospitals, or changes in one hospital to be followed over time.

These measures were typically used at the level of a whole hospital over a whole year. They worked reasonably well at this macro level. They also worked reasonably well for the typical pattern of healthcare delivery in the 1970s and 1980s.

The pattern of hospital stays has changed a lot in the meantime. The number of beds has more than halved, and the number of admissions has risen. There is extensive use of day case admissions, particularly for surgery. The number of unplanned admissions has risen, particularly for older people. In the past a period of convalescence after illness or surgery was seen as a good thing. These days stays in hospital are kept to a minimum, both for health reasons and to free up precious beds.

Turnover interval is rarely talked about today. In contrast, average length of stay occurs extensively in both performance monitoring and planning. It seems to be an instantly attractive concept which is often misunderstood and misused.

This article highlights some important features that you need to understand if you are going to use average length of stay data.



Point 1: Spell Length of Stay is an Integer

Average length of stay is usually presented to at least one decimal place, often more, e.g. 4.6 days or 4.63 days. This looks very precise. It hides the fact that, contrary to what some people conclude when they see it, this is an average calculated entirely from integers. It is not the average of the actual lengths of stay. Every spell length is in effect rounded up or down to a whole number of days, and the average is then taken of these modified figures.

The reasons for this are historical. When it was first calculated, average length of stay was:

          Total occupied bed days
           divided by
          Total deaths and discharges

This continued into the first electronic patient-based data (Hospital Episode Statistics - HES data), in which dates were collected but times were not. So the calculation of discharge date minus admission date also resulted in spell lengths that were integers.

It would be possible nowadays to calculate spell lengths in hours, or even minutes. But the method has not been adjusted. So even though spell lengths now are predominantly short, we are continuing to work with a measure which has an inbuilt loss of precision, particularly for short spells.


Point 2: Midnights

It should be clear from the section above that spell length basically means 'how many midnights was the patient in hospital for'. This can result in some serious distortion.

Imagine two hospital stays.

Stay One: the patient is admitted at 11:55 pm and is discharged ten minutes later at 00:05 am. In this case the spell length would be calculated as 1 day.

Stay Two: the patient is admitted at 00:05 am and is discharged at 11:55 pm the same day. The hospital stay is ten minutes short of 24 hours. In this case the spell length would be calculated as 0 days.

If we were to calculate spell length in minutes we would end up with
Stay One: 10 minutes
Stay Two: 1430 minutes

Stay Two is 143 times as long as Stay One, but the 'official' spell length of stay comes out back to front, with Stay One as one day and Stay Two as zero days.



The illustration above was obviously taken to extremes to make the point. But the distortion is real and significant.
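The two stays can be sketched in a few lines of Python. This is only an illustration: the calendar dates are invented, and the function names are hypothetical rather than taken from any NHS system. It contrasts the 'date minus date' calculation with the elapsed time in minutes.

```python
from datetime import datetime


def spell_length_days(admitted: datetime, discharged: datetime) -> int:
    """Discharge date minus admission date, ignoring the times of day.

    This is the 'number of midnights' calculation described above.
    """
    return (discharged.date() - admitted.date()).days


def spell_length_minutes(admitted: datetime, discharged: datetime) -> int:
    """Elapsed time between admission and discharge, in whole minutes."""
    return int((discharged - admitted).total_seconds() // 60)


# Invented dates; only the times of day come from the example above.
stay_one = (datetime(2018, 4, 22, 23, 55), datetime(2018, 4, 23, 0, 5))
stay_two = (datetime(2018, 4, 23, 0, 5), datetime(2018, 4, 23, 23, 55))

for name, (admitted, discharged) in (("Stay One", stay_one), ("Stay Two", stay_two)):
    print(
        name,
        spell_length_days(admitted, discharged), "day(s),",
        spell_length_minutes(admitted, discharged), "minutes",
    )
```

Stay One crosses a midnight and so counts as one day; Stay Two does not and so counts as zero.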

Point 3: Skew

When dealing with a symmetrical distribution pattern (such as the so-called 'normal' distribution), the average (arithmetic mean) works well in providing the 'typical' value. When dealing with a significantly skewed distribution, the mean is a poor guide to the typical value because it is pulled towards the long tail. Generally the median is better for skewed distributions, though even this becomes increasingly limited as the amount of skew increases.


Excel provides a SKEW() function to allow an easy calculation of the extent of skew.

The rule of thumb used by statisticians is that where skewness is outside the range -1 to 1, the distribution is significantly skewed. In that case, do not use the mean.
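For anyone working outside Excel, the same check can be sketched in Python. The function below uses the adjusted sample skewness formula, which is the one Excel documents for SKEW(). The stay lengths are invented purely for illustration and are not the real data discussed in this article.

```python
from statistics import mean, median, stdev


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3).

    s is the sample standard deviation. Needs at least three values
    and some variation in the data.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("all values are identical")
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)


# Invented stay lengths in days: many short stays and a few very long ones.
los_days = (
    [0] * 30 + [1] * 25 + [2] * 15 + [3] * 10
    + [5] * 8 + [8] * 5 + [14] * 4 + [30] * 2 + [60]
)

skew = sample_skewness(los_days)
print("skewness:", round(skew, 2))
print("mean:", mean(los_days), "median:", median(los_days))
if abs(skew) > 1:
    print("Significantly skewed: the mean is not a good 'typical' value")
```

In this made-up example the mean is more than three times the median, because a handful of long stays drag it upwards.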

The distribution pattern for LOS is heavily skewed. The following illustration uses some real LOS data for Emergency Admissions. The Skew is 5.3. The average of this data is shown by the red line:



Who in their right mind would think that the average, as shown above, is a meaningful or stable way to represent this heavily skewed distribution?

Average Length of Stay continues to be used widely in the NHS; the median is rarely used.



Point 4: Bi-modal and Poly-modal distributions

If a service contains two or more distinct patient groups or treatment regimes which require different times in hospital, then it is quite likely that the frequency distribution will have more than one mode (or 'peak').

For these kinds of distribution, average is especially unsuitable.
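A small sketch shows why. The figures are invented: a short-stay group of one or two days and a long-stay group of about a fortnight. The average falls in the gap between the two peaks and describes nobody.

```python
from collections import Counter
from statistics import mean

# Invented stay lengths in days for two distinct patient groups.
short_stay_group = [1] * 40 + [2] * 30
long_stay_group = [14] * 15 + [15] * 15
los_days = short_stay_group + long_stay_group

average = mean(los_days)
counts = Counter(los_days)

# How many patients actually stayed within two days of the average?
patients_near_average = sum(
    n for days, n in counts.items() if abs(days - average) <= 2
)

print("average LOS:", average)
print("patients within 2 days of the average:", patients_near_average)
```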

The published information on PbR trim points (see below) shows that there are many different expected stay lengths within non-elective admissions.

A Google or journal search will find plenty of examples of bimodal length of stay distributions.

e.g. Baker et al (1996) Degree of burn, location of burn and length of hospital stay as predictors of psychosocial status and physical  functioning. J. Burn Care Rehabil 1996 Jul-Aug: 17(4): 327-33

e.g. https://www.nature.com/articles/pr1985688 which includes the quote:
 'When the distribution of length of stay is bimodal, as it is for the highest risk and smallest infants, then the geometric mean is a poor measure of central tendency.' 


Point 5: Average Length of Stay only looks at finished spells

We need a spell to end before we can count the spell length of stay. Average length of stay for any time period only includes spells which finish within that time period. The bed use in unfinished spells is not reflected in the calculation.


Point 6:  Average Length of Stay measures activity which is outside the period in question

It is probable that some spells ending in the period in question will have started before the period. The shorter the period being measured, the more likely this is, and the greater the share of the counted bed days that falls outside the period, i.e. the more the activity relates to a period other than the one being reported on.
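Points 5 and 6 can be seen together in a small sketch. The spells are invented, and the function is a hypothetical illustration of selecting spells by discharge date, not a description of any particular reporting system.

```python
from datetime import date


def average_los(spells, period_start, period_end):
    """Average length of stay for spells discharged within the period.

    Each spell is (admission_date, discharge_date); discharge_date is
    None while the patient is still in hospital.
    """
    lengths = [
        (discharged - admitted).days
        for admitted, discharged in spells
        if discharged is not None and period_start <= discharged <= period_end
    ]
    return sum(lengths) / len(lengths) if lengths else None


# Invented spells, reported for March 2018.
spells = [
    (date(2018, 1, 10), date(2018, 3, 2)),   # 51 days, almost all before March
    (date(2018, 3, 5), date(2018, 3, 7)),    # 2 days
    (date(2018, 3, 20), date(2018, 3, 21)),  # 1 day
    (date(2018, 2, 1), None),                # still in a bed: not counted at all
]

march_alos = average_los(spells, date(2018, 3, 1), date(2018, 3, 31))
print("March average LOS:", march_alos)
```

The March figure is dominated by bed days used in January and February, while the patient who has occupied a bed since 1 February does not appear in it at all.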

Point 7:  Average Length of Stay becomes increasingly volatile as span is reduced

Average length of stay may be a satisfactory measure for a whole hospital over a whole year. It works less well over smaller spans. The more either the service scope or the time span is reduced, the less effective the measure becomes.

Breaking down the view from whole hospitals to individual wards or individual doctors, or breaking it down by time (from years to individual months or weeks), makes the calculated values increasingly volatile, i.e. they jump wildly about or have periods with no data.
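A sketch with invented figures for one small ward shows both effects: a single long stay makes one month's average jump, and a month with no discharges has no average at all.

```python
from statistics import mean


def average_or_none(lengths):
    """Average length of stay, or None when there were no discharges."""
    return mean(lengths) if lengths else None


# Invented stay lengths (days) for spells discharged from one ward each month.
discharges_by_month = {
    "Jan": [1, 2, 3],
    "Feb": [0, 1, 45],   # one long stay dominates the month
    "Mar": [],           # no discharges, so nothing to average
    "Apr": [2, 2, 4, 0],
}

monthly = {
    month: average_or_none(lengths)
    for month, lengths in discharges_by_month.items()
}
all_spells = [d for lengths in discharges_by_month.values() for d in lengths]
overall = mean(all_spells)

for month, value in monthly.items():
    print(month, value)
print("Whole period:", overall)
```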

Point 8: Stay Length depends on Day of Week


The number of days that a patient stays in hospital is affected by the day of the week they were admitted on.

Point 9: Length of Stay is treated differently in PbR

Payment by Results (PbR) is the NHS national payments framework established in 2006 and modified most years since. PbR uses an 'adjusted' Length of Stay for Acute inpatient spells. In essence the adjustment removes critical care days from the overall length of the spell.

Point 10: PbR Trim points

Trim points are the cut-off number of days' stay for specific HRGs (Healthcare Resource Groups) above which additional money can be claimed. The trim point is calculated as:

          the upper quartile length of stay for that HRG
           plus
          1.5 times the inter-quartile range of length of stay

This slightly strange looking calculation is actually quite conventional, being the upper limit of a Tukey boxplot, i.e. defining anything greater as outliers.
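The calculation can be sketched as follows. Two things here are assumptions rather than anything taken from the PbR guidance: the choice of quartile method (there are several conventions, and they give slightly different answers on small data sets) and the absence of any rounding. The stay lengths are invented.

```python
from statistics import quantiles


def trim_point(los_days):
    """Upper quartile plus 1.5 times the inter-quartile range.

    Assumption: quartiles use the 'inclusive' method of
    statistics.quantiles; the official method may differ.
    """
    q1, _median, q3 = quantiles(los_days, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


# Invented stay lengths in days for a single HRG.
los_days = [1, 2, 3, 4, 5, 6, 7, 8, 9]

limit = trim_point(los_days)
print("trim point:", limit)
print("stays beyond the trim point:", [d for d in los_days + [20] if d > limit])
```

With these figures the lower quartile is 3 and the upper quartile is 7, so the trim point is 7 + 1.5 x 4 = 13 days.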

Two things to note are: 
(a) that this NHS standard calculation does not use average lengths of stay. When it comes to something as important as money, average length of stay is not regarded as a reliable measure.

(b) that in the annually published PbR lists, the number of days set as 'trim point' varies widely between different non-elective HRGs. In the 2017-18 lists, the longest trim point is 399 days (for AA61B) and the shortest is 5 days (shared by a large number of HRGs); the median is 17 days and the average (for what it is worth) is 27 days.

Point 11: Average LOS includes deaths

Finished spells include cases where the patient dies in hospital as well as cases where the patient is discharged. In other words, the figure averages over both ameliorating and deteriorating patient pathways.

Point 12: DTOC

Stay lengths will be inflated by the extent of Delayed Transfer of Care (DTOC). This includes delays that are not officially reportable.

Point 13: Data Quality

Time recording may not be reliable. Sometimes this will be blatantly obvious, such as when discharges of Endoscopy Day Cases are recorded at 2:30 am. Often it will not be immediately obvious from the data alone. The effect of data quality errors will tend to be to extend the apparent length of stays.



Point 14: Planning using averages can be very dangerous

Point 15: Beware of summary statistics in general

Anscombe's Quartet provides a vivid example of how identical summary statistics can arise from widely different patterns of data. Relying on the summary statistics alone could easily lead to misleading conclusions.
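As a quick check, the y-values of the four data sets in the Quartet can be compared directly. This sketch looks only at the mean and the median; the Quartet is best known for also matching on variance, correlation and regression line, which are not computed here.

```python
from statistics import mean, median

# y-values of the four data sets in Anscombe's Quartet (Anscombe, 1973).
quartet_y = {
    "I": [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    "II": [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
    "III": [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
    "IV": [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
}

for name, ys in quartet_y.items():
    print(name, "mean:", round(mean(ys), 2), "median:", round(median(ys), 2))
```

The four means agree to two decimal places, while the medians differ, which is a hint that the underlying patterns are not the same.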

Point 16: Spell length of stay depends on provider organisational structure

Imagine a patient pathway in which a person spends 10 days in an Acute Hospital and then moves on to a Community Hospital for a further 10 days before going home.

If these are two separate Trusts then there are two spells, each with a length of ten days.

If they are part of the same Trust then there is one spell with a length of twenty days.

Imagine further that after 3 days in the Acute Hospital the patient has to be transferred to a regional specialist hospital, returning after 4 days for a final 3 days. Assuming the Regional Hospital is run by a different Trust, we now have two Acute spells of three days each. So the patient with higher acuity may appear in the Acute Trust's figures with an average LOS of 3 days, while the lower acuity patient appears with an average LOS of 10 days.
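The pathways above can be sketched as lists of stays, with a spell taken to be any run of consecutive stays under the same Trust. The Trust names are placeholders and the function is a hypothetical illustration, not an official spell-building rule.

```python
from itertools import groupby


def spells_by_trust(pathway):
    """Merge consecutive stays under the same Trust into one spell.

    pathway is a list of (trust, days) in the order they happened.
    Returns a list of (trust, spell_length_in_days).
    """
    return [
        (trust, sum(days for _, days in stays))
        for trust, stays in groupby(pathway, key=lambda stay: stay[0])
    ]


# Acute then Community hospital, run by two separate Trusts.
separate_trusts = [("Trust A", 10), ("Trust B", 10)]
# The same pathway where one Trust runs both hospitals.
same_trust = [("Trust A", 10), ("Trust A", 10)]
# Higher acuity patient: acute, regional specialist, then back to acute.
with_transfer = [("Trust A", 3), ("Regional Trust", 4), ("Trust A", 3)]

print(spells_by_trust(separate_trusts))
print(spells_by_trust(same_trust))
print(spells_by_trust(with_transfer))
```

The same twenty days in hospital produce spell lengths of ten or twenty days depending purely on who runs the buildings.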