This was a project for the New Zealand Fire Service carried out jointly with Frances Sutton and Angela Pidd. The objective was to relate fire risk in dwellings (houses and flats) to various social, demographic, and meteorological characteristics of a district. This was to help target a fire prevention campaign as well as simply to improve understanding.
The Fire Service collects data on every callout regarding details of a fire such as where it started, suspected cause, extent of the damage, injuries and fatalities. It does not collect any social or demographic details about the people involved. So one can't find the required data just from the Fire Service database.
Our approach was as follows.
We divided New Zealand into 400 first response areas corresponding to the areas served by each fire station. Then we used data from the Government Statistics Department to get population, average income (the data automatically truncates high incomes, so average income gives a useful measure), percent people on a social welfare benefit, percent unemployment, percent home ownership, average occupancy, percent children, percent old people, percent in various ethnic groups. We also got temperature and rainfall data from the meteorological office.
The main analysis was a step-wise regression using the glm routine in S-plus with the quasi model (essentially a Poisson model, but allowing for extra variance). The Y variable was the number of structural fires in each area and the X variables were selected from the demographic, social and meteorological variables. Population was included as an offset since the number of fires should be approximately proportional to population. Further exploration was carried out with the gam routine, which attempts to fit non-linear relationships.
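To make the idea of a Poisson-type model with a population offset concrete, here is a minimal sketch in Python. It is not the S-plus code used in the project. It assumes a log link, a single X variable, and the usual quasi-Poisson treatment in which the extra variance is estimated as the Pearson chi-squared statistic divided by the residual degrees of freedom. The function name fit_quasi_poisson and the small data set are invented purely for illustration.

```python
import math

def fit_quasi_poisson(x, y, population, iterations=25):
    """Fit log(E[y]) = log(population) + b0 + b1 * x by iteratively
    reweighted least squares. Returns (b0, b1, dispersion)."""
    # The offset enters the linear predictor with its coefficient fixed at 1,
    # so expected fires are proportional to population.
    offset = [math.log(p) for p in population]
    # Start from the overall fire rate with no covariate effect.
    b0 = math.log(sum(y) / sum(population))
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(offset, x)]
        # Working response (offset removed) and working weights for a log link.
        z = [b0 + b1 * xi + (yi - mi) / mi for xi, yi, mi in zip(x, y, mu)]
        sw = sum(mu)
        sx = sum(m * xi for m, xi in zip(mu, x))
        sxx = sum(m * xi * xi for m, xi in zip(mu, x))
        sz = sum(m * zi for m, zi in zip(mu, z))
        sxz = sum(m * xi * zi for m, xi, zi in zip(mu, x, z))
        # Solve the 2x2 weighted least squares normal equations.
        det = sw * sxx - sx * sx
        b0 = (sxx * sz - sx * sxz) / det
        b1 = (sw * sxz - sx * sz) / det
    mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(offset, x)]
    # Dispersion: Pearson chi-squared over residual degrees of freedom.
    pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    dispersion = pearson / (len(y) - 2)
    return b0, b1, dispersion

# Invented illustrative data: proportion on a benefit, fire counts, populations.
benefit = [0.02, 0.05, 0.10, 0.15, 0.20]
fires = [14, 40, 5, 61, 33]
people = [5000, 12000, 800, 30000, 9000]
b0, b1, dispersion = fit_quasi_poisson(benefit, fires, people)
print(b0, b1, dispersion)
```

A dispersion well above 1 would indicate more variation between areas than a pure Poisson model allows, which is the reason for using the quasi model.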
The X variables selected by the analysis included
The signs show the direction of the effect. Owned, benefit and then temperature were the strongest of these effects.
The population effect was in addition to population included as an offset. Districts with very small populations have unexpectedly more fires. We suspect this is due to the isolation of the houses so that fires tend to go unnoticed and so are more likely to become structural. Response time of the Fire Service may also be a factor.
Not unexpectedly there are more fires in colder places.
Again, not unexpectedly, there are more fires in districts with a large percentage of rented accommodation.
Percent people on a benefit is taken as an indicator of a low income district. There are more fires per person in districts with a high percent of people on a benefit. (This should not necessarily be interpreted as meaning that the benefits or the people on the benefits cause fires - rather that low income is associated with higher fire risk.) Percent people on a benefit is highly correlated with percent unemployed. However, percent people on a benefit gave a much better fit.
The relation with average income was strange and seemed in conflict with the relation with percent people on a benefit. However, if one does a gam analysis with just income as an X variable, one gets a U-shaped curve; both very high and very low incomes are associated with higher fire risk. Further analysis shows that the high income fires tend to be equipment-caused fires, so it makes some sense - richer people have more equipment.
Occupancy is hard to interpret. Higher occupancy is associated with younger populations, more children, lower incomes, possibly newer suburbs. It is impossible to dissociate these effects. We are using population as the primary predictor of the number of fires. For some fires (e.g. kitchen fires) the number of dwellings might be a better predictor. This will automatically lead to more people per dwelling being associated with fewer fires per person.
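The arithmetic behind that last point can be shown with a small sketch. It assumes, purely for illustration, that the number of fires is exactly proportional to the number of dwellings; the rate used and the function name fires_per_person are invented.

```python
def fires_per_person(fires_per_dwelling, people_per_dwelling):
    """If fires scale with dwellings, the per-person rate is the
    per-dwelling rate divided by the occupancy."""
    return fires_per_dwelling / people_per_dwelling

rate = 0.002  # hypothetical fires per dwelling per year
low_occupancy = fires_per_person(rate, 2.0)   # 2 people per dwelling
high_occupancy = fires_per_person(rate, 4.0)  # 4 people per dwelling
print(low_occupancy, high_occupancy)
```

Doubling the occupancy halves the fires per person even though the risk per dwelling is unchanged, so a negative occupancy effect can appear simply because population is used as the offset.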
I also repeated the analyses looking at only certain kinds of fires. The reduced amount of data meant it was difficult to get unambiguous results. However, one can come to some general conclusions:
I think we found useful qualitative results. I think most of the things we found were obvious, but I am not sure that they were all obvious to our clients before we started. It was difficult to get satisfactory quantitative results in a form our clients could use. I tried to tackle this by asking questions like this: if we reduced the fire-risk in districts with 20% of the population receiving a benefit to that of districts with 2% receiving a benefit, how many fires would we prevent (the answer is about 30%). But I don't regard these answers as very satisfactory or really credible as yet.
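One way a question of this kind could be put to a fitted log-linear model is sketched below. This is only an assumed reading of the calculation, not the method actually used: every district above a baseline benefit proportion has its expected fires scaled down to the baseline risk, with all other variables held fixed. The coefficient beta, the district figures and the function name fraction_prevented are all hypothetical, and the sketch makes no attempt to reproduce the 30% figure.

```python
import math

def fraction_prevented(fires, benefit, beta, baseline=0.02):
    """Fraction of all fires removed if each district with a benefit
    proportion above `baseline` had the risk of a baseline district.
    Assumes log(rate) is linear in the benefit proportion with slope beta."""
    total = sum(fires)
    remaining = 0.0
    for f, b in zip(fires, benefit):
        # Rate ratio between a baseline district and this district.
        scale = math.exp(beta * (baseline - b)) if b > baseline else 1.0
        remaining += f * scale
    return 1.0 - remaining / total

# Hypothetical districts: fire counts and proportion of people on a benefit.
fires = [120, 80, 60, 40]
benefit = [0.20, 0.12, 0.05, 0.02]
beta = 3.0  # hypothetical coefficient, not taken from the project
print(fraction_prevented(fires, benefit, beta))
```

The result depends heavily on the fitted coefficient and on treating the association as if it were causal, which is part of why answers of this kind are hard to regard as credible.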
I was able to carry this project out on my 486 PC using S-plus for Windows.