Chasing the dream of accurate and reliable low cost sensors in air quality
For quite a few years now, there have been a number of projects built around consumer-level (as opposed to research-grade) air quality sensors (Air Quality Egg, Speck, Dustduino, Air Beam and our own PACMAN and ODIN). The goal of most of these projects has been to empower the general public around air quality and bring the problem closer to home, so that people see themselves not as passive receptors but as relevant actors who can have an impact through their individual actions.
The success of these initiatives has been varied: some have developed a relatively large community (AQEgg), others enjoy more institutional support (SPECK, Air Beam), and some have not had great results yet (TZOA). However, all of them eventually face the data quality monster, and things can get very ugly very quickly if it is not taken seriously!
Good data … but good for what?
A question that’s always asked about low-cost air quality sensors is “is the data good?”, but on its own that question is useless. It needs a context, a purpose: “good for what?” Data that is good enough for a screening test may not be good enough to decide whether to approve a road extension.
Let’s look at a couple of applications in a little more detail to illustrate the point.
Personal exposure
Personal exposure is the cumulative dose of pollutants that an individual experiences over a period of time. To be useful in quantifying my exposure, the sensor needs to be:
- Precise: a value of 10 today is equivalent to a value of 10 several days from now.
- Sensitive: Its response is distinguishable from zero (or from noise) on a typical day.
What about accurate? Well … it is not relevant. It actually doesn’t matter whether your daily exposure reads 100 or 10, as long as the instrument tells you whether today’s exposure is larger or smaller than yesterday’s. In this case, and for an individual, the requirements on a sensor are relatively easy to achieve.
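To make the point concrete, here is a minimal sketch (with invented numbers) of a sensor that is precise but not accurate: its readings carry a large systematic error, yet because that error is repeatable, it still answers “was today worse than yesterday?” correctly.

```python
# Hypothetical illustration: a precise but inaccurate sensor.
# The sensor reads systematically high, but because its response is
# repeatable, day-to-day comparisons remain valid.

true_daily_exposure = [12.0, 8.0, 20.0, 15.0]  # true values (arbitrary units)

def biased_sensor(true_value, gain=3.0, offset=5.0):
    """A precise sensor with a systematic error: reading = gain * true + offset."""
    return gain * true_value + offset

readings = [biased_sensor(v) for v in true_daily_exposure]

# The absolute values are wrong (poor accuracy)...
print(readings)  # [41.0, 29.0, 65.0, 50.0] -- not the true exposures

# ...but the ranking of days is preserved, so the sensor still tells me
# which day exposed me the most.
worst_true = true_daily_exposure.index(max(true_daily_exposure))
worst_read = readings.index(max(readings))
print(worst_true == worst_read)  # True: both identify the same worst day
```

Any sensor whose response is a stable, increasing function of the true concentration behaves this way, which is why accuracy can be traded away for this use case.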
Community exposure
If we now have a bunch of people (a community) wanting to know their collective exposure, then we need to add a requirement to those listed before:
- Consistent: If one instrument reports 10, then any instrument will report 10.
Now we have a more stringent criterion. If I have 50 sensors distributed in my community and I want to know where the more exposed people live, or who is the most or least exposed to air pollution, I need all the sensors to be comparable. This is particularly relevant for low-cost sensors, where inter-instrument variability is generally large and limited testing is done on the units in the production line.
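A common way to probe this requirement is a co-location test: run the units side by side sampling the same air, and look at the spread between them. A minimal sketch (hypothetical readings and an arbitrary, assumed 20% tolerance) could look like:

```python
# Hypothetical co-location check: all sensors sample the same air,
# so any spread between them is inter-instrument variability.

from statistics import mean, pstdev

# Readings from five co-located units during the same hour (made-up values)
colocated = {
    "unit_01": 10.2,
    "unit_02": 9.8,
    "unit_03": 10.5,
    "unit_04": 14.9,  # this unit disagrees with the rest
    "unit_05": 10.1,
}

values = list(colocated.values())
avg = mean(values)

# Flag units that deviate from the group mean by more than the
# (assumed) tolerance of 20%.
tolerance = 0.20
outliers = [u for u, v in colocated.items() if abs(v - avg) / avg > tolerance]

print(f"group mean = {avg:.2f}, spread (sd) = {pstdev(values):.2f}")
print("inconsistent units:", outliers)  # ['unit_04']
```

With real sensors the check would run over days rather than one hour, but the idea is the same: units that can’t agree with each other when measuring the same air can’t be compared across a neighbourhood.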
Standard compliance
This is the most demanding use of any sensor, because the values are used to decide whether to restrict people’s activities or whether certain investments are desirable. For this we need to add another requirement to the three presented above:
- Accurate: This is the big one. If the air pollution is 10, then I need a sensor that reports 10, not 11 and not 9.
In this case, it is critical that the measurement I take is an accurate representation of the conditions so that if that measurement triggers a regulatory action, I can be certain that it’s not a “false positive”.
This is where most of the cost of air quality instrumentation comes from. The manufacturer needs to ensure a consistent production line with very little inter-instrument variability and, on top of that, needs to be able to certify that the response of the instruments corresponds to what I need to measure.
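In practice, accuracy is assessed by running the candidate sensor next to a certified reference instrument and quantifying the disagreement. A minimal sketch (with invented paired readings and an invented 10% acceptance criterion, not any regulatory standard) might compute mean bias and mean relative error like this:

```python
# Hypothetical accuracy check against a reference instrument.
# Paired hourly readings: (reference, candidate sensor), made-up values.

pairs = [
    (10.0, 10.8),
    (25.0, 26.1),
    (8.0, 8.9),
    (40.0, 41.5),
]

n = len(pairs)
mean_bias = sum(s - r for r, s in pairs) / n
mean_rel_error = sum(abs(s - r) / r for r, s in pairs) / n

# Assumed acceptance criterion for this sketch: mean relative error <= 10%.
acceptable = mean_rel_error <= 0.10

print(f"mean bias = {mean_bias:.2f}")
print(f"mean relative error = {mean_rel_error:.1%}")
print("meets (assumed) accuracy target:", acceptable)
```

A certification campaign would add many more hours of data, multiple pollutant levels and environmental conditions, which is exactly where the cost comes from.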
So?
In conclusion, all data can be good, but there are no perfect data; the difference between good and bad data lies in its use.