Through this blog series we have covered a step-by-step approach to identifying the correct connectivity networks and technology for your farm (SEE: https://pairtree.co/2019/06/03/where-do-i-start-with-farm-iot-internet-of-things/ ) and taken a first look at what is needed within the Australian Agtech ecosystem to allow clear and transparent data sharing (SEE: https://pairtree.co/2019/12/10/farm-data-gaps-black-holes-or-gold-mines/ ).
As the old Castrol GTX advertisement said in 1988, “Oils ain’t Oils”, and so nowadays “Farm data ain’t Farm data” (see below for YouTube links).
At Pairtree our business is connecting farm data from across the farm, the supply chain and the nation. Every day we see new and established digital agriculture businesses and their data. The underlying role of Pairtree Intelligence is to normalise and standardise the data that we connect to, allowing it to be aggregated so that it can provide clear and reliable insights to farmers from disparate data sets.
The quick, ‘knee jerk’ response from industry leaders is that ‘There should be a data standard!’ (which I agree with, and hope one day we can achieve), BUT time, money and commercial capacity will slow that process. Without an alternate approach, this limits any immediate capacity to utilise on-farm and supply chain data.
Businesses at both ends of the commercial digital agriculture spectrum, ‘Established businesses’ and ‘Startups’, will struggle to move from their existing database schema and ‘inputs’ naming convention to a new standardised convention. An established business may have been recording data for many years, and that data is suddenly not in an accepted format. There is little incentive to make these changes when there is greater demand for new services or sales. For a startup that doesn’t yet know the industry and is trying to provide new services, the methodology of a Minimum Viable Product (MVP) really challenges any potential adoption of a ‘Standard’. An MVP is essentially the crudest (minimal) in-field testing option for identifying market gaps, and this process pretty much dismisses any consideration other than to ‘solve the problem’ and deliver new or improved services. Both scenarios are real, and both are impeding Australian agriculture’s potential to gain further insights for productivity, efficiency and stewardship through efficient data transfer.
A basic example of farm data standardisation failure is the range of Internet of Things (IoT) sensors we have worked with. We have seen Temp, Temp., Temp Deg, Temp deg, Temperature, Temparature (misspelt) and many other derivatives of temperature. Firstly, is it soil temperature or air temperature? (The device manufacturers assume that you know the device, and aren’t considering future data aggregation.) Secondly, whilst the human mind can see and translate all of those names into temperature, a machine can’t interpret a name it hasn’t seen before. Each of these ‘inputs’ therefore has to be documented and aligned, and any new name outside the currently known range must be aligned as well.
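To make the alignment step concrete, here is a minimal Python sketch of how the temperature labels above could be mapped to one name. It is only an illustration under stated assumptions, not Pairtree's actual mapping: the alias table and the function names normalise_label and map_label are made up for this example. It also does not answer the soil-versus-air question, which needs extra information about the device.

```python
import re

# Hypothetical alias table (an assumption for this sketch):
# cleaned-up raw label -> the one name we want to store.
ALIASES = {
    "temp": "temperature",
    "temp deg": "temperature",
    "temperature": "temperature",
    "temparature": "temperature",  # known misspelling seen in the field
}


def normalise_label(raw):
    """Lower-case the label, turn punctuation into spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", raw.lower())
    return " ".join(cleaned.split())


def map_label(raw):
    """Return the standard name, or None when the label is unknown.

    A None result means a person still has to look at the label and
    add it to the alias table; the machine cannot guess it.
    """
    return ALIASES.get(normalise_label(raw))


for label in ["Temp", "Temp.", "Temp Deg", "Temparature", "Humidity"]:
    print(label, "->", map_label(label))
```

Anything that comes back as None is exactly the "new option outside the current known range" described above, and still needs a human to document it.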
Now that we understand some of the complexity in attempting to standardise farm data, we can look at the complexity of aggregating it. There are a number of ways that good data can be corrupted, badly formatted or lost, and unfortunately this is very common.
We have spoken to many clients about their data storage, which is often either multiple USB sticks or a Dropbox-type system. This is good in that the data is saved, but it is far from accessible and usable. Bulk storage (without a database) is akin to your pantry: initially you take great care packing the shelves, but when others put items in, or there are bulk inputs, the original filing system often starts to be compromised. The held data is then only as good as its accessibility and consistency, and if components of that data are held in different folders, it is essentially lost.
Over time, with changes to technology and innovation, there are also changes to file and data formats, and older data needs to be cleaned or provisioned so that it aligns with and enriches newer data. It is becoming ever more critical that data is held somewhere that can service the current and future needs of digital agriculture, so that there is a critical level of data available to inform Artificial Intelligence and Machine Learning. Both need wide-ranging and massive amounts of cleaned and standardised data before they can start to identify trends and potential opportunities or threats.
A very simple and frustrating issue that many farmers routinely face is when supplier X’s weather station has been collecting weather data for 5 years and then dies. The farmer then has to quickly purchase a new device from supplier Y, and data continuity is lost. This is very frustrating, because the new device (Y) only reports on its own data range and not the entire data set for that site. ie: Currently there isn’t a way to align the data for the same site between the two apps, so accessing that data will always be a manual two-step process.
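As a rough sketch of what aligning the two records could look like once both have been exported, the Python below puts supplier X's and supplier Y's readings into one shared layout and sorts them into a single history for the site. Everything specific here is an assumption for illustration: the field names, the sample readings, and the premise that both devices report air temperature in degrees Celsius with ISO-style dates. Real exports will differ.

```python
from datetime import date

# Hypothetical exports from the two suppliers (made-up field names and values).
station_x = [
    {"Date": "2019-12-30", "Temp": 31.5},
    {"Date": "2019-12-31", "Temp": 29.0},
]
station_y = [
    {"timestamp": "2020-01-02", "air_temperature_c": 27.2},
    {"timestamp": "2020-01-03", "air_temperature_c": 30.1},
]


def to_common(records, date_field, temp_field, source):
    """Rename each supplier's fields to one shared layout and tag the source."""
    return [
        {
            "date": date.fromisoformat(r[date_field]),
            "air_temperature_c": float(r[temp_field]),
            "source": source,
        }
        for r in records
    ]


def merge_site_history(*series):
    """Combine any number of converted series into one list sorted by date."""
    merged = [row for s in series for row in s]
    return sorted(merged, key=lambda row: row["date"])


history = merge_site_history(
    to_common(station_x, "Date", "Temp", "supplier_x"),
    to_common(station_y, "timestamp", "air_temperature_c", "supplier_y"),
)
for row in history:
    print(row["date"], row["air_temperature_c"], row["source"])
```

Keeping the source on every row means the gap between the old and new device stays visible, rather than being hidden in the combined record.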
There are many other ways that your farm data can be corrupted and lost, so what options are there at the moment to start to improve these issues?
The first option, which is the cheapest and simplest, is to standardise your own data. How can I do that, I hear you say? Whilst it sounds hard, it is relatively simple. From now on, work out some protocols for how you collect and/or input ALL OF YOUR DATA into apps or for IoT providers. This is simply to minimise your data ending up like the Temperature example above. ie: Always use a capital (or, probably easier, never use one) at the start of a paddock or other variable name; always use (or, probably better, never use) abbreviations such as Pd; always keep a particular pattern for random data sets or variables. For dates, ddmmyyyy is probably good but won’t sort chronologically, so yyyymmdd is worth considering, and so on. EG: House_Paddock or house_pd, 2017_Black_Steers or 2017_black_strs, House_Paddock_2017_Yield_Map
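For anyone who keeps their records in spreadsheets or files, a protocol like this can be written down as a few lines of Python. This is only a sketch of one possible convention (all lower case, underscores, no abbreviations, yyyymmdd dates); the function names standard_name and date_stamp are made up for the example, and you may well choose different rules. What matters is that the same rule is applied every time.

```python
import re
from datetime import date


def standard_name(*parts):
    """Join the parts into one lower-case name separated by underscores."""
    joined = "_".join(str(p) for p in parts)
    cleaned = re.sub(r"[^a-z0-9]+", "_", joined.lower())
    return cleaned.strip("_")


def date_stamp(d):
    """Format a date as yyyymmdd so that text sorting matches date order."""
    return d.strftime("%Y%m%d")


print(standard_name("House Paddock"))                     # house_paddock
print(standard_name(2017, "Black Steers"))                # 2017_black_steers
print(standard_name("House Paddock", 2017, "Yield Map"))  # house_paddock_2017_yield_map

# Why yyyymmdd: sorted as plain text, ddmmyyyy puts 2020 before 2019.
days = [date(2019, 12, 31), date(2020, 1, 15)]
print(sorted(d.strftime("%d%m%Y") for d in days))  # ['15012020', '31122019']
print(sorted(date_stamp(d) for d in days))         # ['20191231', '20200115']
```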
This will then make data mapping easier for tech companies like Pairtree, so that your data aligns more easily and effectively. Data centralisation platforms like Pairtree are really the only other option at present to start aggregating and standardising your data from across the farm.
With any data that is currently being collected, there may be a need to change providers over time, and whether that data can be extracted and aligned with other data is an unknown. Hence, for flexibility and security of your data aggregation, and the ability to effectively utilise critical data sets into the future, it is really important to start considering options now.
Currently we feel that the only scalable option to centralise data is for Pairtree to utilise international frameworks (particularly the OGC (supported by CSIRO) https://www.ogc.org/ ) to underpin the mapping of all data sets that we connect to. This is very time consuming, but over the last two and a half years we have built up a library of 60 Agtech and digital agriculture services. This has exposed us to a wide range of the complex and dynamic variables that farmers, agronomists and solution providers collect to provide productivity insights and gains. This library approach allows us to standardise the data internally, rather than forcing established companies to ‘STOP’ and review their approach. We feel that, for the gain of all farmers and solution providers, this approach complements their adoption and development pathways, because there are, and always will be, several standards both here (in Australia) and abroad.
If you like what you have read or wish to ask any questions, please feel free to contact me. (Apologies in advance for my poor grammar, as I am only a farmer trying to deliver an essential service to Australian Ag. and provoke thought into this and other critical data conversations.)
Stay safe and free from COVID.
To re-live your youth or to see a great old Aussie Ad, click below