NEURAL NETWORK APPROACH FOR SMALL SCALE SOIL MAPPING

İlhami Bayramin
Ankara University, Agricultural Faculty, Soil Science Department, Ankara

ABSTRACT

Recent advances in space and computer technologies make it possible to process large amounts of multisource data about the Earth's environment: not only spectral data but also terrain data such as elevation, slope, aspect and relief. In earlier studies, Neural Network (NN) methods showed great potential for pattern recognition in multisource, remotely sensed data, and were reported to be superior to statistical methods in terms of classification accuracy. NN models have the advantage of being distribution-free, and they avoid the problem of determining the influence of each source in a multisource data setting. New advances in NN methods make it feasible to solve large-scale problems for which the computational complexity of the training methods was previously prohibitive. In this research, digital terrain data and Advanced Very High Resolution Radiometer (AVHRR) 10-day composite data acquired in 1992 and 1993 over Indiana and Illinois were used to test the ability of NN methods for small scale soil mapping.

INTRODUCTION

Earth observation sensors are producing data in very large and ever increasing quantities, with great potential for use in scientific and technological investigations (Atkinson and Tatnall, 1997). Artificial Neural Networks have been used to classify remotely sensed imagery since the late 1980s (Kanellopoulos and Wilkinson, 1997). Neural networks have been trained to perform complex functions in various fields of application, including pattern recognition, identification, classification, speech, vision, and control systems. Today neural networks can be trained to solve problems that are difficult for conventional computers or human beings (Demuth and Beale, 1992).

Recently, there has been a resurgence of research in neural networks. Neural network models have an advantage over statistical methods in that they are distribution-free: no prior knowledge of the statistical distributions of the classes in the data sources is needed in order to apply them for classification.

Neural network methods also take care of determining how much weight each data source should have in the classification. A set of weights describes the neural network, and these weights are computed in an iterative training procedure. On the other hand, neural network methods can be computationally very complex: many training samples are required for successful applications, and the iterative training procedures are usually slow to converge. Neural network methods also have more difficulty than statistical methods in classifying patterns that are not identical to one or more of the training patterns. The performance of neural network models in classification therefore depends more heavily on having representative training samples, whereas the statistical approaches require an appropriate model of each class (Benediktsson et al. 1990).

LITERATURE REVIEW

In remote sensing, classification is one of the main applications of neural networks. Howald (1989), McClelland et al. (1989), Hepner et al. (1990) and Downey et al. (1992) all applied neural networks to classify land cover from Landsat Thematic Mapper (TM) imagery, and all found, to varying degrees, that the neural approach was more accurate than traditional statistical classification. Compared with conventional statistical methods, neural networks have significant advantages for multisource data classification. Benediktsson et al. (1990) successfully applied neural networks to an integrated data set of Landsat MSS imagery and topographic data (elevation, slope and aspect) for land cover classification. A similar study was carried out by Peddle et al. (1994) to classify land cover in Alpine regions with multisource data. Ersoy and Hong (1990) studied a new neural network architecture called the parallel, self-organizing, hierarchical neural network (PSHNN). A multisource data set from a mountainous area in Colorado, consisting of Landsat MSS data and DEM data (elevation, slope, aspect), was used to address some important problems with NN methods, such as network complexity, learning and recall times, fault tolerance, and quality of generalization. They reported that one of the attractive properties of PSHNN is error detection at the end of each stage neural network (SNN). This makes it possible to avoid back-propagating errors from stage to stage to learn weights; to avoid the requirement for differentiable and invertible non-linearities; to shorten learning time, since few training vectors are used in later stages; to operate the SNNs in parallel during testing; and to adapt in real time to nonoptimal connection weights by adjusting the error detection bounds, thereby achieving very high fault tolerance and robustness.

MATERIAL AND METHODS

DATA SOURCES

The study area selected for this research includes central and southern Illinois and Indiana. Advanced Very High Resolution Radiometer (AVHRR) data collected by NOAA satellites and digital topographic data derived from a Digital Elevation Model (DEM) were used in this research. The AVHRR sensor on board the NOAA-10 satellite provides global multispectral coverage in five spectral bands: 0.58-0.68 µm (channel 1), 0.72-1.10 µm (channel 2), 3.55-3.93 µm (channel 3), 10.50-11.50 µm (channel 4) and 11.50-12.25 µm (channel 5). In addition to the 5 spectral bands, a Normalized Difference Vegetation Index (NDVI) data set was produced and added as a 6th band. The 3-arc-second DEMs available through the USGS Earth Resources Observation System (EROS) Data Center provide comprehensive coverage of the entire US. These elevation data are at a scale of 1:250,000 (cell resolution of about 100 m) and are available over the Internet at no cost. The 3-arc-second DEM data for Illinois and Indiana were downloaded from the Internet. Two different sets of reference soil maps (Soil Region Map and Major Land Resource Areas, adapted from STATSGO) were used.
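The text does not give the formula used for the NDVI band. The sketch below assumes the standard definition, NDVI = (NIR - red) / (NIR + red), with AVHRR channel 1 taken as the red band and channel 2 as the near-infrared band. The function names and the reflectance values are hypothetical and serve only as an illustration.

```python
def ndvi(ch1_red, ch2_nir):
    """Return NDVI = (NIR - red) / (NIR + red) for one pixel.

    ch1_red: AVHRR channel 1 (visible red) reflectance
    ch2_nir: AVHRR channel 2 (near-infrared) reflectance
    Returns 0.0 when both inputs are zero (an arbitrary fill choice).
    """
    total = ch2_nir + ch1_red
    if total == 0:
        return 0.0
    return (ch2_nir - ch1_red) / total


def ndvi_band(ch1_rows, ch2_rows):
    """Apply ndvi() cell by cell to two equally shaped 2-D grids."""
    return [
        [ndvi(red, nir) for red, nir in zip(red_row, nir_row)]
        for red_row, nir_row in zip(ch1_rows, ch2_rows)
    ]


# Hypothetical reflectance values, for illustration only.
channel_1 = [[0.08, 0.10], [0.20, 0.05]]
channel_2 = [[0.40, 0.30], [0.22, 0.45]]
sixth_band = ndvi_band(channel_1, channel_2)
```

The resulting grid has the same shape as the input channels, so it can be handled as one more layer alongside the five spectral bands.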

METHODS

DATA PREPARATION : From the files of cloud-free data, three sets of AVHRR 10-day composite data (1-10 April 1992, 21-30 July 1993, 21-30 September 1993) were downloaded from the Internet. The data covering the central Midwest of the USA (including Indiana and Illinois) were registered to the Albers Equal Area map projection. The DEM derivative layers Slope (the maximum rate of change in z value from each cell to its neighbours), Aspect (positive degrees from 0 to 360, measured clockwise from north), Relief (the range of elevation) and Landform Class (Hammond approach) were produced using ArcInfo/Grid software.
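The DEM derivatives were produced with ArcInfo/Grid, and the exact algorithm is not described in the text. As a minimal sketch, the code below assumes a common 3x3 finite-difference scheme (Horn's method) for slope and aspect, and takes relief as the elevation range within the window. The function names, the window size and the flat-cell code of -1 are assumptions made for illustration.

```python
import math


def slope_aspect(window, cell_size):
    """Slope and aspect (degrees) at the centre of a 3x3 elevation window.

    window[0] is the northern row; cell_size uses the same units as z.
    Aspect is the downslope direction, clockwise from north; -1 marks flat cells.
    """
    (a, b, c), (d, _centre, f), (g, h, i) = window
    # Weighted finite differences across the window (assumed Horn scheme).
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)  # west to east
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)  # north to south
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, -1.0
    angle = math.degrees(math.atan2(dz_dy, -dz_dx))
    aspect = (90.0 - angle) % 360.0  # convert to compass bearing
    return slope, aspect


def relief(window):
    """Range of elevation (max minus min) inside the window."""
    values = [z for row in window for z in row]
    return max(values) - min(values)


# Hypothetical window: terrain rising towards the east, 1 unit per cell.
east_rising = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
slope_deg, aspect_deg = slope_aspect(east_rising, cell_size=1.0)
local_relief = relief(east_rising)
```

For the eastward-rising plane the downslope direction is west, so the aspect is measured as a bearing of 270 degrees under the clockwise-from-north convention stated above.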

Two different sets of reference soil maps, the Soil Region Map and Major Land Resource Areas (MLRA), were prepared from STATSGO. STATSGO is a digital general soil association map developed by the National Cooperative Soil Survey; its soil maps are compiled by generalizing more detailed soil survey maps. STATSGO data for Illinois and Indiana were downloaded from the Internet at no cost. To prepare an MLRA map of central and southern IL and IN as a reference map, STATSGO map units with 200 m spatial resolution were resampled to 1000 m spatial resolution and reclassified (merged) into 13 classes according to their MUID (map unit identification) code. In addition to the unified soil region map of Illinois and Indiana, soil region maps of Illinois and Indiana were prepared separately. The existing general soil region maps have not been digitized, so for this research STATSGO map units were reclassified (grouped) on the basis of information from these general maps to prepare the Soil Region Reference Map of Indiana and Illinois.
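The resampling rule used for the categorical STATSGO layer is not stated. The sketch below assumes a majority (modal) rule, in which each coarse cell takes the most frequent map unit of the fine cells it covers, followed by a lookup-table reclassification by MUID. The MUID codes, the class numbers and the function names are hypothetical; going from 200 m to 1000 m would correspond to factor=5, while the small example uses factor=2 to stay readable.

```python
from collections import Counter

# Hypothetical lookup table: STATSGO MUID code -> merged class number.
MUID_TO_CLASS = {"IN001": 1, "IN002": 1, "IL010": 2, "IL011": 3}


def majority_resample(grid, factor):
    """Aggregate factor x factor blocks into one cell holding the modal value.

    Ties are resolved in favour of the value met first in the block.
    Rows and columns that do not fill a whole block are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            block = [
                grid[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            out_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(out_row)
    return coarse


def reclassify(muid_grid, lookup, nodata=0):
    """Replace each MUID by its merged class; unknown codes become nodata."""
    return [[lookup.get(muid, nodata) for muid in row] for row in muid_grid]


# Hypothetical 4x4 grid of map unit codes.
fine = [
    ["IN001", "IN001", "IL010", "IL010"],
    ["IN001", "IN002", "IL010", "IL011"],
    ["IL011", "IL011", "IN002", "IN002"],
    ["IL011", "IL010", "IN002", "IN001"],
]
coarse_muid = majority_resample(fine, factor=2)
reference_map = reclassify(coarse_muid, MUID_TO_CLASS)
```

A majority rule is one reasonable choice for class data because averaging map unit codes would be meaningless; nearest-neighbour resampling would be another option.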

DATA INTEGRATION : After all data sets had been resampled to the same spatial resolution (1000 m) and the same map projection (Albers Equal Area), all AVHRR 10-day composite data and DEM layers (aspect, DEM, relief, slope, landform) were integrated into a new data set.
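One way to picture the integrated data set is as a feature vector per pixel, with one value from each co-registered layer. The sketch below assumes this layout and adds a min-max rescaling step, which is not mentioned in the text and is included only as a common preparation for network inputs. Layer names, values and function names are hypothetical.

```python
def stack_layers(layers):
    """Turn a dict of equally shaped 2-D grids into per-pixel feature vectors.

    Returns (names, vectors); vectors[k] lists the layer values of pixel k,
    with pixels taken in row-major order.
    """
    names = list(layers)
    grids = [layers[name] for name in names]
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    for grid in grids:
        if len(grid) != n_rows or any(len(row) != n_cols for row in grid):
            raise ValueError("all layers must share one grid shape")
    vectors = [
        [grid[r][c] for grid in grids]
        for r in range(n_rows)
        for c in range(n_cols)
    ]
    return names, vectors


def min_max_scale(vectors):
    """Rescale each feature to [0, 1]; constant features map to 0 (assumed step)."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(vec, lows, spans)]
        for vec in vectors
    ]


# Hypothetical 2x2 layers already on the same grid.
layers = {
    "ndvi": [[0.2, 0.6], [0.4, 0.8]],
    "elevation": [[150, 200], [250, 150]],
    "slope": [[1.0, 1.0], [1.0, 1.0]],
}
names, vectors = stack_layers(layers)
scaled = min_max_scale(vectors)
```

The shape check mirrors the requirement stated above that all layers share one resolution and projection before they are combined.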

ARTIFICIAL NEURAL NETWORKS : The Back Propagation learning rule was used in this research. Back propagation was created by generalizing the Widrow-Hoff learning rule to multiple-layer networks and non-linear differentiable transfer functions. The MATLAB Neural Network Toolbox was used to carry out the neural network classification. Two reference maps were tested (the MLRA and the Soil Region Map of IL-IN), and two different training sampling strategies were used: 1 - Non-Random signatures; and 2 - Random signatures.
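The study used the MATLAB Neural Network Toolbox, and the network architecture is not reported. The sketch below is therefore not the author's implementation. It illustrates plain back-propagation on a network with one hidden layer of sigmoid units and a squared-error criterion. The class name TinyMLP, the layer sizes, the learning rate, the number of epochs and the training values are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One hidden layer, sigmoid units, trained by plain back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(unit, x + [1.0])))
                  for unit in self.w_hidden]
        output = [sigmoid(sum(w * v for w, v in zip(unit, hidden + [1.0])))
                  for unit in self.w_out]
        return hidden, output

    def train_step(self, x, target, rate):
        hidden, output = self.forward(x)
        # Output deltas: error times the sigmoid derivative o * (1 - o).
        d_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
        # Hidden deltas: output deltas propagated back through w_out.
        d_hid = [h * (1.0 - h) * sum(d * unit[j] for d, unit in zip(d_out, self.w_out))
                 for j, h in enumerate(hidden)]
        for unit, d in zip(self.w_out, d_out):
            for j, v in enumerate(hidden + [1.0]):
                unit[j] -= rate * d * v
        for unit, d in zip(self.w_hidden, d_hid):
            for j, v in enumerate(x + [1.0]):
                unit[j] -= rate * d * v

    def loss(self, samples):
        """Mean squared error per sample, summed over the output units."""
        return sum((o - t) ** 2
                   for x, target in samples
                   for o, t in zip(self.forward(x)[1], target)) / len(samples)

    def predict(self, x):
        output = self.forward(x)[1]
        return output.index(max(output))


# Hypothetical training pixels: two scaled features, three one-hot classes.
samples = [
    ([0.1, 0.2], [1, 0, 0]), ([0.2, 0.1], [1, 0, 0]),
    ([0.9, 0.2], [0, 1, 0]), ([0.8, 0.1], [0, 1, 0]),
    ([0.5, 0.9], [0, 0, 1]), ([0.4, 0.8], [0, 0, 1]),
]
net = TinyMLP(n_in=2, n_hidden=4, n_out=3, seed=1)
loss_before = net.loss(samples)
for _epoch in range(2000):
    for x, target in samples:
        net.train_step(x, target, rate=0.5)
loss_after = net.loss(samples)
```

With a single layer and a linear transfer function the weight update reduces to the Widrow-Hoff rule; the hidden-layer deltas are the part that back propagation adds.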

ACCURACY ASSESSMENT : At the completion of the classification exercise it is necessary to assess the accuracy of the results obtained. This allows a degree of confidence to be attached to those results and serves to indicate whether the analysis objectives have been achieved. Accuracy is determined empirically, by selecting a sample of pixels from the thematic map and checking their labels against classes determined from reference data. Reference data are often referred to as ground truth. From these checks, the percentage of pixels from each class erroneously labeled into each of the other classes is calculated. These results are then expressed in tabular form, often referred to as a confusion or error matrix. Classification results for the training samples and for the image data were calculated and reported for all images in this research. Since it is recognized that the reference maps represent at best an approximation of truth, the use of the terms "error" or "accuracy" often leads to misunderstanding or misinterpretation of classification results. For this reason, "agreement" was used instead of "error" or "accuracy" in this research.
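The error matrix and the agreement percentages described above can be sketched as follows. The row and column orientation (rows as reference classes, columns as classified labels), the function names and the pixel labels are assumptions chosen for illustration.

```python
def error_matrix(reference, classified, classes):
    """Count pixels by (reference class, classified label).

    Rows are reference classes and columns are classified labels.
    """
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, lab in zip(reference, classified):
        matrix[index[ref]][index[lab]] += 1
    return matrix


def overall_agreement(matrix):
    """Percentage of all checked pixels whose two labels agree."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[k][k] for k in range(len(matrix)))
    return 100.0 * diagonal / total


def per_class_agreement(matrix):
    """Percentage of each reference class that received the same label."""
    return [100.0 * row[k] / sum(row) if sum(row) else 0.0
            for k, row in enumerate(matrix)]


# Hypothetical labels for ten checked pixels.
reference = [1, 1, 1, 2, 2, 3, 3, 3, 3, 2]
classified = [1, 1, 2, 2, 2, 3, 3, 1, 3, 3]
matrix = error_matrix(reference, classified, classes=[1, 2, 3])
overall = overall_agreement(matrix)
by_class = per_class_agreement(matrix)
```

The off-diagonal cells show which classes are confused with each other, which a single overall percentage cannot reveal.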

RESULTS and DISCUSSION

One of the main objectives of this research was to test the ability of NN methods for small scale soil mapping using digital terrain data and Advanced Very High Resolution Radiometer (AVHRR) 10-day composite data. One of the main tasks for the classification was choosing the small-scale reference map(s). For this reason, the only available digital data set, STATSGO, was used to generate small-scale soil maps. Two different sets of reference soil maps, the Soil Region Map and Major Land Resource Areas, were derived from STATSGO for both states.

Neural network classification results for the integrated imagery (AVHRR 10-day composite data set and topographic data set) are discussed for both reference maps, the MLRA and the Soil Region Map. Two different training sampling strategies were used: 1 - Non-Random signatures; and 2 - Random signatures. First, the training samples were tested many times to obtain the highest training pixel classification accuracies; then these training samples were used for the image data classification.
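The text does not describe how the random signatures were drawn. As one possible reading, the sketch below draws training pixels at random within each reference class (a stratified draw, which is an assumption). The function name, the sample size and the labels are hypothetical.

```python
import random


def random_training_sample(labels, n_per_class, seed=0):
    """Draw up to n_per_class pixel indices at random from each class.

    labels: reference class of every pixel, in row-major order.
    Returns a dict mapping each class to a sorted list of pixel indices.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    sample = {}
    for label, indices in by_class.items():
        k = min(n_per_class, len(indices))
        sample[label] = sorted(rng.sample(indices, k))
    return sample


# Hypothetical reference labels for ten pixels.
pixel_labels = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
training = random_training_sample(pixel_labels, n_per_class=2, seed=42)
```

Fixing the seed makes a draw repeatable, which matters when the same training samples are reused for the image data classification.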

RESULTS

Overall Training Pixel Classification Accuracies (TPCA) and Image Data Classification Agreements (IDCA) for both studies (MLRA and Soil Region reference maps) are presented in Tables 1 and 2. In the first step, the MLRA reference map was used to compare the AVHRR data alone with the integrated AVHRR and topographic data set. The highest training pixel classification accuracy (62.1%) was observed with the 10-day composite data set under non-random (supervised) sampling. For the image data classification, the 10-day composite data integrated with the topographic data set gave the highest agreement (41.9%) under random sampling.



Because integrating the topographic data increased the agreement values, the integrated satellite and topographic data set was used for the Soil Region study. In addition to the topographic data set, the effect of the Hammond landform classification was tested. The highest TPCA (62.0%) was found for the 10-day composite data integrated with topographic data (10d+adrs) under non-random sampling. The highest IDCA (44.2%) was observed for the 10d+adrs band combination under random sampling. In general, non-random (supervised) sampling gave higher values than random sampling.



Class weights (coverage %) were important factors in the TPCA and IDCA results. Error matrices were very useful for evaluating the study results. The IDCA error matrix of the 10d+adrs (R) image data set is presented in Table 3.



The IDCA error matrix of the 10d+adrs (R) image data set reveals that IDCA was 0.0% for 9 of the 18 classes: Classes 1, 2, 3, 10, 12, 14, 16, 17 and 18. Apart from Class 3 (14.7% area coverage), these classes (1, 2, 10, 12, 14, 16, 17 and 18) together cover only 6.3% of the study area. Class 4, with the largest coverage of the study area (27.3%), gave a relatively high IDCA (63.5%). IDCA was very low (4.5%, 25.1%, 1.0% and 1.3%) for Classes 5, 7, 9 and 15, which together occupy 11.1% of the study area. Classes 8, 11 and 13 gave very high IDCA (68.2%, 83.5% and 89.3%). Misclassified pixels were generally in Class 4 (27.3%), Class 7 (5.9%), Class 8 (6.3%), Class 11 (14.6%) and Class 13 (4.8%).
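The role of class coverage can be illustrated with simple arithmetic on the figures quoted above. The sketch assumes that the overall IDCA is the coverage-weighted mean of the per-class IDCA values, and that the quoted coverage percentages are shares of the reference map. Only classes for which both an individual coverage and an IDCA value are quoted are included, so the result is a partial sum and not a reconstruction of Table 3.

```python
# (class, area coverage %, IDCA %) as quoted in the text for 10d+adrs (R).
quoted = [
    (4, 27.3, 63.5),
    (7, 5.9, 25.1),
    (8, 6.3, 68.2),
    (11, 14.6, 83.5),
    (13, 4.8, 89.3),
]

# Contribution of each class to overall agreement, in percentage points,
# under the assumption that overall IDCA is a coverage-weighted mean.
contributions = {cls: cover * idca / 100.0 for cls, cover, idca in quoted}
partial_total = sum(contributions.values())

# Class 3 covers 14.7% of the area but has 0.0% IDCA, so it adds nothing.
class_3_contribution = 14.7 * 0.0 / 100.0

for cls, points in contributions.items():
    print(f"Class {cls}: {points:.2f} percentage points")
print(f"Partial total: {partial_total:.2f} of the reported 44.2")
```

Under these assumptions, five classes account for most of the reported overall agreement, while a large class with zero agreement, such as Class 3, lowers the attainable overall value considerably.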

DISCUSSION

The major objective of this study was to assess the utility of satellite multispectral images and digital topographic data as supplemental tools for the generation of small scale soil maps, and to make recommendations for the possible improvement of existing small scale maps. The approach was to use a variety of existing maps which represent, in some way, generalized soil patterns and properties in the states of Illinois and Indiana. These generalized maps of the soils of Illinois and Indiana were used as the standard or "ground truth" against which to compare quantitatively the maps or images derived from the integration and classification of satellite images and topographic data. The maps used as the standard for comparison were Soil Associations (STATSGO), Major Land Resource Areas, and Soil Region Maps.

The attempt to integrate and compare a broad range of disparate sources of spatial data is fraught with difficulties and sometimes questionable assumptions. Even so, some interesting observations and results were obtained, and much was learned in this research effort about the preparation, manipulation, analysis and interpretation of spatial data sets. Sources of error cannot always be identified, and even when they are identified it may be impossible to quantify the amount of error attributable to a specific source. Some of the known errors, whether inherent in the data source or introduced in the preparation of the data, lie in the registration or co-registration process. Errors are introduced when 100 m DEM data are generalized to 1000 m data. Information is lost and/or noise is introduced in the process of merging extremely small classes with larger classes for reclassification.

The date on which remotely sensed data are collected affects how well meaningful variations in soil patterns and conditions can be identified and delineated, and it can have dramatic effects on classification results. How can one interpret or evaluate quantitatively the differences in classification results from multispectral data collected over the same area but on different dates and in different seasons? Even with all these temporal, spectral and spatial variables, it was possible to draw some conclusions and observe trends in the relationships between the classifications obtained from combinations of spectral and topographic data and the existing soil maps.

The soil association map was derived or adapted from the STATSGO map. Even though the same guidelines were used in preparing the STATSGO maps of Indiana and Illinois, the maps of the two states lack uniformity, and soil unit boundaries at the state boundary often appear unnatural. In several cases the map unit lines near the state boundary appear to have been brought to closure artificially rather than crossing the state boundary as it appears they should. The density of STATSGO map units in Indiana appears to be significantly higher than in Illinois. This might explain why quite different "accuracies" are found when a supervised classification of the data (satellite and topographic) for Indiana and Illinois is compared with a combined (both states) soil association (STATSGO) map. When the same study is made for each state separately, the "accuracy" of the classification for Illinois alone is significantly higher than for the combination of the two.

The scope of this study did not include an evaluation of the accuracy of the existing maps that were used as the standards for comparison with the classifications of the study areas. For the purposes of this study, the existing maps were assumed to be "truth." Everyone who maps natural features of the earth's surface (soils, geology, vegetative cover) knows the perils of trying to represent those features accurately on a map, and of assuming the map to be truth. At best, any map is an approximation of what the mapper believes his/her observations to represent. The statement that a supervised classification of a spatial data set (digital satellite image, DEM) is 61% accurate suggests that the existing map is correct and that the classification is only fair or wrong. In reality, the classified image may be a better representation of the soil patterns and of the variations in soil properties and conditions than the existing map is. Rather than express the difference between the existing soil map and the classified image as "classification accuracy," it would be more appropriate to express this difference as the "percentage of pixels which are in agreement." To evaluate these results, it would then be necessary to conduct a statistically sound field study to verify or assess the quality of both the existing map and the classification.

Neural Network (NN) methods showed great potential in pattern recognition for multisource, remotely sensed data, and satisfactory results were observed. They have been reported to be superior to statistical methods in terms of classification accuracy. NN models have the advantage of being distribution-free, and they avoid the problem of determining the influence of each source in a multisource data setting. NN methods also make it feasible to solve large-scale problems for which the computational complexity of the training methods was previously prohibitive. Very long processing time was one of the main disadvantages of the NN methods. In addition, the networks have to be trained very well to reach higher accuracies.

The mapping of soils is an expensive and time-consuming enterprise. During recent years, the emergence of a broad array of new sensors and earth observation tools, access to remarkably accurate and reliable global positioning systems (GPS), and the availability and use of GIS technology offer to the soil science community a previously unavailable set of tools for the generation of better, more accurate, and more accessible spatial soils information systems. However, there is much research and development and education yet to be done before the integration and use of these new technologies in soil classification and survey become a reality.

REFERENCES

ATKINSON, P. M. and TATNALL, A. R. L., 1997, Neural networks in remote sensing. International Journal of Remote Sensing, 18, 699-709.

BENEDIKTSSON, J. A., SWAIN, P. H. and ERSOY, O. K., 1990, Neural network approaches versus statistical methods in classification of multisource remote sensing data. IEEE Transactions on Geoscience and Remote Sensing, 28(4), 540-551.

DEMUTH, H. and BEALE, M., 1992, Neural Network Toolbox User's Guide. The Math Works Inc. 24 Prime Park Way, Natick, MA 01760.

ERSOY, O. K., and HONG, D., 1990, Parallel self-organizing, hierarchical neural networks. IEEE Transactions on Neural Networks, 1, 167-178.

HEPNER, G. F., LOGAN, T., RITTER, N. and BRYANT, N., 1990, Artificial neural network classification using a minimal training set: comparison to conventional supervised classification. Photogrammetric Engineering and Remote Sensing, 56, 469-473.

HOWALD, K. J., 1989, Neural network image classification. Proceedings of the ASPRS-ACSM Fall Convention (Falls Church, VA: ASPRS), 207-215.

KANELLOPOULOS, I. and WILKINSON, G. G., 1997, Strategies and best practice for neural network image classification. International Journal of Remote Sensing, 18, 711-725.

McCLELLAND, G. E., DeWITT, R. N., HEMMER, T. H., MATHESON, L. N. and MOE, G. O., 1989, Multispectral image-processing with three-layer back-propagation network. Proceedings of the International Joint Conference on Neural Networks, 1 (New York: I.E.E.E.), 151-153.

PEDDLE, D. R., FOODY, G. M., ZHANG, A., FRANKLIN, S. E. and LEDREW, E.F., 1994, Multisource image classification II: an empirical comparison of evidential reasoning, linear discriminant analyses, and maximum likelihood algorithms for alpine land cover classification. Canadian Journal of Remote Sensing, 20, 397-408.
