Self-Organizing Maps

Summary	Classification of multivariate data using Self-Organizing Maps (SOM), an AI technique. This project includes a Geosoft Python script that runs from Oasis montaj and applies the technique to a Geosoft database.
Researchers	Ian MacLeod, Rob Ellis, Telma Aisengart
Github	https://github.com/GeosoftInc/Geoscience/tree/master/Open%20research/self_organizing_maps
Publications	Quantitative Magnetization Inversion; Self-Organizing Maps applied to Magnetization Vector Inversion; Application of SOM technique to the Magnetization vector
See also	Cracknell, 2014, a SOM analysis of Australia

Installation

Install Oasis montaj.
Install Python (see https://github.com/GeosoftInc/gxpy/wiki/Python-menu-for-Geosoft-Desktop)
Install the required Python packages:
```
pip install ipykernel
pip install pyqt5
```
Copy the *.py files from https://github.com/GeosoftInc/Geoscience/tree/9.4.0/Open%20research/self_organizing_maps into your Geosoft user/Python folder (C:\Program Files\Geosoft\Desktop Applications 9\user\python).
Open your Oasis montaj project, create a Geosoft database containing the multivariate data, and display only the data channels to be analysed.
Run the som_om.py file (Run a GX, change the file type to *.py, navigate to C:\Program Files\Geosoft\Desktop Applications 9\user\python and select som_om.py).
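Before running the script, it can help to confirm that the packages installed above are visible to the Python interpreter. The following is a minimal sketch, not part of the project; the helper name `missing_packages` is hypothetical, and it assumes you run it with the same Python installation that Oasis montaj uses. Note that the `pyqt5` pip package is imported under the name `PyQt5`.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages from the pip step can be imported.
print(missing_packages(["ipykernel", "PyQt5"]))
```

If the list is not empty, repeat the `pip install` step for the listed packages.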

Usage Notes

This GX analyses multivariate data by sorting it into statistically meaningful groups using SOM neural-network analysis. The technique is described here:

http://en.wikipedia.org/wiki/Self-organizing_map

Up to 16 separate data channels can be analysed. The first three channels are used to colour-code the resulting SOM neural network report. The report provides a simple visual impression of the classifications.

The size (side length) of the neural network is the square root of the number of classifications, which can be any perfect square: 4, 9, 16, 25, 36, 49, etc. Two new channels are added (or replaced) in the database:

Class – the classification, from 0 to the number of classifications minus 1.
EuD – Euclidean distance of the point from its assigned neuron class, which is the closest neuron by Euclidean distance.
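To make the two output channels concrete, here is a minimal NumPy sketch of a square SOM. It is not the code in som_om.py: the function names `train_som` and `classify`, the iteration count, the learning-rate schedule, and the Gaussian neighbourhood are all assumptions chosen for illustration. It only shows how a class index and a Euclidean distance can be derived for every data row.

```python
import numpy as np


def train_som(data, n_classes, n_iter=2000, seed=0):
    """Train a square SOM; returns neuron weights, shape (n_classes, n_channels)."""
    side = int(round(np.sqrt(n_classes)))
    if side * side != n_classes:
        raise ValueError("n_classes must be a perfect square (4, 9, 16, ...)")
    rng = np.random.default_rng(seed)
    # Position of each neuron on the side x side grid (used by the neighbourhood).
    grid = np.array([(i, j) for i in range(side) for j in range(side)], dtype=float)
    # Start the neurons at randomly chosen data points.
    weights = data[rng.choice(len(data), size=n_classes, replace=True)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac)                        # learning rate decays linearly
        sigma = max(side / 2.0 * (1.0 - frac), 0.5)    # neighbourhood radius shrinks
        x = data[rng.integers(len(data))]              # one random training sample
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching neuron
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)   # squared grid distance to it
        h = np.exp(-d2 / (2.0 * sigma ** 2))           # Gaussian neighbourhood weight
        weights += lr * h[:, None] * (x - weights)     # pull neurons toward the sample
    return weights


def classify(data, weights):
    """Return (Class, EuD): nearest neuron index and Euclidean distance to it."""
    dist = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    cls = np.argmin(dist, axis=1)
    eud = dist[np.arange(len(data)), cls]
    return cls, eud


# Synthetic stand-in for three database channels.
data = np.random.default_rng(1).normal(size=(500, 3))
weights = train_som(data, n_classes=9)
cls, eud = classify(data, weights)   # cls lies in 0..8, eud is non-negative
```

Here `cls` plays the role of the Class channel and `eud` the role of the EuD channel.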

If an anomalous percent above 0 is specified, that percentage of the data furthest from the network (by Euclidean distance) is classified again on its own into a second SOM network of the same dimension, so that there are twice as many classes. For example, choosing 16 classifications and 5% anomalous data will result in 32 classes (16+16), where classes 0 to 15 capture the 95% of data that most closely fits the original network, and classes 16 to 31 are the most extreme 5%, reclassified in their own network.
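The class numbering in this example can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the script's own logic: the helper names `anomalous_mask` and `merge_classes` are hypothetical, the percentile threshold is one possible way to pick the furthest points, and the classes of the second network are stand-in values rather than the output of a real SOM run.

```python
import numpy as np


def anomalous_mask(eud, percent):
    """True for roughly the `percent` of points with the largest EuD."""
    if percent <= 0:
        return np.zeros(len(eud), dtype=bool)
    threshold = np.percentile(eud, 100.0 - percent)
    return eud > threshold


def merge_classes(cls, anomalous_cls, mask, n_classes):
    """Offset the second network's classes by n_classes and merge them in."""
    merged = cls.copy()
    merged[mask] = anomalous_cls + n_classes
    return merged


n_classes = 16
eud = np.arange(100, dtype=float)          # synthetic distances, one per data point
cls = np.arange(100) % n_classes           # stand-in for the first SOM's classes
mask = anomalous_mask(eud, percent=5)      # selects the 5 most distant points
anomalous_cls = np.array([0, 3, 15, 7, 1]) # stand-in for the second SOM's classes
merged = merge_classes(cls, anomalous_cls, mask, n_classes)
```

After merging, the 95 points outside the mask keep classes 0 to 15, and the 5 masked points take classes 16 to 31.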

To work with gridded or voxel data:

Export a grid or voxel to a new database.
Sample other grids/voxels into new channels of the database.
Run the SOM analysis.
Save the "Class" channel as a grid or voxel.
Imagination encouraged!
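Outside Oasis montaj, the same round trip (grids to table rows, classification, classes back to a grid) can be sketched with NumPy arrays. This is only an analogy to the steps above and does not use the Geosoft API; the helper names `grids_to_table` and `classes_to_grid` are hypothetical, missing cells are assumed to be NaN, and the class values are stand-ins for a real SOM result.

```python
import numpy as np


def grids_to_table(grids):
    """Stack equally shaped 2D grids into rows of shape (n_valid, n_channels).

    Also returns a boolean mask of the cells that are valid in every grid.
    """
    stack = np.stack(grids, axis=-1)           # (ny, nx, n_channels)
    valid = ~np.isnan(stack).any(axis=-1)      # drop cells with any missing value
    return stack[valid], valid


def classes_to_grid(cls, valid):
    """Write one class per valid cell back into a grid; other cells stay NaN."""
    grid = np.full(valid.shape, np.nan)
    grid[valid] = cls
    return grid


grid_a = np.arange(12, dtype=float).reshape(3, 4)
grid_b = grid_a * 2.0
grid_b[1, 2] = np.nan                          # one missing cell
table, valid = grids_to_table([grid_a, grid_b])
cls = np.arange(len(table)) % 4                # stand-in for the SOM "Class" channel
class_grid = classes_to_grid(cls, valid)
```

Each row of `table` corresponds to one database row, and `class_grid` corresponds to the grid saved from the "Class" channel.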

For example, you can classify MVI results based on the X, Y and Z vector directions and the amplitude, which effectively double-weights the amplitude in the analysis. No normalization is necessary, because the directions are signed, the amplitude is a scalar, and all dimensions are likely to be reasonably and meaningfully scaled relative to each other. The process would be:

Create a database from the amplitude voxel.
Sample the X, Y and Z directions into mx, my, mz channels.
Run a SOM, dimension 9 (or 16), anomalous percent 5.
Convert Class channel to voxel and view.
Clip the data to class 9 and above (or 16 and above); this is the SOM of the anomalous locations.
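The input table and the final clipping step of this process can be sketched as follows. This is a hedged illustration only: the helper name `anomalous_only` is hypothetical, the amplitude is computed here as the vector magnitude as a stand-in for values read from the amplitude voxel, and the class values are invented rather than produced by a SOM run. With 9 classifications, the first network uses classes 0 to 8, so the anomalous network starts at class 9.

```python
import numpy as np


def anomalous_only(cls, n_classes):
    """Keep classes from the second (anomalous) network; others become NaN."""
    cls = np.asarray(cls, dtype=float)
    return np.where(cls >= n_classes, cls, np.nan)


# Synthetic vector components for five voxel cells.
mx = np.array([0.1, -0.2, 0.0, 0.5, -0.4])
my = np.array([0.0, 0.1, -0.3, 0.2, 0.1])
mz = np.array([-0.1, 0.2, 0.4, -0.6, 0.0])
amp = np.sqrt(mx**2 + my**2 + mz**2)           # stand-in for the amplitude voxel

# Four channels per point; the amplitude appears both directly and through mx, my, mz.
features = np.column_stack([amp, mx, my, mz])

cls = np.array([0, 8, 9, 17, 4])               # stand-in Class values (9 + 9 classes)
clipped = anomalous_only(cls, n_classes=9)     # only the cells with class 9 and 17 remain
```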