
Similarity Analysis

A similarity analysis assesses the global context of a selected case based on one or more specified global variables. The analysis quantifies the differences between the case’s global variable value(s) and those of all other locations on the Earth’s land surface. These differences are presented as an index ranging from 0 (extremely dissimilar) to 1 (extremely similar), which illustrates the places around the globe that are alike or different from the selected case.

What can you do with a similarity analysis?

Explore – real-time quantification and visualization of the global relevance of your case to other places over the Earth’s land surface.

Discover - find cases similar to yours based on selected global variables and connect with the case authors.

Demonstrate - generate a published report of the similarity analysis as evidence of the global relevance of your case to other places on Earth’s land surface.

Here is how to conduct a similarity analysis:

1. Select/View a Case-Study

Select a case from your own cases (‘My Cases’) or the GLOBE Cases list to view case details. In this tutorial, we will use a case named “Phongsaly”. Select the icon at the top left of the case view to conduct an analysis.


2. Specify the Global Extent

The first step in the analysis is to determine the areas of the globe that you want to consider. You may or may not want to contextualize your case within all of the Earth’s land surface. For example, if your case studies deforestation in the tropics, you may not want to include areas without trees and/or outside of the tropics. To constrain the extent of your analysis, use the options under Analysis Filters. The area included in your analysis is colored brown, and as you add filters, the brown-colored area adjusts accordingly. With no filters selected, the entire global land surface is included. Two predefined filters are offered for your convenience. The image below illustrates a filtered analysis extent based on the “Tropical” filter.


Alternatively, or in addition, you can select and customize a filter based on the global variables available in GLOBE. Select “Add a new filter” and you will be presented with a selectable menu of global variables. Once selected, the variable will appear in a white box within the Analysis Filters panel. If the selected variable is continuous, you can use the sliders to limit the variable range for analysis. If the distribution of the selected variable is highly skewed, as it is for population density in the example below, the variable range can be entered manually by clicking on the blue label below the variable slider.
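
The effect of stacking range filters can be sketched in a few lines of Python. This is only an illustration of the idea, not GLOBE's implementation: the function `apply_range_filter`, the cell records, and the variable names `population_density` and `tree_cover` are all hypothetical.

```python
def apply_range_filter(cells, variable, low, high):
    """Keep only the cells whose value for `variable` lies in [low, high]."""
    return [cell for cell in cells if low <= cell[variable] <= high]


# Hypothetical land-surface cells with made-up values.
cells = [
    {"id": 1, "population_density": 3.0, "tree_cover": 80.0},
    {"id": 2, "population_density": 250.0, "tree_cover": 5.0},
    {"id": 3, "population_density": 12000.0, "tree_cover": 40.0},
]

# With no filters, the whole land surface is the analysis extent.
extent = cells

# Each added filter narrows the extent further. For a skewed variable,
# typing the range directly plays the role of the manual entry option.
extent = apply_range_filter(extent, "population_density", 0.0, 500.0)
extent = apply_range_filter(extent, "tree_cover", 10.0, 100.0)

print([cell["id"] for cell in extent])  # prints [1]
```

Because each filter is applied to the result of the previous one, the extent can only shrink as filters are added, which matches the behavior of the brown-colored area on the map.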


3. Select Global Variables

Select the variables that best characterize conditions at the case study site relative to other areas of the world. Select “Add variable” to view a selectable menu of global variables available within GLOBE. For a case study of a densely populated remote agricultural landscape in the humid tropics, one might select population density, market access, cropland percent, and tropical forest potential vegetation. You can also view the global distribution of each variable by clicking on “Show distribution”, which is demonstrated below.


4. Run the Similarity Analysis

Results are presented as statistics, histograms, and a map. The similarity analysis is calculated with ‘normalized Euclidean distance’. For more details about this method, please see this source, or for a more technical description, please see this source*.
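
To give a feel for what a similarity index based on normalized Euclidean distance might look like, here is a minimal Python sketch. The tutorial does not spell out the exact normalization GLOBE uses, so everything here is an assumption: each variable is rescaled to [0, 1] with min-max scaling, the distance is divided by the square root of the number of variables so that it also falls in [0, 1], and similarity is taken as one minus that distance. The function names are hypothetical.

```python
import math


def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def similarity_index(case, locations):
    """Return one similarity score in [0, 1] per location.

    case: variable values at the case site.
    locations: one list of variable values per location.
    """
    n_vars = len(case)
    # Normalize each variable over the case site plus all locations.
    columns = []
    for j in range(n_vars):
        column = [case[j]] + [loc[j] for loc in locations]
        columns.append(min_max_normalize(column))

    scores = []
    for i in range(len(locations)):
        squared = sum(
            (columns[j][i + 1] - columns[j][0]) ** 2 for j in range(n_vars)
        )
        # Dividing by n_vars keeps the distance within [0, 1].
        distance = math.sqrt(squared / n_vars)
        scores.append(1.0 - distance)
    return scores


# Made-up values for two variables at the case site and three locations.
case = [100.0, 0.5]
locations = [[100.0, 0.5], [0.0, 0.0], [200.0, 1.0]]
print(similarity_index(case, locations))  # prints [1.0, 0.5, 0.5]
```

Under these assumptions, a location with exactly the same variable values as the case scores 1, and a location at the opposite extreme of every variable scores 0.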

a. Interpret the map. Areas most similar to the site are highlighted in green and areas least similar in red. If areas that you know are different from the site appear green (or areas that you know are similar appear red), the variables can be changed and the analysis rerun. The statistics window can be hidden by clicking the statistics icon. Categories labeled in the legend correspond to the equal-interval bin labels in the similarity histogram (i.e., 0.9-1.0 = extremely similar; 0.0-0.1 = extremely dissimilar; equal-interval bins of 0.1).
b. Interpret the histogram. The similarity histogram illustrates the distribution of areas within the specified global extent in terms of their similarity to the selected site (same colors as the map). A tall green bar at the far right indicates that a large area is highly similar to the site.
c. Interpret the statistics. The global areas with different levels of similarity to the case study site are computed in km² and % area. The first line gives the entire delimited (filtered) extent of the analysis (as % of global land area), followed by areas with increasing global similarity (as % of delimited area).
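
The relationship between the similarity scores, the histogram bins, and the area statistics can be illustrated with a short Python sketch. It assumes ten equal-interval bins of width 0.1, as described above, and a known area in km² for each location; the function names and the example numbers are hypothetical and are not taken from GLOBE.

```python
def bin_index(score, n_bins=10):
    """Map a similarity score in [0, 1] to an equal-interval bin.

    Bin 0 covers 0.0-0.1 and bin 9 covers 0.9-1.0; a score of exactly 1
    is placed in the top bin.
    """
    return min(int(score * n_bins), n_bins - 1)


def area_by_bin(scores, areas_km2, n_bins=10):
    """Return (area in km2, % of the filtered extent) for each bin."""
    totals = [0.0] * n_bins
    for score, area in zip(scores, areas_km2):
        totals[bin_index(score, n_bins)] += area
    extent_area = sum(totals)
    return [
        (total, 100.0 * total / extent_area if extent_area else 0.0)
        for total in totals
    ]


# Made-up similarity scores and cell areas for four locations.
scores = [0.95, 1.0, 0.05, 0.55]
areas_km2 = [100.0, 300.0, 50.0, 50.0]

for i, (area, percent) in enumerate(area_by_bin(scores, areas_km2)):
    if area > 0:
        print(f"bin {i}: {area} km2, {percent}% of extent")
```

In this example the top bin holds 400 of the 500 km² in the extent, which would show up as a tall green bar at the far right of the histogram.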

5. Collect Similar Cases

Case studies in the GLOBE database are listed in order of their similarity to the selected case, with the most similar at the top. Cases can be selected and added to a new or existing collection for later use or analysis.

[Optional] If unsatisfied, rerun the analysis, changing the specified extent and/or variables.
[Optional] Save the similarity analysis. The analysis must then be named and can be shared (web link) and/or resumed.
[Optional] Publish the similarity analysis. The analysis is saved in a permanent (unalterable) format that serves as the document of record. It can be shared in a citable format (permalink) or printed as a .pdf for inclusion as a supplement in published work.



* Strehl, A., Ghosh, J., and Mooney, R. (2000). Impact of similarity measures on web-page clustering. Workshop on Artificial Intelligence for Web Search, 58-64.

© Copyright 2012 GLOBE. All Rights Reserved.