Parallel algorithms

Parallel algorithms#

graph-tool has support for shared memory parallelization via OpenMP for many algorithms, as is indicated in their docstring.

OpenMP support is optional, and must be enabled during compilation. To check whether it is available, the function openmp_enabled() can be used.

graph_tool.openmp_enabled()[source]

Return True if OpenMP was enabled during compilation.

Several parallelization parameters can be controlled at runtime, including the number of threads, the work sharing schedule, and the minimum number of nodes required for parallelization to be enabled.

graph_tool.openmp_get_num_threads()[source]

Return the number of OpenMP threads.

graph_tool.openmp_set_num_threads(n)[source]

Set the number of OpenMP threads.

graph_tool.openmp_get_schedule()[source]

Return the runtime OpenMP schedule and chunk size. The schedule can by any of: "static", "dynamic", "guided", "auto".

graph_tool.openmp_set_schedule(schedule, chunk=0)[source]

Set the runtime OpenMP schedule and chunk size. The schedule can by any of: "static", "dynamic", "guided", "auto".

graph_tool.openmp_get_thresh()[source]

Return the minimum number of vertices necessary to enable parallelization.

graph_tool.openmp_set_thresh(n)[source]

Set the the minimum number of vertices necessary to enable parallelization.

It’s possible to set these parameter temporarily using a context manager:

graph_tool.openmp_context(nthreads=None, schedule=None, chunk=0, thresh=None)[source]

Return a context manager that sets the tuntime OpenMP parameters, and restores the original values when exited.

For example, to constrain temporarily the number of threads to 3 and use a “guided” scheduling one could do:

>>> g = gt.collection.data["polblogs"]
>>> with gt.openmp_context(nthreads=3, schedule="guided"):
...     ret = gt.pagerank(g)

Parallelization can be disabled altogether in the same way, but using nthreads=1.

Warning

By default, graph-tool will configure the OpenMP wait policy to “passive”, since this usually results in improved performance for the majority of algorithms.

In order for change this behavor, the following environment variable should be set before the Python interpreted is evoked:

export OMP_WAIT_POLICY=active

or alternatively from Python before graph-tool is imported:

import os
os.environ["OMP_WAIT_POLICY"] = "active"

Due to an OpenMP API limitation, this can no longer be changed after graph-tool has been imported.

The global interpreter lock (GIL)#

graph-tool releases Python’s GIL as soon as the C++ implementations are reached, even for algorithms that are not implemented in parallel with OpenMP. This means that Python’s threading functionally can be used with many functions to achieve parallelism. For example, the following code will run several calls of subgraph_isomorphism() in parallel using 16 threads, which each individual call running sequentially:

>>> from concurrent.futures import ThreadPoolExecutor
>>> g = gt.collection.data["netscience"]
>>> def find_sub():
...     u = gt.random_graph(11, lambda: 4, directed=False, model="erdos")
...     gt.subgraph_isomorphism(u, g, max_n=100)
>>> with ThreadPoolExecutor(max_workers=16) as executor:
...     futures = [executor.submit(find_sub) for i in range(16)]
...     for future in futures:
...         future.result()

The same kind of functionality can be achieved with multiprocessing with a nearly identical interface, but the above offers smaller overhead, since no inter-process communication is needed.