How to parallelize your functions
.map()
method.
You might use this for parallelizing computational-heavy tasks, such as batch inference or data processing jobs.
.map()
method can also parallelize functions that require multiple parameters. Simply pass a list of tuples, where each tuple corresponds to a set of arguments for your function.
Below is an example that counts how many prime numbers appear between a start and a stop index for each tuple in ranges:
ranges
is a list of tuples (start, stop)
.count_primes_in_range.map(ranges)
spawns a remote execution for each tuple, passing (start, stop)
to the function..map()
, Beam takes care of distributing each item (or tuple of items) to separate containers for parallel processing. This approach makes it easy to scale out CPU-heavy or data-intensive tasks with minimal code.