Efficient Lock-Free Work-stealing Iterators for Data-Parallel Collections
Aleksandar Prokopec, Dmitry Petrashko and Martin Odersky

Data-parallel
hugeCollectionOfInts.sum()
has low work per element
Real life:
irregularities
Irregular workload
collection.map(x => fun(x))
WorkStealing scheduler
Threads steal work from each other.
No cooperation from target thread needed, lock free.
A. Prokopec and M. Odersky. Near optimal work-stealing tree scheduler for highly irregular data-parallel workloads. LCPC 2013
Even more real life:
Overheads
Types of overhead
- Data access overhead,
- Scheduling overhead,
- Invocation overhead. Won't be covered here.
Data access overhead
Data consumer abstracts over data source:
Inefficiencies:
- iteration scheme(eg iterate over tree as if over indexed collection),
- redundant data copying,
- boxing (Jvm, Python, Ruby, JavaScript).
How severe is it?
| unboxed | boxed |
| a + b | box(unbox(a) + unbox(a)) |
~13x slowdown
Scheduling overhead
Lets say we have a HashMap:
Most HashMaps by nature create irregular workload.
Efficient scheduling has to be data-structure aware.
Work-stealing Iterators
Define:
- how to iterate;
- how to steal work.
Transparent for user.
Easy to implement for developer.
Benchmarks
For cycle:
Benchmarks
Irregular workload:
More details?
See paper :-).
Or ask questions.
Takeaway message
Eliminating abstraction penalties is necessary to realistically assess the scheduling quality.
High abstraction penalties could entirely mask scheduling penalties.
Efficient Lock-Free Work-stealing Iterators for Data-Parallel Collections
Aleksandar Prokopec, Dmitry Petrashko and Martin Odersky