multithreading - Multithreaded merge, best Java threading practices recommended
Problem: merge 1000 hash maps into a single map. Assume each HashMap contains the letters and their frequencies for one page of a book, and the book has 1000 pages. I have scanned through each page and created 1000 HashMaps, and now I want to reduce/merge them. This has to be done by taking advantage of multithreading. Note: I am not using Hadoop; this has to be done on a single machine. The question is tailor-made to get my doubts resolved, so please refrain from answers that suggest bypassing threading.
Is this a typical problem with a known solution? If yes, please point me to reference links.
If not, how should I go about the reduce/merge problem, given that threads don't return values? Here is my suggested approach: work in a divide-and-conquer manner. First spawn 500 threads, each combining 2 maps; then spawn 250 threads, each combining 2 of the merged maps; and so on. Any objections? Better ideas?
If you can use Java 8, you can use a parallel stream to get the job done in parallel:
    List<Map<String, Integer>> maps = new ArrayList<>();
    // populate: 1 map per page
    Map<String, Integer> summary = maps.parallelStream()
            .flatMap(m -> m.entrySet().stream())
            .collect(toMap(Entry::getKey, Entry::getValue, (i1, i2) -> i1 + i2));
With Java < 8 you need to do the parallelisation yourself, for example using the Fork/Join framework (which is what parallelStream uses under the hood) or an ExecutorService.
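For the Java < 8 route, the divide-and-conquer idea from the question maps fairly directly onto a Fork/Join RecursiveTask, which also answers the "threads don't return values" concern because each task returns its partial map. The following is only a minimal sketch under stated assumptions: the class name ForkJoinMapMerge, the THRESHOLD cut-off of 16 maps and the helper names are hypothetical and not part of the original answer, and the code avoids Java 8 features so that it should also compile on Java 7.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical helper class for illustration; not from the original answer.
class ForkJoinMapMerge extends RecursiveTask<Map<String, Integer>> {
    // Assumed cut-off: slices of this size or smaller are merged sequentially.
    private static final int THRESHOLD = 16;

    private final List<Map<String, Integer>> maps;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinMapMerge(List<Map<String, Integer>> maps, int from, int to) {
        this.maps = maps;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Map<String, Integer> compute() {
        if (to - from <= THRESHOLD) {
            // Small slice: merge directly into a fresh map owned by this task.
            Map<String, Integer> result = new HashMap<>();
            for (int i = from; i < to; i++) {
                addAll(result, maps.get(i));
            }
            return result;
        }
        int mid = (from + to) >>> 1;
        ForkJoinMapMerge left = new ForkJoinMapMerge(maps, from, mid);
        ForkJoinMapMerge right = new ForkJoinMapMerge(maps, mid, to);
        left.fork();                                      // run the left half asynchronously
        Map<String, Integer> result = right.compute();    // do the right half in this thread
        addAll(result, left.join());                      // combine the two partial results
        return result;
    }

    // Adds every count in source to the matching key in target.
    private static void addAll(Map<String, Integer> target, Map<String, Integer> source) {
        for (Map.Entry<String, Integer> e : source.entrySet()) {
            Integer old = target.get(e.getKey());
            target.put(e.getKey(), old == null ? e.getValue() : old + e.getValue());
        }
    }

    static Map<String, Integer> mergeAll(List<Map<String, Integer>> maps) {
        // The no-arg constructor sizes the pool to the number of available processors.
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new ForkJoinMapMerge(maps, 0, maps.size()));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> pages = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            Map<String, Integer> page = new HashMap<>();
            page.put("a", 1);
            page.put("b", 2);
            pages.add(page);
        }
        System.out.println(mergeAll(pages));
    }
}
```

The number of tasks here can be large, but the number of threads stays at the pool's parallelism, which is the main difference from spawning 500 raw threads.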
In any case, this is a CPU-intensive task, so spawning more threads than the number of processors on your machine is counterproductive. Unless you run a beast with 500 cores, don't start 500 threads.
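One way to keep the thread count bounded is to size a fixed thread pool from Runtime.getRuntime().availableProcessors() and give each worker a contiguous slice of the maps. The sketch below is an illustration under assumptions, not the original answer's code: the class name ExecutorMapMerge and the one-chunk-per-worker split are hypothetical choices. Each Callable returns its partial map through a Future, and the calling thread combines the partial maps at the end.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class for illustration; not from the original answer.
class ExecutorMapMerge {

    static Map<String, Integer> mergeAll(final List<Map<String, Integer>> maps) {
        // One thread per processor rather than one thread per pair of maps.
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Assumed split: roughly one contiguous chunk of maps per worker.
            int chunk = Math.max(1, (maps.size() + workers - 1) / workers);
            List<Future<Map<String, Integer>>> partials = new ArrayList<>();
            for (int start = 0; start < maps.size(); start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, maps.size());
                partials.add(pool.submit(new Callable<Map<String, Integer>>() {
                    @Override
                    public Map<String, Integer> call() {
                        // Each task writes only to its own map, so no locking is needed.
                        Map<String, Integer> partial = new HashMap<>();
                        for (Map<String, Integer> m : maps.subList(from, to)) {
                            addAll(partial, m);
                        }
                        return partial;
                    }
                }));
            }
            // Final reduction of the partial maps happens on the calling thread.
            Map<String, Integer> summary = new HashMap<>();
            for (Future<Map<String, Integer>> f : partials) {
                addAll(summary, f.get());
            }
            return summary;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while merging", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a merge task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Adds every count in source to the matching key in target.
    private static void addAll(Map<String, Integer> target, Map<String, Integer> source) {
        for (Map.Entry<String, Integer> e : source.entrySet()) {
            Integer old = target.get(e.getKey());
            target.put(e.getKey(), old == null ? e.getValue() : old + e.getValue());
        }
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> pages = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            Map<String, Integer> page = new HashMap<>();
            page.put("a", 1);
            page.put("b", 2);
            pages.add(page);
        }
        System.out.println(mergeAll(pages));
    }
}
```

Because the partial maps are few (one per chunk), the sequential final reduction is cheap compared with the per-page merging done in parallel.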
Complete example (parallel stream version):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Map.Entry;
    import static java.util.stream.Collectors.toMap;

    // The class name is arbitrary; the original snippet showed only the two methods.
    public class MergeExample {

        public static void main(String[] args) {
            List<Map<String, Integer>> maps = new ArrayList<>();
            maps.add(map("a cat , dog , cat , dog"));
            maps.add(map("a hat , man , man , cat"));
            maps.add(map("a cat , dog , cat , dog"));
            maps.add(map("a hat , man , man , cat"));
            maps.add(map("a cat , dog , cat , dog"));
            maps.add(map("a hat , man , man , cat"));
            System.out.println(maps);

            Map<String, Integer> summary = maps.parallelStream()
                    .flatMap(m -> m.entrySet().stream())
                    // which thread are we on?
                    .peek(e -> System.out.println(Thread.currentThread()))
                    // on a key collision, add the two counts together
                    .collect(toMap(Entry::getKey, Entry::getValue, (i1, i2) -> i1 + i2));
            System.out.println("summary = " + summary);
        }

        // Builds a word-frequency map for one piece of text.
        private static Map<String, Integer> map(String text) {
            Map<String, Integer> map = new HashMap<>();
            for (String s : text.split("\\s+")) {
                Integer count = map.getOrDefault(s, 0) + 1;
                map.put(s, count);
            }
            return map;
        }
    }