public class DistinctDataNet<E> extends DistinctDataBag<E>
This class is like DistinctDataBag, except that netAdd tells you whether the item you just
added was known to be distinct. This normally only works until the first spill. After that,
the system may not be able to tell for sure, and netAdd will return false. When you are finished
adding items, you may call netIterator() to get any distinct items that are in the
spill files but were not reported as distinct previously. This is useful for a distinct
operator that streams results until it exceeds the spill threshold.

Constructor and Description |
---|
DistinctDataNet(ThresholdPolicy<E> policy, SerializationFactory<E> serializerFactory, Comparator<E> comparator) |

Modifier and Type | Method and Description |
---|---|
boolean | netAdd(E item) |
Iterator<E> | netIterator() Returns an iterator to all additional items that are distinct but were not reported to be so at the time netAdd(Object) was invoked. |

Inherited methods: isDistinct, isSorted, iterator, add, close, flush
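
The contract described above (netAdd reports distinctness only until the first spill, and netIterator returns the distinct items that were never reported) can be illustrated with a small self-contained sketch. This is not the library's implementation: `DistinctNetSketch`, its nested `Net` class, the integer `threshold`, and the `streamDistinct` helper are hypothetical names introduced here, and the "spill" is only simulated with a flag while everything stays in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

final class DistinctNetSketch {

    /** Hypothetical stand-in for the netAdd/netIterator contract; NOT the library class. */
    static final class Net<E> {
        private final int threshold;        // assumed: distinct items accepted before the simulated spill
        private final TreeSet<E> all;       // every distinct item seen so far
        private final TreeSet<E> reported;  // items for which netAdd returned true
        private boolean spilled = false;    // stands in for "the first spill has happened"

        Net(int threshold, Comparator<E> comparator) {
            this.threshold = threshold;
            this.all = new TreeSet<>(comparator);
            this.reported = new TreeSet<>(comparator);
        }

        /** Returns true only if the item is new and no spill has happened yet. */
        boolean netAdd(E item) {
            if (!spilled && reported.size() >= threshold) {
                spilled = true; // simulated spill: from now on distinctness is no longer reported
            }
            boolean isNew = all.add(item);
            if (isNew && !spilled) {
                reported.add(item);
                return true;
            }
            return false;
        }

        /** Returns the distinct items that netAdd never reported as distinct. */
        Iterator<E> netIterator() {
            if (!spilled) {
                return Collections.emptyIterator(); // everything was already reported by netAdd
            }
            List<E> rest = new ArrayList<>();
            for (E item : all) {
                if (!reported.contains(item)) {
                    rest.add(item);
                }
            }
            return rest.iterator();
        }
    }

    /** A streaming distinct operator: emit early while possible, then drain the rest. */
    static <E> List<E> streamDistinct(List<E> input, int threshold, Comparator<E> comparator) {
        Net<E> net = new Net<>(threshold, comparator);
        List<E> out = new ArrayList<>();
        for (E item : input) {
            if (net.netAdd(item)) {
                out.add(item); // known to be distinct, so it can be emitted immediately
            }
        }
        Iterator<E> rest = net.netIterator();
        while (rest.hasNext()) {
            out.add(rest.next()); // distinct items that were not reported before
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "b", "a", "c", "d", "a", "e", "d");
        // With a threshold of 3, "a", "b", "c" are streamed; "d" and "e" come from netIterator.
        System.out.println(streamDistinct(input, 3, Comparator.<String>naturalOrder()));
    }
}
```

In this sketch the items returned by netIterator arrive in comparator order rather than insertion order, which is an assumption of the stand-in and not a documented guarantee of the class.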
public DistinctDataNet(ThresholdPolicy<E> policy, SerializationFactory<E> serializerFactory, Comparator<E> comparator)
public boolean netAdd(E item)
public Iterator<E> netIterator()
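Returns an iterator to all additional items that are distinct but were not reported to be so at the time netAdd(Object) was invoked.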
If you do not exhaust the iterator, you should call Iter.close(Iterator) to be sure any open file handles are closed.

Licensed under the Apache License, Version 2.0