pykafka.partitioners

Author: Keith Bourgoin, Emmett Butler

class pykafka.partitioners.RandomPartitioner

Bases: pykafka.partitioners.BasePartitioner

Returns a random partition out of all of the available partitions.

Uses a non-random incrementing counter to provide even distribution across partitions without wasting CPU cycles

class pykafka.partitioners.BasePartitioner

Bases: object

Base class for custom class-based partitioners.

A partitioner is used by the pykafka.producer.Producer to decide which partition to which to produce messages.

__weakref__

list of weak references to the object (if defined)

class pykafka.partitioners.HashingPartitioner(hash_func=None)

Bases: pykafka.partitioners.BasePartitioner

Returns a (relatively) consistent partition out of all available partitions based on the key.

Messages that are published with the same keys are not guaranteed to end up on the same broker if the number of brokers changes (due to the addition or removal of a broker, planned or unplanned) or if the number of topics per partition changes. This is also unreliable when not all brokers are aware of a topic, since the number of available partitions will be in flux until all brokers have accepted a write to that topic and have declared how many partitions that they are actually serving.

__call__(partitions, key)
Parameters:
  • partitions (sequence of pykafka.base.BasePartition) – The partitions from which to choose
  • key (Any hashable type if using the default hash() implementation, any valid value for your custom hash function) – Key used for routing
Returns:

A partition

Return type:

pykafka.base.BasePartition

__init__(hash_func=None)
Parameters:hash_func (function) – hash function (defaults to hash()), should return an int. If hash randomization (Python 2.7) is enabled, a custom hashing function should be defined that is consistent between interpreter restarts.
class pykafka.partitioners.GroupHashingPartitioner(hash_func, group_size=1)

Bases: pykafka.partitioners.BasePartitioner

Messages published with the identical keys will be directed to a consistent subset of ‘n’ partitions from the set of available partitions. For example, if there are 16 partitions and group_size=4, messages with the identical keys will be shared equally between a subset of four partitions, instead of always being directed to the same partition.

The same guarantee caveats apply as to the pykafka.base.HashingPartitioner.

__call__(partitions, key)
Parameters:
  • partitions (sequence of pykafka.base.BasePartition) – The partitions from which to choose
  • key (Any hashable type if using the default hash() implementation, any valid value for your custom hash function) – Key used for routing
Returns:

A partition

Return type:

pykafka.base.BasePartition

__init__(hash_func, group_size=1)
Parameters:
  • hash_func (function) – A hash function
  • group_size (Integer value between (0, total_partition_count)) – Size of the partition group to assign to. For example, if there are 16 partitions, and we want to smooth the distribution of identical keys between a set of 4, use 4 as the group_size.