Counter counts recurrent elements of iterables. It is based on the Python implementation.
The struct `Counter` is the entry-point type for this module.
Mathematically, a `Counter` implements a hash-based version of a [multiset],
or bag. This extends the notion of a set: we care not only about whether an
entity exists within the set, but also about how many times it occurs.
Normal set operations such as intersection and union are still supported.
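
To make the multiset idea concrete, here is a minimal std-only sketch of hash-based counting. The helper name `count_chars` is hypothetical, and this is an illustration of the concept rather than the crate's implementation.

```rust
use std::collections::HashMap;

/// Count how many times each `char` occurs in `text`.
/// (Hypothetical helper for illustration; not part of the `counter` crate.)
fn count_chars(text: &str) -> HashMap<char, usize> {
    let mut counts = HashMap::new();
    for ch in text.chars() {
        // Insert a zero count for a new key, then increment it.
        *counts.entry(ch).or_insert(0) += 1;
    }
    counts
}
```

Unlike a plain set, the map records the multiplicity of each element, so `count_chars("barefoot")` stores a count of 2 for `'o'` and 1 for every other letter.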
```rust
use counter::Counter;
let char_counts = "barefoot".chars().collect::<Counter<_>>();
let counts_counts = char_counts.values().collect::<Counter<_>>();
```
```rust
let mut counts = "aaa".chars().collect::<Counter<_>>();
counts[&'a'] += 1;
counts[&'b'] += 1;
```
```rust
let mut counts = "able babble table babble rabble table able fable scrabble"
    .split_whitespace().collect::<Counter<_>>();
// add or subtract an iterable of the same type
counts += "cain and abel fable table cable".split_whitespace();
// or add or subtract from another Counter of the same type
let other_counts = "scrabble cabbie fable babble"
    .split_whitespace().collect::<Counter<_>>();
let difference = counts - other_counts;
```
Extend a `Counter` with another `Counter`:
```rust
use std::collections::HashMap;
let mut counter = "abbccc".chars().collect::<Counter<_>>();
let another = "bccddd".chars().collect::<Counter<_>>();
counter.extend(&another);
let expect = [('a', 1), ('b', 3), ('c', 5), ('d', 3)].iter()
    .cloned().collect::<HashMap<_, _>>();
assert_eq!(counter.into_map(), expect);
```
```rust
let counts = "aaa".chars().collect::<Counter<_>>();
assert_eq!(counts[&'a'], 3);
assert_eq!(counts[&'b'], 0);
```
[`most_common_ordered()`] uses the natural ordering of keys which are [`Ord`].
```rust
let by_common = "eaddbbccc".chars().collect::<Counter<_>>().most_common_ordered();
let expected = vec![('c', 3), ('b', 2), ('d', 2), ('a', 1), ('e', 1)];
assert!(by_common == expected);
```
[`k_most_common_ordered()`] takes an argument `k` of type `usize` and returns
the top `k` most common items. This is functionally equivalent to calling
`most_common_ordered()` and then truncating the result to length `k`. However,
if `k` is smaller than the length of the counter, `k_most_common_ordered()`
can be more efficient, often much more so.
```rust
let by_common = "eaddbbccc".chars().collect::<Counter<_>>().k_most_common_ordered(2);
let expected = vec![('c', 3), ('b', 2)];
assert!(by_common == expected);
```
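
One way to see why a dedicated top-`k` method can beat sorting everything and truncating: a bounded heap never holds more than `k` candidates. The sketch below is std-only and purely illustrative. `top_k` is a hypothetical helper, and this is not necessarily how the crate implements `k_most_common_ordered()`.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

/// Return the `k` most common entries, highest count first,
/// breaking ties by the natural (ascending) order of the keys.
/// (Hypothetical helper; an assumption-laden sketch, not the crate's code.)
fn top_k(counts: &HashMap<char, usize>, k: usize) -> Vec<(char, usize)> {
    // Rank each entry as (count, Reverse(key)): a larger rank is a better entry.
    // Wrapping the rank in `Reverse` turns the max-heap into a min-heap,
    // so the root is always the weakest candidate kept so far.
    let mut heap: BinaryHeap<Reverse<(usize, Reverse<char>)>> = BinaryHeap::new();
    for (&key, &count) in counts {
        heap.push(Reverse((count, Reverse(key))));
        if heap.len() > k {
            heap.pop(); // discard the weakest candidate
        }
    }
    // Ascending order of `Reverse(rank)` is descending order of rank: best first.
    heap.into_sorted_vec()
        .into_iter()
        .map(|Reverse((count, Reverse(key)))| (key, count))
        .collect()
}
```

With `n` distinct keys, this does `n` heap operations on a heap of at most `k + 1` entries, instead of sorting all `n` entries.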
For example, here we use `most_common_tiebreaker()` to break ties in reverse alphabetical order.
```rust
let counter = "eaddbbccc".chars().collect::<Counter<_>>();
let by_common = counter.most_common_tiebreaker(|&a, &b| b.cmp(&a));
let expected = vec![('c', 3), ('d', 2), ('b', 2), ('e', 1), ('a', 1)];
assert!(by_common == expected);
```
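
The tiebreaker idea can be sketched with the standard library alone: sort by count in descending order, and consult the caller's comparison only when two counts are equal. `most_common_by` is a hypothetical name, and this is an illustration under that assumption rather than the crate's implementation.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Sort entries by descending count, using `tiebreaker` to order equal counts.
/// (Hypothetical helper for illustration only.)
fn most_common_by<F>(counts: &HashMap<char, usize>, mut tiebreaker: F) -> Vec<(char, usize)>
where
    F: FnMut(&char, &char) -> Ordering,
{
    let mut items: Vec<(char, usize)> = counts.iter().map(|(&k, &n)| (k, n)).collect();
    items.sort_by(|(key_a, n_a), (key_b, n_b)| {
        // `n_b.cmp(n_a)` puts larger counts first; the tiebreaker runs only on equal counts.
        n_b.cmp(n_a).then_with(|| tiebreaker(key_a, key_b))
    });
    items
}
```

Passing `|a, b| b.cmp(a)` as the tiebreaker orders equal counts reverse alphabetically.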
Counters are multisets and so can be sub- or supersets of each other.
A counter is a subset of another if, for all its elements, the other
counter has an equal or higher count. Test for this with [`is_subset()`]:
```rust
let counter = "aaabb".chars().collect::<Counter<_>>();
let superset = "aaabbbc".chars().collect::<Counter<_>>();
let not_a_superset = "aaae".chars().collect::<Counter<_>>();
assert!(counter.is_subset(&superset));
assert!(!counter.is_subset(&not_a_superset));
```
Testing for a superset is the inverse: [`is_superset()`] is true if the
counter can contain another counter in its entirety:
```rust
let counter = "aaabbbc".chars().collect::<Counter<_>>();
let subset = "aabbb".chars().collect::<Counter<_>>();
let not_a_subset = "aaae".chars().collect::<Counter<_>>();
assert!(counter.is_superset(&subset));
assert!(!counter.is_superset(&not_a_subset));
```
These relationships continue to work when using a signed integer type for the counter: every value in the subset must be less than or equal to the corresponding value in the superset. A negative value is interpreted as 'missing' that many of the element, so the subset must be short by at least as many of those same elements to still be a subset:
```rust
let mut subset = "aaabb".chars().collect::<Counter<_, i8>>();
subset.insert('e', -2); // short 2 'e's
subset.insert('f', -1); // and 1 'f'
let mut superset = "aaaabbb".chars().collect::<Counter<_, i8>>();
superset.insert('e', -1); // short 1 'e'
assert!(subset.is_subset(&superset));
assert!(superset.is_superset(&subset));
```
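
The subset rule, including the signed case, can be sketched over plain maps. This sketch assumes that a key missing from either side counts as zero, which matches the description above but is an assumption about the semantics. `is_subset_of` is a hypothetical helper, not the crate's method.

```rust
use std::collections::HashMap;

/// `sub` is a subset of `sup` if every count in `sub` is <= the matching count in `sup`.
/// Keys missing from either map are treated as a count of zero (an assumption).
fn is_subset_of(sub: &HashMap<char, i8>, sup: &HashMap<char, i8>) -> bool {
    // Visit keys from both maps, so a negative count present on only one side is still checked.
    sub.keys().chain(sup.keys()).all(|key| {
        let have = sub.get(key).copied().unwrap_or(0);
        let allowed = sup.get(key).copied().unwrap_or(0);
        have <= allowed
    })
}
```

Visiting the keys of both maps matters for signed counts: a negative count that appears only in `sup` must still be compared against the implicit zero in `sub`.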
You can intersect two counters with the `&` (bitwise and) operator, which
keeps the minimum count of each element, and produce their union with the
`|` (bitwise or) operator, which keeps the maximum count:
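```rust
let a = "aaabb".chars().collect::<Counter<_>>();
// `b` is an assumed value, chosen to be consistent with the expected intersection.
let b = "aabbb".chars().collect::<Counter<_>>();
let intersection = a & b;
let expected_intersection = "aabb".chars().collect::<Counter<_>>();
assert_eq!(intersection, expected_intersection);
let c = "aaabb".chars().collect::<Counter<_>>();
// `d` is an assumed value, chosen to be consistent with the expected union.
let d = "bbbbe".chars().collect::<Counter<_>>();
let union = c | d;
let expected_union = "aaabbbbe".chars().collect::<Counter<_>>();
assert_eq!(union, expected_union);
```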
The in-place [`&=`] and [`|=`] operations are also supported.
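
As a rough illustration of the minimum and maximum rules, the following std-only sketch computes a multiset intersection and union over plain maps. The helpers `tally`, `min_intersection`, and `max_union` are hypothetical names, and details such as dropping zero counts are assumptions rather than documented behaviour of the crate.

```rust
use std::collections::HashMap;

/// Count the characters of `text` (hypothetical helper).
fn tally(text: &str) -> HashMap<char, usize> {
    let mut counts = HashMap::new();
    for ch in text.chars() {
        *counts.entry(ch).or_insert(0) += 1;
    }
    counts
}

/// Keep, for each key present in both maps, the smaller of the two counts.
fn min_intersection(a: &HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
    a.iter()
        .filter_map(|(&key, &n)| {
            let m = n.min(*b.get(&key)?); // a key missing from `b` drops out here
            (m > 0).then_some((key, m))
        })
        .collect()
}

/// Keep, for each key present in either map, the larger of the two counts.
fn max_union(a: &HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
    let mut out = a.clone();
    for (&key, &n) in b {
        let slot = out.entry(key).or_insert(0);
        *slot = (*slot).max(n);
    }
    out
}
```

For instance, intersecting the tallies of `"aaabb"` and `"aabbb"` gives the tally of `"aabb"`, since each letter keeps its smaller count.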
Treat it like a `HashMap`

`Counter<T, N>` implements [`Deref`]`<Target=HashMap<T, N>>` and
[`DerefMut`]`<Target=HashMap<T, N>>`, which means that you can perform any
operations on it which are valid for a [`HashMap`].
```rust
let mut counter = "aa-bb-cc".chars().collect::<Counter<_>>();
counter.remove(&'-');
assert!(counter == "aabbcc".chars().collect::<Counter<_>>());
```
Note that `Counter<T, N>` itself implements [`Index`]. `Counter::index`
returns a reference to a [`Zero::zero`] value for missing keys.
```rust
let counter = "aaa".chars().collect::<Counter<_>>();
assert_eq!(counter[&'b'], 0);
// panics
// assert_eq!((*counter)[&'b'], 0);
```
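
The difference between indexing the wrapper and indexing the underlying map can be illustrated with a small stand-in type. `Tally` below is hypothetical and stores its own zero so that `index` has something to hand out by reference. It shows the general `Deref` plus `Index` pattern, not the crate's actual layout.

```rust
use std::collections::HashMap;
use std::ops::{Deref, Index};

/// Hypothetical stand-in for a counter: a map plus a stored zero.
struct Tally {
    map: HashMap<char, usize>,
    zero: usize,
}

impl Tally {
    fn from_text(text: &str) -> Self {
        let mut map = HashMap::new();
        for ch in text.chars() {
            *map.entry(ch).or_insert(0) += 1;
        }
        Tally { map, zero: 0 }
    }
}

// Deref exposes every read-only `HashMap` method (`get`, `len`, ...) on `Tally`.
impl Deref for Tally {
    type Target = HashMap<char, usize>;
    fn deref(&self) -> &Self::Target {
        &self.map
    }
}

// Indexing the wrapper never panics: missing keys yield a reference to zero.
impl<'a> Index<&'a char> for Tally {
    type Output = usize;
    fn index(&self, key: &'a char) -> &usize {
        self.map.get(key).unwrap_or(&self.zero)
    }
}
```

Indexing the wrapper goes through its own `Index` implementation, while explicitly dereferencing first reaches `HashMap`'s `Index`, which panics on a missing key.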
Count any iterable which is `Hash + Eq`

You can't use the `most_common*` functions unless `T` is also [`Clone`], but
simple counting works fine on a minimal data type.
```rust
// Counting needs `Hash` and `Eq` (and `PartialEq`, which `Eq` requires).
#[derive(Hash, PartialEq, Eq)]
struct Inty { i: usize, }
impl Inty { pub fn new(i: usize) -> Inty { Inty { i: i } } }
// https://en.wikipedia.org/wiki/867-5309/Jenny
let intys = vec![
    Inty::new(8), Inty::new(0), Inty::new(0), Inty::new(8), Inty::new(6),
    Inty::new(7), Inty::new(5), Inty::new(3), Inty::new(0), Inty::new(9),
];
let inty_counts = intys.iter().collect::<Counter<_>>();
```
Sometimes [`usize`] just isn't enough. If you find yourself overflowing your
machine's native size, you can use your own type. Here, we use an [`i8`], but
you can use most numeric types, including bignums, as necessary.
```rust
use std::collections::HashMap;
let counter: Counter<_, i8> = "abbccc".chars().collect();
let expected: HashMap<char, i8> = [('a', 1), ('b', 2), ('c', 3)].iter().cloned().collect();
assert!(counter.into_map() == expected);
```
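
Making the count type a parameter can be sketched with standard traits alone. The version below assumes `Default` supplies the zero value and takes the unit increment as an explicit argument. Both choices, and the name `count_chars_as`, are assumptions made for this illustration and differ from whatever numeric traits the crate relies on.

```rust
use std::collections::HashMap;
use std::ops::AddAssign;

/// Count characters using any numeric type `N` for the counts.
/// `one` is the unit increment, passed in to avoid external numeric traits.
/// (Hypothetical helper; a sketch, not the crate's implementation.)
fn count_chars_as<N>(text: &str, one: N) -> HashMap<char, N>
where
    N: Default + AddAssign + Copy,
{
    let mut counts = HashMap::new();
    for ch in text.chars() {
        // `N::default()` is zero for the built-in integer types.
        *counts.entry(ch).or_insert_with(N::default) += one;
    }
    counts
}
```

Calling `count_chars_as("abbccc", 1i8)` yields `i8` counts, while passing `1u64` yields `u64` counts from the same function.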
License: MIT