Enhanced C#
Language of your choice: library documentation
Nested classes | Public fields | Public static fields | Properties | Public Member Functions | Static Public Member Functions | List of all members
Loyc.Collections.Impl.InternalSet< T > Struct Template Reference

A hash-trie data structure for use inside other data structures. More...


Source file:
Inheritance diagram for Loyc.Collections.Impl.InternalSet< T >:

Remarks

A hash-trie data structure for use inside other data structures.

InternalSet<T> is a dual-mode mutable/immutable "hash trie", which is a kind of tree that is built from the hashcode of the items it contains. It supports fast cloning, and is suitable as a persistent data structure.

InternalSet<T> is not designed to be used by itself, but as a building block for other data structures. It has no Count property because it does not know its own size; the outer data structure must track the size if the size is needed. The lack of a Count property allows an empty InternalSet to use a mere single word of memory!

This is my second implementation of InternalSet. The original version used memory very efficiently for reference types, but required boxing for value types; this version needs more memory, but is moderately faster in most cases and supports value types without boxing. I estimate that InternalSet (the second version) uses roughly the same amount of memory as HashSet<T> (actually more or less depending on the number of items in the set, and on the hashcode distribution.)

Collection classes based on InternalSet are most efficient for small sets, but if you always need small sets then a simple wrapper around HashSet would suffice. In fact, despite my best efforts, this data type rarely outperforms HashSet, but this is because HashSet<T> is quite fast, not because InternalSet<T> is slow. Still, there are several reasons to consider using a collection class based on InternalSet instead of HashSet<T>:

InternalSet is not efficient for Ts that are expensive to compare; unlike standard .NET collections, this data structure does not store the hashcode of each item inside the collection. The memory saved by not storing the hashcode compensates for the extra memory that InternalSet tends to require due to its structure.

As I was saying, this data structure is inspired by Clojure's PersistentHashMap. Whereas PersistentHashMap uses nodes of size 32, I chose to use nodes of size 16 in order to increase space efficiency for small sets; for some reason I tend to design programs that use many small collections and a few big ones, so I tend to prefer designs that stay efficient at small sizes.

So InternalSet is a tree of nodes, with each level of the tree representing 4 bits of the hashcode. Slots in the root node are selected based on bits 0 to 3 of the hashcode, slots in children of the root are selected based on bits 4 to 7 of the hashcode, and so forth. Here's a diagram:

                             _root*
* IsFrozen=true                |
                               |
      +---------+---------+----+----+---------+---------+
      |         |         |         |         |         |
     0x2       0x3       0x6       0x7       0x9       0xF
                |                   |         |
             +--+--+                |      +--+--+
             |     |                |      |     |
           0x13   0x73             0x57  0x09   0x59

Each of the 12 nodes on this diagram has 16 slots for items of type T, and the 4 nodes that have children have 16 additional slots for references to children. The numbers on the nodes represent their role in the tree; for example:

Technically, this data structure has O(log N) time complexity for search, insertion and removal. However, it's a base-16 logarithm and maxes out at 8 levels, so it is faster than typical O(log N) algorithms that are base-2. At smaller sizes, its speed is similar to a conventional hashtable, and some operations are still efficient at large sizes, too.

Unlike InternalList<T>, new InternalSet<T>() is a valid empty set. Moreover, because the root node is never changed after it is created (unless you modify it while it is frozen), all copies of an InternalSet<T> represent the same set unless the set is frozen with CloneFreeze; see Thaw() for more information.

The neatest feature of this data structure is fast cloning and subtree sharing. You can call CloneFreeze to freeze/clone the trie in O(1) time; this freezes the root node (a transitive property that implicitly affects all children), but still permits the hashtrie to be modified by copying nodes on-demand. Thus the trie is actually frozen, but copy-on-write behavior provides the illusion that it is still editable.

This data structure is designed to support classes that contain mutable data, so that it can be used to construct dictionaries; that is, it allows T values that have an immutable "key" part and a mutable "value" part. Call Find to retrieve the value associated with a key, and call Add with replaceIfPresent=true to change the "value" associated with a key. The Map<K,V> and MMap<K,V> classes rely on this feature to implement a dictionary.

How it works: I call this data structure a "hash-trie" because it blends properties of hashtables and tries. It places items into a tree by taking their hashcode and dividing it into 8 groups of 4 bits, starting at the least significant bits. Each group of 4 bits is used to select a location in the tree/trie, and each node of the tree always has 16 items (and 16 children, if it has any children at all.) For example, consider a tree with 7 items that have the following hash codes:

The top level of the trie represents the lowest 4 bits of the hashcode. Since each node has 16 items, 7 items can usually fit in a single node, but in this case there are too many hashcodes that end with "1", causing a node split:

                           |0|1|2|3|4|5|6|7|8|9|A|B|C|D|E|F|
       _root ==> _items    | |!|!|M|!| | | | | | | |K| | | |
                 _children | |*| | | | | | | | | | | | | | |
                           |0|1|2|3|4|5|6|7|8|9|A|B|C|D|E|F|
* child node ==> _items    |O|P| | | |N| | | |L| |J| | | | |
                 _children (null)
("!" represents the deleted flag, which indicates that an item was 
 once present at this location.)

The second level of the trie represents bits 4-7, which is the second- last hex digit. You can see, for example, that the second-last digit of N is 5, therefore N is stored at index 5 of the child node.

In case hashcodes of different objects collide at a particular digit, adjacent array elements can be used to hold the different objects that share the same 4-bit sub-hashcode; this is a bounded-time variation on the linearly-probed hashtable. In this example, both O and P have zero as their second-last digit. Assuming O is added first, it takes slot [0]; then P takes slot [1]. Up to 3 adjacent entries can be used for a given hashcode; therefore, when searching for an entry it is necessary to search up to 4 locations in each node: the preferred location, plus 3 adjacent locations.

For example, support we search for an item X that is not in the set and has hashcode 0xCCA9A241. In that case, the Find methods starts with the least-significant digit, 1. This points us to the child slot; an invariant of our hashtrie is that if there is a child node, all items with the corresponding sub-hashcode must be placed in the child node. Therefore it is impossible, for example, that X could be located at index 2 of the root node; the existence of the child node guarantees that it is not there. So the Find method looks inside the child node, at index 4 (the second-last digit of X's hashcode) and finds nothing. It also looks at indexes 5, 6, and 7, comparing N to X in the process. Since none of these slots contain X, the Find method returns false.

Something unfortunate happens if five or more objects have the same hashcode: it forces the tree to have maximum depth. Since a particular hashcode can only be repeated four times in a single node, upon adding a fifth item with the same hashcode, child nodes are created for all 8 digits of the hashcode. At the 8th level, a special node type is allocated that contains, in addition to the usual 16 slots, a list of "overflow slots" holds items that cannot fit in the normal slots due to excessive collisions. All of this has a substantial memory penalty; to avoid this problem, use a better hash function that does not create false collisions.

If there are more than 16 items that share the same 28 lower-order bits, the overflow area on the 8th level node will expand to hold all of these items; this is the only way that a node can have more than 16 items.

Fast cloning works by setting the "IsFrozen" flag on the root node. When a node is frozen, all its children are frozen implicitly; since the children are not marked right away, the CloneFreeze method can return immediately. The frozen flag will be propagated from parents to children lazily, when the tree is modified later.

To "thaw" a node, a copy is made of that node and all of its parents. For example, suppose that the following tree is frozen and cloned:

                             _root*
* IsFrozen=true                |
                               |
      +---------+---------+----+----+---------+---------+
      |         |         |         |         |         |
     0x2       0x3       0x6       0x7       0x9       0xF
                |                   |         |
             +--+--+                |      +--+--+
             |     |                |      |     |
           0x13   0x73             0x57  0x09   0x59

Remember, only the root's IsFrozen flag is set at first; all other nodes do not have the frozen flag yet.

Now suppose that an item is added to node 0x9 (e.g. something with hashcode 0x39 could go in this node). Before the new item can be placed in node 0x9, it must be thawed. To thaw it, an unfrozen copy is made, leaving the original untouched. The copy is not frozen, but it does point to the same frozen children (0x09 and 0x59), so a for-loop sets the IsFrozen flag of each child. Then, the new item is added to the copy of node 0x9. Next, the _root is also unfrozen by making a copy of it with IsFrozen=false. Again, a for-loop sets the IsFrozen flag of each frozen child, and then child slot [9] in the root is replaced with the new copy of 0x9 (which has the new item).

This concludes the thawing process. So at this point, just two nodes are actually unfrozen, and the modified tree looks like this:

! Unfrozen copy              _root!
* IsFrozen=true                |
                               |
      +---------+---------+----+----+---------+---------+
      |         |         |         |         |         |
     0x2*      0x3*      0x6*      0x7*      0x9!      0xF*
                |                   |         |
             +--+--+                |      +--+--+
             |     |                |      |     |
           0x13   0x73             0x57  0x09*  0x59*

There are 12 nodes here and 2 have been copied. The other 10 nodes are still shared between the modified tree and the clone. Next, if you add an item to node 0x6, only that one node has to be thawed; the root has already been thawed and there is no need to make another copy of it. Due to the random nature of hashcodes, it is probable that as you modify the set after cloning it, it is typical for each modification to require approximately one node to be thawed, until the majority of the nodes have been thawed.

InternalSet does not thaw unnecessarily. If you try to remove an item that is not present, none of the tree will be thawed. If you add an item that is already present in a frozen node (and you do not ask for replacement), that node will not be thawed. Contains and Find never cause thawing.

I am not aware whether a data structure quite like this has been described in the comp-sci literature or not (although it probably has). If you see something like this in a paper, let me know.

When attempting to insert a new item in a node, the first available empty slot will be used; and when searching for an item, the search stops at an empty slot. For example, suppose that the root node contains these items:

                    |0|1|2|3|4|5|6|7|8|9|A|B|C|D|E|F|
_root ==> _items    |A| |C| |E|F| | |I| |K|L| |N| | |

Now suppose that you are searching for, or adding, or an item 'D' whose hashcode ends with '3'. Slot 3 is empty, and this data structure works in such a way that the search for 'D' can end immediately with a result of 'false', or it can be added at slot 2 immediately without comparing 'D' with slots 4, 5 and 6 which (if 2 were not empty) might already contain 'D'.

The reasoning behind this rule is that if 'D' already existed in the set, slot 2 should not be empty; since it is empty, 'D' must not be in the set already. However, deletions could violate this logic. For example, imagine that we add two items, first 'd' and then 'D', which both have a hashcode that ends in '3'. Then the node would look like this:

                    |0|1|2|3|4|5|6|7|8|9|A|B|C|D|E|F|
_root ==> _items    |A| |C|d|E|F|D| |I| |K|L| |N| | |

Next, you delete 'd'. Imagine that this leaves the node in the following state:

                    |0|1|2|3|4|5|6|7|8|9|A|B|C|D|E|F|
_root ==> _items    |A| |C| |E|F|D| |I| |K|L| |N| | |

Now 'D' is left outside its 'home' location of 3. If you then attempt to add 'D' to the set, a duplicate copy would be added at position '3'! Or if you search for 'D' instead, the result would be 'false' even though D is present in the set.

I thought of two solutions to this problem; the first was to 'fix' the node after a deletion so that 'D' would move from slot 6 to 3. But there's a big problem with this solution because InternalSet<T>.Enumerator has a RemoveCurrent() method which is supposed to delete the current item and move to the next one. If the node had to be rearranged in response to a deletion, it would be very difficult to guarantee that the enumerator still returns each item in the set exactly once.

The second solution, which I actually implemented, puts a special "deleted" marker in slot 3 (denoted ! on the first diagram). This marker forces the search routine to compare the item being added or searched for with other slots beyond the current one, but otherwise it behaves like an empty slot.

There is a third solution–always check all four possible slots. But the comparison is not always cheap, so InternalSet<T> does not use this solution unless you are using null as the value of the IEqualityComparer<T>.

Since InternalSet<T> can hold any value of type T, the "deleted" and "empty/in use" indicators cannot physically be stored in the slots of type T. Instead, these indicators are stored separately, with 16 bits for "deleted" flags and 16 bits for "used" flags.

During a normal delete operation, if a node has no children and is using only one or two slots after an item is deleted, the parent is checked for empty slots to find out whether the child is really necessary. If there are enough free slot(s) in the parent node, the remaining items in the child are transferred back back to the parent and the child is deleted (the reference to it is cleared to null).

Unfortunately, this behavior is not available when you call Enumerator.RemoveCurrent. In order to maintain the integrity of the enumerator, a child node will not be deleted during a call to RemoveCurrent unless the node is completely empty after the removal. Consequently, the tree will use extra memory if you remove most, but not all, items from the set using RemoveCurrent.

By the way, unlike the original implementation, this version of InternalSet allows 'null' to be a member of the set.

Interesting fact: it is possible for two sets to be equal (contain the same items), and yet for those items to be enumerated in different orders in the two sets.

Nested classes

struct  Enumerator
 

Public fields

Node _root
 

Public static fields

static readonly InternalSet< T > Empty = new InternalSet<T> { _root = FrozenEmptyRoot() }
 An empty set. More...
 
static readonly IEqualityComparer< T > DefaultComparer = typeof(IReferenceEquatable).IsAssignableFrom(typeof(T)) ? null : EqualityComparer<T>.Default
 This is EqualityComparer<T>.Default, or null if T implements IReferenceEquatable. More...
 
const int BitsPerLevel = 4
 
const int FanOut = 1 << BitsPerLevel
 
const int Mask = FanOut - 1
 
const int MaxDepth = 7
 
const uint FlagMask = (uint)((1L << FanOut) - 1)
 
const int CounterPerChild = FanOut << 1
 
const short OverflowFlag = 1 << 12
 
static readonly OnFoundExisting AddIfNotPresent = _IgnoreExisting_
 
static readonly OnFoundExisting AddOrReplace = _ReplaceExisting_
 
static readonly OnFoundExisting RemoveMode = _DeleteExisting_
 
static Enumerator _setOperationEnumerator
 

Properties

bool IsRootFrozen [get]
 
bool HasRoot [get]
 

Public Member Functions

 InternalSet (IEnumerable< T > list, IEqualityComparer< T > comparer, out int count)
 
 InternalSet (IEnumerable< T > list, IEqualityComparer< T > comparer)
 
int GetSetHashCode (IEqualityComparer< T > comparer)
 
InternalSet< T > CloneFreeze ()
 Freezes the hashtrie so that any further changes require paths in the tree to be copied. More...
 
void Thaw ()
 Thaws a frozen root node by duplicating it, or creates the root node if the set doesn't have one. More...
 
bool Add (ref T item, IEqualityComparer< T > comparer, bool replaceIfPresent)
 Tries to add an item to the set, and retrieves the existing item if present. More...
 
bool Remove (ref T item, IEqualityComparer< T > comparer)
 Removes an item from the set. More...
 
delegate bool OnFoundExisting (ref Node slots, int i, T item)
 
bool Find (ref T item, IEqualityComparer< T > comparer)
 
Enumerator GetEnumerator ()
 
IEnumerator< T > IEnumerable< T >. GetEnumerator ()
 
System.Collections.IEnumerator System.Collections.IEnumerable. GetEnumerator ()
 
void CopyTo (T[] array, int arrayIndex)
 
void Clear ()
 
bool Contains (T item, IEqualityComparer< T > comparer)
 
int Count ()
 
int UnionWith (IEnumerable< T > other, IEqualityComparer< T > thisComparer, bool replaceIfPresent)
 Adds the contents of 'other' to this set. More...
 
int UnionWith (InternalSet< T > other, IEqualityComparer< T > thisComparer, bool replaceIfPresent)
 
int IntersectWith (InternalSet< T > other, IEqualityComparer< T > otherComparer)
 Removes all items from this set that are not present in 'other'. More...
 
int IntersectWith (ISet< T > other)
 
int IntersectWith (IEnumerable< T > other, IEqualityComparer< T > comparer)
 Removes all items from this set that are not present in 'other'. More...
 
int ExceptWith (IEnumerable< T > other, IEqualityComparer< T > thisComparer)
 Removes all items from this set that are present in 'other'. More...
 
int ExceptWith (InternalSet< T > other, IEqualityComparer< T > thisComparer)
 
int SymmetricExceptWith (InternalSet< T > other, IEqualityComparer< T > thisComparer)
 
int SymmetricExceptWith (IEnumerable< T > other, IEqualityComparer< T > comparer, bool xorDuplicates=true)
 Modifies the current set to contain only elements that were present either in this set or in the other collection, but not both. More...
 
bool IsSubsetOf (ISet< T > other, int myMinCount)
 Returns true if all items in this set are present in the other set. More...
 
bool IsSubsetOf (InternalSet< T > other, IEqualityComparer< T > otherComparer)
 
bool IsSubsetOf (IEnumerable< T > other, IEqualityComparer< T > comparer, int myMinCount=0)
 
bool IsSupersetOf (IEnumerable< T > other, IEqualityComparer< T > thisComparer, int myMaxCount=int.MaxValue)
 Returns true if all items in the other set are present in this set. More...
 
bool IsSupersetOf (InternalSet< T > other, IEqualityComparer< T > thisComparer)
 
bool Overlaps (IEnumerable< T > other, IEqualityComparer< T > thisComparer)
 Returns true if this set contains at least one item from 'other'. More...
 
bool Overlaps (InternalSet< T > other, IEqualityComparer< T > thisComparer)
 
bool IsProperSubsetOf (ISet< T > other, int myExactCount)
 Returns true if all items in this set are present in the other set, and the other set has at least one item that is not in this set. More...
 
bool IsProperSubsetOf (IEnumerable< T > other, IEqualityComparer< T > comparer, int myExactCount)
 Returns true if all items in this set are present in the other set, and the other set has at least one item that is not in this set. More...
 
bool IsProperSupersetOf (ISet< T > other, IEqualityComparer< T > thisComparer, int myExactCount)
 Returns true if all items in the other set are present in this set, and this set has at least one item that is not in the other set. More...
 
bool IsProperSupersetOf (IEnumerable< T > other, IEqualityComparer< T > comparer, int myExactCount)
 Returns true if all items in the other set are present in this set, and this set has at least one item that is not in the other set. More...
 
bool SetEquals (ISet< T > other, int myExactCount)
 Returns true if this set and the other set have the same items. More...
 
bool SetEquals (IEnumerable< T > other, IEqualityComparer< T > comparer, int myExactCount)
 Returns true if this set and the other set have the same items. More...
 
int CountMemory (int sizeOfT)
 Measures the total size of all objects allocated to this collection, in bytes, including the size of InternalSet<T> itself (which is one word). More...
 
int CountMemory (int sizeOfT, out InternalSetStats stats)
 Measures the total size of all objects allocated to this collection, in bytes, and counts the number of nodes of different types. More...
 

Static Public Member Functions

static Node FrozenEmptyRoot ()
 
static int Adj (int i, int n)
 
static bool Equals (T value, ref T item, IEqualityComparer< T > comparer)
 
static uint GetHashCode (T item, IEqualityComparer< T > comparer)
 
static void PropagateFrozenFlag (Node parent, Node child)
 
static void ReplaceChild (ref Node slots, int iHome, Node newChild)
 
static bool TryRemoveChild (ref Node slots, int iHome, Node child)
 
static bool _IgnoreExisting_ (ref Node slots, int i, T item)
 
static bool _ReplaceExisting_ (ref Node slots, int i, T item)
 
static bool _DeleteExisting_ (ref Node slots, int i, T item)
 
static bool AddOrRemove (ref Node slots, ref T item, uint hc, IEqualityComparer< T > comparer, OnFoundExisting mode)
 
static bool OnFoundInOverflow (ref Node slots, int i, ref T item, OnFoundExisting mode, T existing)
 
static int SelectBucketToSpill (Node slots, int i0, IEqualityComparer< T > comparer)
 
static void Spill (Node parent, int i0, IEqualityComparer< T > comparer)
 
static Enumerator SetOperationEnumerator ()
 

Member Function Documentation

◆ Add()

bool Loyc.Collections.Impl.InternalSet< T >.Add ( ref T  item,
IEqualityComparer< T >  comparer,
bool  replaceIfPresent 
)
inline

◆ CloneFreeze()

InternalSet<T> Loyc.Collections.Impl.InternalSet< T >.CloneFreeze ( )
inline

Freezes the hashtrie so that any further changes require paths in the tree to be copied.

This is an O(1) operation. It causes all existing copies of this InternalSet<T>, as well as any other copies you make in the future, to become independent of one another so that modifications to one copy do not affect any of the others.

To unfreeze the hashtrie, simply modify it as usual with (for example) a call to Add or Remove, or call Thaw. Frozen parts of the trie are copied on-demand.

◆ CountMemory() [1/2]

int Loyc.Collections.Impl.InternalSet< T >.CountMemory ( int  sizeOfT)
inline

Measures the total size of all objects allocated to this collection, in bytes, including the size of InternalSet<T> itself (which is one word).

Parameters
sizeOfTSize of each T. C# provides no way to get this number so it must be supplied as a parameter. If T is a reference type such as String, IntPtr.Size tells you the size of each reference; please note that this method is does not look "inside" each T, it just measures the "shallow" size of the collection. For instance, if this is a set of strings, then CountMemory(IntPtr.Size) is the size of the set including the references to the strings, but not including the strings themselves.
Returns

Referenced by Loyc.Collections.Set< Loyc.LLParserGenerator.AndPred >.CountMemory(), and Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.CountMemory().

◆ CountMemory() [2/2]

int Loyc.Collections.Impl.InternalSet< T >.CountMemory ( int  sizeOfT,
out InternalSetStats  stats 
)
inline

Measures the total size of all objects allocated to this collection, in bytes, and counts the number of nodes of different types.

◆ ExceptWith()

int Loyc.Collections.Impl.InternalSet< T >.ExceptWith ( IEnumerable< T >  other,
IEqualityComparer< T >  thisComparer 
)
inline

Removes all items from this set that are present in 'other'.

Parameters
otherThe set whose members should be removed from this set.
thisComparerThe comparer for this set (not for 'other', which is simply enumerated).
Returns
Returns the number of items that were removed.

◆ IntersectWith() [1/2]

int Loyc.Collections.Impl.InternalSet< T >.IntersectWith ( IEnumerable< T >  other,
IEqualityComparer< T >  comparer 
)
inline

Removes all items from this set that are not present in 'other'.

Parameters
otherThe set whose members should be kept in this set.
Returns
Returns the number of items that were removed.

This method is costly if 'other' is not a set; a temporary set will be constructed to answer the query. Also, this overload has the same subtle assumption as the other overload.

◆ IntersectWith() [2/2]

int Loyc.Collections.Impl.InternalSet< T >.IntersectWith ( InternalSet< T >  other,
IEqualityComparer< T >  otherComparer 
)
inline

Removes all items from this set that are not present in 'other'.

Parameters
otherThe set whose members should be kept in this set.
otherComparerThe comparer for 'other' (not for this set, which is simply enumerated).
Returns
Returns the number of items that were removed from the set.

Referenced by Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.IntersectWith().

◆ IsProperSubsetOf() [1/2]

bool Loyc.Collections.Impl.InternalSet< T >.IsProperSubsetOf ( IEnumerable< T >  other,
IEqualityComparer< T >  comparer,
int  myExactCount 
)
inline

Returns true if all items in this set are present in the other set, and the other set has at least one item that is not in this set.

This method is costly if 'other' is not a set; a temporary set will be constructed to answer the query. Also, this overload has the same subtle assumption as the other overload.

◆ IsProperSubsetOf() [2/2]

bool Loyc.Collections.Impl.InternalSet< T >.IsProperSubsetOf ( ISet< T >  other,
int  myExactCount 
)
inline

Returns true if all items in this set are present in the other set, and the other set has at least one item that is not in this set.

This implementation assumes that if the two sets use different definitions of equality (different IEqualityComparer<T>s), that neither set contains duplicates from the point of view of the other set. If this rule is broken–meaning, if either of the sets were constructed with the comparer of the other set, that set would shrink– then the results of this method are unreliable. If both sets use the same comparer, though, you have nothing to worry about.

Referenced by Loyc.Collections.Set< Loyc.LLParserGenerator.AndPred >.IsProperSubsetOf(), and Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.IsProperSubsetOf().

◆ IsProperSupersetOf() [1/2]

bool Loyc.Collections.Impl.InternalSet< T >.IsProperSupersetOf ( IEnumerable< T >  other,
IEqualityComparer< T >  comparer,
int  myExactCount 
)
inline

Returns true if all items in the other set are present in this set, and this set has at least one item that is not in the other set.

This method is costly if 'other' is not a set; a temporary set will be constructed to answer the query. Also, this overload has the same subtle assumption as the other overload.

◆ IsProperSupersetOf() [2/2]

bool Loyc.Collections.Impl.InternalSet< T >.IsProperSupersetOf ( ISet< T >  other,
IEqualityComparer< T >  thisComparer,
int  myExactCount 
)
inline

Returns true if all items in the other set are present in this set, and this set has at least one item that is not in the other set.

This implementation assumes that if the two sets use different definitions of equality (different IEqualityComparer<T>s), that neither set contains duplicates from the point of view of the other set. If this rule is broken–meaning, if either of the sets were constructed with the comparer of the other set, that set would shrink– then the results of this method are unreliable. If both sets use the same comparer, though, you have nothing to worry about.

Referenced by Loyc.Collections.Set< Loyc.LLParserGenerator.AndPred >.IsProperSupersetOf(), and Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.IsProperSupersetOf().

◆ IsSubsetOf()

bool Loyc.Collections.Impl.InternalSet< T >.IsSubsetOf ( ISet< T >  other,
int  myMinCount 
)
inline

Returns true if all items in this set are present in the other set.

Parameters
myMinCountSpecifies the minimum number of items that this set contains (use 0 if unknown)

Referenced by Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.IsProperSubsetOf(), Loyc.Collections.Set< Loyc.LLParserGenerator.AndPred >.IsSubsetOf(), and Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.SetEquals().

◆ IsSupersetOf()

bool Loyc.Collections.Impl.InternalSet< T >.IsSupersetOf ( IEnumerable< T >  other,
IEqualityComparer< T >  thisComparer,
int  myMaxCount = int.MaxValue 
)
inline

◆ Overlaps()

bool Loyc.Collections.Impl.InternalSet< T >.Overlaps ( IEnumerable< T >  other,
IEqualityComparer< T >  thisComparer 
)
inline

Returns true if this set contains at least one item from 'other'.

Referenced by Loyc.Collections.Set< Loyc.LLParserGenerator.AndPred >.Overlaps().

◆ Remove()

bool Loyc.Collections.Impl.InternalSet< T >.Remove ( ref T  item,
IEqualityComparer< T >  comparer 
)
inline

Removes an item from the set.

Returns
true if the item was removed, false if it was not found.

Referenced by Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.ExceptWith(), and Loyc.Collections.Impl.InternalSet< KeyValuePair< K, V > >.SymmetricExceptWith().

◆ SetEquals() [1/2]

bool Loyc.Collections.Impl.InternalSet< T >.SetEquals ( IEnumerable< T >  other,
IEqualityComparer< T >  comparer,
int  myExactCount 
)
inline

Returns true if this set and the other set have the same items.

This method is costly if 'other' is not a set; a temporary set will be constructed to answer the query. Also, this overload has the same subtle assumption as the other overload.

◆ SetEquals() [2/2]

bool Loyc.Collections.Impl.InternalSet< T >.SetEquals ( ISet< T >  other,
int  myExactCount 
)
inline

Returns true if this set and the other set have the same items.

This implementation assumes that if the two sets use different definitions of equality (different IEqualityComparer<T>s), that neither set contains duplicates from the point of view of the other set. If this rule is broken–meaning, if either of the sets were constructed with the comparer of the other set, that set would shrink– then the results of this method are unreliable. If both sets use the same comparer, though, you have nothing to worry about.

Referenced by Loyc.Collections.Set< Loyc.LLParserGenerator.AndPred >.SetEquals().

◆ SymmetricExceptWith()

int Loyc.Collections.Impl.InternalSet< T >.SymmetricExceptWith ( IEnumerable< T >  other,
IEqualityComparer< T >  comparer,
bool  xorDuplicates = true 
)
inline

Modifies the current set to contain only elements that were present either in this set or in the other collection, but not both.

Parameters
xorDuplicatesControls this function's behavior in case 'other' contains duplicates. If xorDuplicates is true, an even number of duplicates has no overall effect and an odd number is treated the same as if there were a single instance of the item. Setting xorDuplicates to false is costly, since a temporary set is constructed in order to eliminate any duplicates. The same comparer is used for the temporary set as for this set.

Returns the change in set size (positive if items were added, negative if items were removed)

◆ Thaw()

Thaws a frozen root node by duplicating it, or creates the root node if the set doesn't have one.

Since InternalSet<T> is a structure rather than a class, it's not immediately obvious what the happens when you copy it with the '=' operator. The InternalList<T> structure, for example, it is unsafe to copy (in general) because as the list length changes, the two (or more) copies immediately go "out of sync" because each copy has a separate Count property and a separate array pointer–and yet they will share the same array, at least temporarily, which can produce strange results.

It is mostly safe to copy InternalSet instances, however, because they only contain a single piece of data (a reference to the root node), and the root node only changes in two situations:

  1. When the root node is null and you call Add or this method
  2. When the root node is frozen and you modify the set or call this method

In the second case, when you have frozen a set with CloneFreeze(), all existing copies are frozen, and further changes affect only the specific copy that you change. You can also call Thaw() if you need to make copies that are kept in sync, without actually modifying the set first.

This method has no effect if the root node is already thawed.

◆ UnionWith()

int Loyc.Collections.Impl.InternalSet< T >.UnionWith ( IEnumerable< T >  other,
IEqualityComparer< T >  thisComparer,
bool  replaceIfPresent 
)
inline

Adds the contents of 'other' to this set.

Parameters
thisComparerThe comparer for this set (not for 'other', which is simply enumerated).
replaceIfPresentIf items in 'other' match items in this set, this flag causes those items in 'other' to replace the items in this set.

Member Data Documentation

◆ DefaultComparer

readonly IEqualityComparer<T> Loyc.Collections.Impl.InternalSet< T >.DefaultComparer = typeof(IReferenceEquatable).IsAssignableFrom(typeof(T)) ? null : EqualityComparer<T>.Default
static

This is EqualityComparer<T>.Default, or null if T implements IReferenceEquatable.

◆ Empty

readonly InternalSet<T> Loyc.Collections.Impl.InternalSet< T >.Empty = new InternalSet<T> { _root = FrozenEmptyRoot() }
static

An empty set.

This property comes with a frozen, empty root node, which Set<T> uses as an "initialized" flag.