Skip to content

Various bugfixes / enhancements for <flat_set> #3993

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

achabense
Copy link
Contributor

@achabense achabense commented Aug 26, 2023

Works towards #2912.

Replace #3985; the commits are reorganized for ease of review.
Major changes are:

  1. Fix usage of _Compressed_pair; the implementation was erroneously trying to compress the container instead of key_compare.
  2. Implement LWG-3884 (flat_meow(flat_meow&&, al)).
  3. Implement missing insert(initializer_list), and make operator=(initializer_list) available for users of flat_meow.
  4. Fix emplace[_hint] and insert([hint,]val) functions.
    4.1. Some functions couldn't compile, including insert(const key&) and hint functions.
    4.2. Some function signatures were non-conformant.
    4.3. The bound searching algorithm in _Emplace_hint was wrong. flat_set must find lower_bound for correctness, and flat_multiset should find upper_bound for efficiency and conformance to the standard.
  5. Various other cleanups and enhancements.

Thanks to @duanqn and @frederick-vs-ja for previous review on #3985 😽

3.2. Make constructor declarations strictly in line with the standard
3.3. Various cleanups or enhancements for constructors
3.4. Expose `operator=(initializer_list)`
6.2. Fix signature for `insert(iter, iter)`
6.3. Implement `insert(initializer_list)`
6.4. Make declarations strictly in line with the standard
@@ -132,9 +119,97 @@ void test_constructors() {
flat_set<int> a{};
a = {1, 7, 7, 7, 2, 100, -1};
assert_all_requirements_and_equals(a, {-1, 1, 2, 7, 100});
assert_all_requirements_and_equals(flat_set<int>(a, allocator<int>{}), {-1, 1, 2, 7, 100});
assert_all_requirements_and_equals(flat_set<int>(std::move(a), allocator<int>{}), {-1, 1, 2, 7, 100});
flat_multiset<int> b{};
b = {1, 7, 7, 7, 2, 100, -1};
Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was passing because the a flat_multiset was constructed from initializer-list and then operator=(flat_multiset&&) was called (so the old allocator might be forgotten). Now it's really operator=(initializer_list).

}

auto insert(const value_type& _Val) {
return _Insert<_Kty>(_Val);
return _Emplace(_Val);
Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would fail to compile before the change, as _Insert<_Kty>(_Val) was actually _Insert(_Kty&&).

flat_set<int> st;
const int a = 1;
st.insert(a);

} else {
// _Val < (*_Where - 1)
_Where = _STD upper_bound(_Begin, _Where, _Val, _Compare);
Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After making _Emplace_hint compilable with emplace_hint functions (it was using void as return type and tried to return const_iterator as iterator in a branch), I realized that this line was making the following test fail (fixed now):

flat_set<int, lt, C> a{0, 5};
assert_all_requirements_and_equals(a, {0, 5});
a.emplace_hint(a.end());
assert_all_requirements_and_equals(a, {0, 5}); // < got {0, 0, 5}!

stl/inc/flat_set Outdated
_STL_INTERNAL_STATIC_ASSERT(_Keylt_transparent && is_constructible_v<_Kty, _Ty>);
_Kty _Keyval{_STD forward<_Ty>(_Val)};
_STL_ASSERT(lower_bound(_Keyval) == _Where && !_Keys_equal(_Keyval, *_Where),
"find(val) was not equal to find(key_type{forward(val)})");
Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion is to avoid invariant breakage. It can capture errorneous usages like:

std::flat_set<int, std::less<>> s{ 1,2,3,5,7 };
s.insert(2.5);

Which is UB, as find(2.5)->end, but find(int(2.5))->2.
I'm still highly uncertain what this precondition implies to the implementation. Is flat_set::insert([hint,]auto&&) really well-defined by the standard?
image

Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, I'm assuming that lower_bound(val)==lower_bound(key{forward(val)}). Can I make that assumption? If so, then how is this guaranteed by the standard? If not, then how should hint be used?

Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, for the first overload insert(K&& x), the bool cannot indicate whether x is moved if it was rvalue, as it's unclear (to me) whether the emplace is supposed to succeed. !contains(val) implies !contains(key(val)) by the standard, so the insert is suppose to succeed then.

Copy link
Contributor Author

@achabense achabense Aug 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current understanding is that, the acceptable types should be equivalent in a sense as if comparing the same thing (for example, there are foo::key, bar::key and baz::key, and the transparent comparator should compare them as if by comparing the key.) So lower_bound(val)==lower_bound(key{forward(val)}) should really hold true. (however)

if constexpr (!_Multi) {
_STL_ASSERT(_Is_unique(), "Input was not unique!");
_STL_ASSERT(_Is_unique(), "Input was sorted but not unique!");
Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If just for debug efficiency, is_sorted && _Is_unique can be replaced by checking [0] lt [1] lt [2] ... in a single pass.

const iterator _Old_end = begin() + static_cast<difference_type>(_Old_size);
const iterator _New_end = end();
void _Restore_invariants_after_insert(const size_type _Old_size) {
const key_compare& _Compare = _Get_comp();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made all key_compare& to const key_compare&, as I believe there shouldn't be any difference between operator(...)& and operator(...)const& (but I cannot find a formal citation for this constraint :(

a.insert(a.begin(), 6);
assert_all_requirements_and_equals(a, {0, 0, 0, 5, 6});
a.insert(a.begin(), _dat[1]);
assert_all_requirements_and_equals(a, {0, 0, 0, 1, 5, 6});
Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_all_requirements_and_equals is making the test hard to debug, as it holds asserts within itself, which will always report fixed lines. I think we should improve debug information (at least showing line number).

void assert_all_requirements_and_equals(const T& s, const initializer_list<typename T::value_type>& il) {
assert_container_requirements(s);
assert_reversible_container_requirements(s);
auto val_comp = s.value_comp();
auto begin_it = s.cbegin();
auto end_it = s.cend();
assert(std::is_sorted(begin_it, end_it, val_comp));

@@ -152,6 +170,7 @@ public:
_NODISCARD const_iterator end() const noexcept {
return _Get_cont().end();
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some newlines are added/removed for better visual effect & to group the functions roughly in the same way as the standard.

@@ -135,11 +152,12 @@ public:
: _Base_flat_set(_Tsort, container_type(_Ilist.begin(), _Ilist.end(), _Al)) {}

_Deriv& operator=(initializer_list<_Kty> _Ilist) {
_Get_cont() = container_type(_Ilist.begin(), _Ilist.end());
_Get_cont().assign(_Ilist.begin(), _Ilist.end());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe assign is the right thing here, as it respects the container's allocator and capacity.

const iterator _End = end();
const iterator _Where = lower_bound(_Val);
const const_iterator _End = cend();
const const_iterator _Where = lower_bound(_Val);
if (_Where != _End && _Keys_equal(*_Where, _Val)) {
Copy link
Contributor Author

@achabense achabense Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of usages of _Keys_equal can be simplified to a single comparision, as it is used after lower_bound (so it's enough to test _Val >= *_Where for equality). However I think we'd better introduce these changes as enhancements in future prs.

Update: I think it's ok to update some obvious cases in this pr. (Implemented in the 3rd commit df36cb7 in this comment (not yet pushed to this pr) )

@@ -164,6 +183,7 @@ public:
_NODISCARD const_reverse_iterator rend() const noexcept {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For rbegin(), I think it's not necessary to call _Get_cont().rend(), as it is required to be reverse_iterator(begin()), and the container does not necessarily offer rbegin() method (am I mistaken?). The same goes for rend(), crbegin() and crend().

Copy link
Contributor

@frederick-vs-ja frederick-vs-ja Aug 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the container does not necessarily offer rbegin() method

No, rbegin must be provided since flat_set/flat_multiset requires the adapted container to be a reversible container ([flat.set.overview]/2, [flat.multiset.overview]/2).

Copy link
Contributor Author

@achabense achabense Aug 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, that's the requirements for flat_meow; as to the container to adapt, the overview also says "Any sequence container supporting Cpp17RandomAccessIterator can be used to instantiate flat_meow" (ref), and I think supporting Cpp17RandomAccessIterator doesn't require the container should also be reversible (ref).

Copy link
Contributor

@frederick-vs-ja frederick-vs-ja Aug 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I was just confused. The requirement on being a reversible container is always additionally specified and not implied by other requirements.

@achabense
Copy link
Contributor Author

Also, I'm having a sense that a lot of functions are still not rubost against exceptions. We may need more usages _Clear_scope_guard here and there. (I think we need a single issue/pr for this problem.)

T st{};
T& ref = st;
const T& cref = st;

Copy link
Contributor Author

@achabense achabense Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think declval will hurt readability; but it seems readability is not very important in tests; should I use declval instead?

( And I feel this noexcept test is off-topic. Most of tests added in this pr are ad-hoc ones to prove bugfixes, and I‘m especially dissatisfied with test_insert_1/2 tests as they look fragile and verbose, and lack coverage of corner cases. I think a lot of things in test.cpp should be rewritten in the future. )

Update: I've removed this part; I think this pr shouldn't introduce irrelevant test coverage, and we need separate prs to overhaul the tests, including those introduced in this pr.

@CaseyCarter CaseyCarter added the bug Something isn't working label Aug 29, 2023
@StephanTLavavej StephanTLavavej added the flat_meow C++23 container adaptors label Aug 29, 2023
9.1. Use direct-non-list-initialization in emplace functions
9.1 Remove irrelevant noexcept test; fix or improve some tests
@achabense
Copy link
Contributor Author

achabense commented Aug 30, 2023

Followup to this comment. I find insert([hint,]auto&&) still very confusing.

1.std::set also have this set of overload. However it seems there is no according implementation in the library.
https://eel.is/c++draft/set.modifiers
https://github.com/microsoft/STL/blob/main/stl/inc/xtree
I didn't realize that's c++26 feature...

  1. I'm highly uncertain whether the following code is well-defined according to the standard (even if the precondition find(confusing{1})==find(int{confusing{1}})==end is met here); and if so, how should the hint be possibly used. The current implementation is rejecting this in debug mode and will break invariant in release mode.
#include <flat_set>

struct confusing {
    int val;
    operator int()const { return val - 10; }
};

struct lt {
    bool operator()(const int& a, const int& b)const { return a < b; }
    bool operator()(const int& a, const confusing& b)const {
        return a < (b.val + 10);
    }
    bool operator()(const confusing& a, const int& b)const {
        return (a.val + 10) < b;
    }
    bool operator()(const confusing& a, const confusing& b)const { return a.val < b.val; }

    using is_transparent = int;
};

int main() {
    std::flat_set<int, lt> a;
    a.insert(1);
    a.insert(a.end(), confusing{ 1 });
}

@achabense
Copy link
Contributor Author

achabense commented Aug 31, 2023

Here are some more commits I'm hesitant to push:

  1. 312c496 More conservative insert(auto&&) that allows cases shown above. (It seems that, for cases where lower_bound(val) cannot always be lower_bound(key(val)) (like the above case), the time complexity of insert(hint,auto&&) cannot be amortized constant, as at least one hintless lower_bound search is required, either to decide ?contains(val) or to find lower_bound(key(val)) for insertion position.)

  2. 68bcb91 Support rbegin/... even if the container doesn't support them. (or, should we use _RANGES rbegin/... instead? should we also prefer _RANGES algorithms to their std versions?) not planned in this pr.

  3. df36cb7 convert some _Keys_equal to a single comparision. (There are more candidates, but these are the most obvious ones. Only _Erase_dupes_if_needed actually needs _Key_equal.)

--update--
Also, there is currenctly no tests for flat_set::insert([hint,]auto&&). Should I also add tests like this (as test_insert_3)?

#include<flat_set>
#include<string>
#include<cassert>

struct foo {
    std::string key;
};

struct bar {
    std::string key;
    operator foo()const& { return { key }; }
    operator foo()&& { return { std::move(key) }; }
};

struct comp {
    bool operator()(const auto& l, const auto& r)const {
        return l.key < r.key;
    }

    using is_transparent = int;
};

int test() {
    std::flat_set<foo, comp> st;
    st.insert(foo{ "123" });
    bar b{ "123" };
    st.insert(std::move(b));
    assert(b.key == "123"); // not moved
}

@StephanTLavavej StephanTLavavej self-assigned this Sep 2, 2023
@@ -187,33 +208,36 @@ public:
return _Get_cont().max_size();
}

// modifiers
template <class... _Args>
auto emplace(_Args&&... _Vals) {
Copy link
Contributor Author

@achabense achabense Sep 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I think _Vals is slightly confusing (as insertion functions are naming key_type as _Val). Better be (_Types&&... _Args).

_Keylt& _Compare = _Get_comp();
const iterator _Begin = begin();
const iterator _End = end();
auto _Emplace(_Ty&& _Val) {
Copy link
Contributor Author

@achabense achabense Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_Emplace[_hint] is used for single-element insertion(key_type and auto&&, just like insert functions), but uses container.emplace for performance guarantee. I'm not sure whether it's more suitable to name as _Insert[_hint]...

@StephanTLavavej StephanTLavavej removed their assignment Sep 8, 2023
@StephanTLavavej
Copy link
Member

Thanks! I pushed changes for minor nitpicks and will merge after the tests pass.

I noticed one mostly-pre-existing issue - there's code that's moving from a container's comparator, but this has the potential to break container invariants. Notably, moving from a std::function can do this (the moved-from std::function is left in a valid-but-unspecified state, and in our implementation we empty it out).

IIRC implementations vary on how they handle this case (which the Standard really should provide better clarity on). In the associative containers and container adaptors, we've chosen to copy from comparators to avoid this problem. It seems that flat_meow should do the same as priority_queue etc.

@StephanTLavavej StephanTLavavej merged commit 1b1e241 into microsoft:feature/flat_set Sep 8, 2023
@StephanTLavavej
Copy link
Member

🥞 😸 🎉

@achabense achabense deleted the _For_flat_set_v2 branch September 9, 2023 01:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working flat_meow C++23 container adaptors
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants