Skip to content

fit() and flat() broken for simple uses of filler= #1517

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zachriggle opened this issue May 26, 2020 · 8 comments
Closed

fit() and flat() broken for simple uses of filler= #1517

zachriggle opened this issue May 26, 2020 · 8 comments
Assignees
Milestone

Comments

@zachriggle
Copy link
Member

The flat and fit routines take an arbitrary iterable for the filler= argument. This seems to be broken in recent releases.

Pwntools 3.13.0

>>> fit({10: 'A'}, length=32, filler='_')
'__________A_____________________'

Pwntools 4.2.0 (works)

>>> fit({10: 'A'}, length=32, filler=[ord('_')])
b'__________A_____________________'

Pwntools 4.2.0 (broken, but should work)

>>> fit({10: 'A'}, length=32, filler='_')
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-1-304fe6d91d58> in <module>()
----> 1 fit({10: 'A'}, length=32, filler='_')

/home/pwntools/pwntools/pwnlib/util/packing.py in fit(*args, **kwargs)
    658 def fit(*args, **kwargs):
    659     """Legacy alias for `:func:flat`"""
--> 660     return flat(*args, **kwargs)
    661
    662 """

/home/pwntools/pwntools/pwnlib/context/__init__.py in setter(*a, **kw)
   1457         kw.setdefault('arch', 'none')
   1458         with context.local(**{k:kw.pop(k) for k,v in tuple(kw.items()) if isinstance(getattr(ContextType, k, None), property)}):
-> 1459             return function(*a, **kw)
   1460     return setter
   1461

/home/pwntools/pwntools/pwnlib/util/packing.py in flat(*args, **kwargs)
    647
    648     filler = iters.cycle(filler)
--> 649     out = _flat(args, preprocessor, make_packer(), filler)
    650
    651     if length:

/home/pwntools/pwntools/pwnlib/util/packing.py in _flat(args, preprocessor, packer, filler)
    549             val = _flat(arg, preprocessor, packer, filler)
    550         elif isinstance(arg, dict):
--> 551             filler, val = _fit(arg, preprocessor, packer, filler)
    552         elif isinstance(arg, bytes):
    553             val = arg

/home/pwntools/pwntools/pwnlib/util/packing.py in _fit(pieces, preprocessor, packer, filler)
    528         # Fill up to offset
    529         while len(out) < k:
--> 530             out += p8(next(filler))
    531
    532         # Recursively flatten data

/home/pwntools/pwntools/pwnlib/context/__init__.py in setter(*a, **kw)
   1457         kw.setdefault('arch', 'none')
   1458         with context.local(**{k:kw.pop(k) for k,v in tuple(kw.items()) if isinstance(getattr(ContextType, k, None), property)}):
-> 1459             return function(*a, **kw)
   1460     return setter
   1461

/home/pwntools/pwntools/pwnlib/util/packing.py in routine(number)
    338                 ("little", False):  lu,
    339                 ("big",    True  ): bs,
--> 340                 ("big",    False):  bu}[endian, signed](number)
    341
    342     routine.__name__ = name

/home/pwntools/pwntools/pwnlib/util/packing.py in routine(data)
    303
    304     def routine(data):
--> 305         return ops[op](fmt,data)
    306     routine.__name__ = routine.__qualname__ = name
    307

error: cannot convert argument to integer
@zachriggle
Copy link
Member Author

Following up, it looks like there were tests for this in Pwntools 3.13, but they were rewritten at some point. Pwntools 3.13 shows:

      >>> fit({12: 'XXXX'}, filler = 'AB', length = 20)
      'ABABABABABABXXXXABAB'

@zachriggle
Copy link
Member Author

It looks like these are the changes that broke things.

@@ -516,14 +520,14 @@ def _fit(pieces, preprocessor, packer, filler):
     filler = iters.chain(pad, filler)

     # Build output
-    out = ''
+    out = b''
     for k, v in sorted(pieces.items()):
         if k < len(out):
             raise ValueError("flat(): data at offset %d overlaps with previous data which ends at offset %d" % (k, len(out)))

         # Fill up to offset
         while len(out) < k:
-            out += filler.next()
+            out += p8(next(filler))

and

@@ -647,7 +651,7 @@ def flat(*args, **kwargs):
     if length:
         if len(out) > length:
             raise ValueError("flat(): Arguments does not fit within `length` (= %d) bytes" % length)
-        out += ''.join(filler.next() for _ in xrange(length - len(out)))
+        out += b''.join(p8(next(filler)) for _ in range(length - len(out)))

Removing the p8 doesn't fix the issue, but seems to be on the right track

@zachriggle
Copy link
Member Author

It looks like another part of the issue is the internal switch to use bytearray? I'm not sure, still trying to diagnose this

@zachriggle
Copy link
Member Author

zachriggle commented May 26, 2020

It looks like something underlying iters.cycle also changed significantly, which comes from itertools.cycle.

It seems the p8 above were added to handle this situation, which makes things sort-of work on Python3, and broken on Python2. This seems to be a core Py3-vs-Py2 issue that needs to be worked around.

Python2

>>> x = iters.cycle(b'asdf')
>>> next(x)
'a'
>>> next(x)
's'
>>> next(x)
'd'
>>> next(x)
'f'

Python3

With bytes

>>> x = iters.cycle(b'asdf')
>>> next(x)
97
>>> next(x)
115
>>> next(x)
100
>>> next(x)
102

With a string

>>> x = iters.cycle('asdf')
>>> next(x)
'a'
>>> next(x)
's'
>>> next(x)
'd'
>>> next(x)
'f'

@zachriggle
Copy link
Member Author

Honestly this is a Python3 gaffe and I think the easiest way to work around it is to hack the p8 routine to do the "magical correct thing" if the argument is a bytes-like object.

@Arusekk
Copy link
Member

Arusekk commented May 27, 2020

It looks like something underlying iters.cycle also changed significantly, which comes from itertools.cycle.

@zachriggle Well, just this iters.cycle 'bug' is a total WONTFIX, since this function has little right to object to python iteration mechanism. It would just not be consistent to treat some types special instead of just giving the impression of chain(iter(x), iter(x), iter(x), ...) uniformly for all types. Whatever a type yields while iterating, is yielded by cycling it.

As for the actual issue, though, p8 was just a choice of the moment (totally mea culpa), should be definitely changed to something more generic, like, say, flat(x)[0]? I'm open for ideas, but the following is to be considered.

So what if:

  • The iterable is yielding strings longer than 2? Could be resolved by something like lazy_flat(cycle(filler)), which would return a generator of uint8s or just unsupported.
  • The iterable is malicious, yielding empty strings all the time? Could be stated that this is not supported, or special-casing empty iterables / of known length 0 (before passing to cycle()? does next(cycle('')) hang forever?).
  • The iterable yields integers not in range(256)? Could be just unsupported or encoded like in bullet no. 1 (if so, what with endianness?) --- generally seems like a mess that needs special-casing and different treatment for <256 and the rest (we want to still support passing filler=b'_', right? ... right...?).
  • (possible more useful edge-case scenarios, but I did not think more).

But hey, we could also make a cute hack that makes p8 be like context._encode on strings!
EDIT: Actually not a chance I am personally implementing such a hack, although it's so tempting for a user to have a black box do things for them, isn't it.

@zachriggle
Copy link
Member Author

zachriggle commented May 27, 2020

Ultimately the issue comes down to:

Python3

>>> x = b'a'
>>> x[0] == x
False
>>> x[0]
97

Python2

>>> x = b'a'
>>> x[0] == x
True
>>> x[0]
'a'

So it looks like iterating over a byte-array is generating ints, in which case the p8 is necessary on Python3 but not necessary on Python2.

I'll spend some time seeing if I can come up with a more elegant solution or wrapper for this.

@zachriggle
Copy link
Member Author

It also occurs to me that using flat is a good way to ensure we always get a bytes-like object regardless of what is passed in, and might be a good thing to start using from e.g. tubes instead of .encode() and .decode().

@heapcrash heapcrash added this to the Someday milestone Jun 22, 2020
heapcrash added a commit to heapcrash/pwntools that referenced this issue Jun 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants