Automatic Field Extraction of Extended TLV for Binary Protocol Reverse Engineering
Date
2022-12-22
Authors
Huang, Zewen
Abstract
Type-Length-Value (TLV) is one of the main structures used in network
protocols. A large number of proprietary protocols, whose specifications are not
public, run on today's Internet as well as in domain-specific Internet of Things
(IoT) applications. It is critical to infer the TLV fields within a packet because
this information can help network administrators quickly identify abnormal traffic
and potential attacks. Inferring TLV fields belongs to the general task of protocol
reverse engineering and is particularly challenging for binary protocols, where the
boundaries of TLV fields can fall at many possible positions. Existing methods for reverse
engineering binary protocols require many parameters and work only for protocols
that strictly follow the conventional TLV format. We extend the concept of TLV to
accommodate a broader category of structural patterns in various binary protocols,
such as TCP, IP, Modbus, and MQTT. We then design algorithms to automatically
extract the extended-TLV fields from packets. Via a series of experiments over several
protocols, we demonstrate that our algorithms can accurately and quickly identify the
extended-TLV fields in all the tested protocols. Our approach can thus be deployed
as a general method for automatically reverse engineering binary protocol formats.
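
For readers unfamiliar with the conventional TLV layout that the abstract takes as its starting point, the following minimal Python sketch parses a byte string whose format is already known. It is an illustration only and is not the extended-TLV extraction algorithm of this work. The one-byte type, the one-byte length, the function name parse_tlv, and the sample packet are all assumptions made for the example. In the reverse engineering setting the field widths and boundaries are unknown, which is what makes the inference problem hard.

```python
from typing import List, Tuple


def parse_tlv(data: bytes) -> List[Tuple[int, bytes]]:
    """Split data into (type, value) pairs.

    Assumed layout (hypothetical, for illustration): each field is a
    1-byte type, a 1-byte length, then `length` bytes of value.
    """
    fields: List[Tuple[int, bytes]] = []
    offset = 0
    while offset < len(data):
        # A field header needs two bytes: type and length.
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        field_type = data[offset]
        length = data[offset + 1]
        start = offset + 2
        end = start + length
        # The declared length must not run past the end of the packet.
        if end > len(data):
            raise ValueError("declared length exceeds remaining bytes")
        fields.append((field_type, data[start:end]))
        offset = end
    return fields


# Made-up packet: three fields with value lengths 2, 0, and 1.
packet = bytes([0x01, 0x02, 0xAA, 0xBB, 0x02, 0x00, 0x03, 0x01, 0xFF])
for field_type, value in parse_tlv(packet):
    print(field_type, value.hex())
```

With a known format, each length byte tells the parser exactly where the next field begins. Without a specification, any byte could be a type, a length, or part of a value, so the number of candidate boundary placements grows quickly with packet size.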