Automatic Field Extraction of Extended TLV for Binary Protocol Reverse




Huang, Zewen




Type-Length-Value (TLV) is one of the most common structures in network protocols. Many proprietary protocols, whose specifications are not publicly available, run on today's Internet and in domain-specific Internet of Things (IoT) applications. Inferring the TLV fields within a packet is critical because this information helps network administrators quickly identify abnormal traffic and potential attacks. TLV field inference is part of the general task of protocol reverse engineering, and it is particularly challenging for binary protocols, where a field boundary can fall at many possible positions. Existing methods for reverse engineering binary protocols involve many parameters and work only for protocols that strictly follow the conventional TLV format. We extend the concept of TLV to accommodate a broader category of structural patterns found in various binary protocols, such as TCP, IP, ModBus, and MQTT. We then design algorithms that automatically extract the extended-TLV fields from packets. Through a series of experiments over several protocols, we demonstrate that our algorithms accurately and quickly identify the extended-TLV fields in all the tested protocols. Our approach can thus be deployed as a general method for automatically reverse engineering binary protocol formats.
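
For readers unfamiliar with the conventional TLV format mentioned above, the following minimal Python sketch shows how such records are parsed when the layout is already known. It is an illustration only, not the extraction algorithm of this work: the function name `parse_tlv`, the 1-byte type field, and the 1-byte length field are assumptions chosen for simplicity. Reverse engineering has to solve the inverse problem, recovering these boundaries without knowing the field widths or positions.

```python
from typing import List, Tuple


def parse_tlv(data: bytes) -> List[Tuple[int, bytes]]:
    """Parse back-to-back TLV records.

    Assumed layout (hypothetical, for illustration): a 1-byte type,
    a 1-byte length, then `length` bytes of value.
    """
    fields = []
    offset = 0
    while offset < len(data):
        # A record needs at least a type byte and a length byte.
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        field_type = data[offset]
        length = data[offset + 1]
        start = offset + 2
        end = start + length
        # The declared length must not run past the end of the packet.
        if end > len(data):
            raise ValueError("length field exceeds packet size")
        fields.append((field_type, data[start:end]))
        offset = end
    return fields


# Example packet: type 0x01 with a 2-byte value, then type 0x02 with a 1-byte value.
packet = bytes([0x01, 0x02, 0xAA, 0xBB, 0x02, 0x01, 0xCC])
print(parse_tlv(packet))
```

When the specification is unknown, any byte could in principle be a length field, which is why the number of candidate field boundaries in a binary packet is large.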