Feature extraction is critical for TLS traffic analysis using machine
learning techniques, but it is also difficult and time-consuming,
requiring substantial engineering effort. We designed and implemented DeepTLS, a
system that extracts a full spectrum of features from pcaps, covering meta,
statistical, SPLT (sequence of packet lengths and times), byte distribution,
TLS header, and certificate features. The backend
is written in C++ for high performance and can analyze a gigabyte-sized pcap
in a few minutes. DeepTLS was thoroughly evaluated against two state-of-the-art
tools, Joy and Zeek, on four well-known malicious traffic datasets comprising
160 pcaps. The evaluation results show that DeepTLS analyzes large pcaps in
half the analysis time and identifies more certificates, with an acceptable
performance loss compared with Joy. DeepTLS can significantly accelerate
machine learning pipelines by reducing feature extraction time from hours or
even days to minutes. The system is online at https://deeptls.com, where test
artifacts can be viewed and validated. In addition, two open-source tools,
Pysharkfeat and Tlsfeatmark, have been released.
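
To make two of the feature families named above concrete, the following is a minimal Python sketch of how SPLT and byte-distribution features are commonly computed for a single flow. It is not the DeepTLS implementation (whose backend is C++). The input format, the function names `splt` and `byte_distribution`, and the `max_packets` cutoff are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical input format: one flow as (timestamp in seconds, payload bytes).
Packet = Tuple[float, bytes]


def splt(packets: List[Packet], max_packets: int = 20) -> List[Tuple[int, float]]:
    """Return (payload length, inter-arrival time) for the first packets of a flow.

    The max_packets cutoff is an assumed parameter, not a DeepTLS setting.
    """
    features = []
    prev_ts = None
    for ts, payload in packets[:max_packets]:
        # The first packet has no predecessor, so its gap is defined as 0.
        gap = 0.0 if prev_ts is None else ts - prev_ts
        features.append((len(payload), gap))
        prev_ts = ts
    return features


def byte_distribution(packets: List[Packet]) -> List[float]:
    """Return the relative frequency of each byte value 0..255 over all payloads."""
    counts = Counter()
    for _, payload in packets:
        counts.update(payload)  # iterating over bytes yields ints in 0..255
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 256
    return [counts[b] / total for b in range(256)]
```

Both functions return fixed-shape numeric features (once SPLT is padded or truncated to `max_packets`), which is what makes them convenient as direct input to a machine learning model.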

