TY - CHAP
T1 - Efficiently Detecting Web Spambots in a Temporally Annotated Sequence
AU - Alamro, Hayam
AU - Iliopoulos, Costas S.
AU - Loukides, Grigorios
PY - 2020/9/5
Y1 - 2020/9/5
N2 - Web spambots are becoming more advanced, utilizing techniques that can defeat existing spam detection algorithms. These techniques include performing a series of malicious actions with variable time delays, repeating the same series of malicious actions multiple times, and interleaving legitimate (decoy) and malicious actions. Existing methods that are based on string pattern matching are not able to detect spambots that use these techniques. In response, we define a new problem to detect spambots utilizing the aforementioned techniques and propose an efficient algorithm to solve it. Given a dictionary of temporally annotated sequences modeling spambot actions, each associated with a time window, a long, temporally annotated sequence T modeling a user action log, and parameters f and k, our problem seeks to detect each sequence in the dictionary that occurs in T at least f times within its associated time window, and with at most k mismatches. Our algorithm solves the problem exactly, requires linear time and space, and employs advanced data structures and the Kangaroo method to deal with the problem efficiently.
AB - Web spambots are becoming more advanced, utilizing techniques that can defeat existing spam detection algorithms. These techniques include performing a series of malicious actions with variable time delays, repeating the same series of malicious actions multiple times, and interleaving legitimate (decoy) and malicious actions. Existing methods that are based on string pattern matching are not able to detect spambots that use these techniques. In response, we define a new problem to detect spambots utilizing the aforementioned techniques and propose an efficient algorithm to solve it. Given a dictionary of temporally annotated sequences modeling spambot actions, each associated with a time window, a long, temporally annotated sequence T modeling a user action log, and parameters f and k, our problem seeks to detect each sequence in the dictionary that occurs in T at least f times within its associated time window, and with at most k mismatches. Our algorithm solves the problem exactly, requires linear time and space, and employs advanced data structures and the Kangaroo method to deal with the problem efficiently.
KW - Action logs
KW - Temporally annotated sequence
KW - Web spambot
UR - http://www.scopus.com/inward/record.url?scp=85083737105&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-44041-1_87
DO - 10.1007/978-3-030-44041-1_87
M3 - Conference paper
AN - SCOPUS:85083737105
SN - 9783030440404
VL - 1151
T3 - Advances in Intelligent Systems and Computing
SP - 1007
EP - 1019
BT - Advanced Information Networking and Applications - Proceedings of the 34th International Conference on Advanced Information Networking and Applications, AINA-2020, Caserta, Italy, 15-17 April
A2 - Barolli, Leonard
A2 - Amato, Flora
A2 - Moscato, Francesco
A2 - Enokido, Tomoya
A2 - Takizawa, Makoto
PB - Springer
T2 - 34th International Conference on Advanced Information Networking and Applications, AINA 2020
Y2 - 15 April 2020 through 17 April 2020
ER -