Efficiently Detecting Disguised Web Spambots (with Mismatches) in a Temporally Annotated Sequence

Research output: Chapter in Book/Report/Conference proceeding; Conference paper; peer-review


Abstract

Web spambots are becoming more advanced, utilizing techniques that can defeat existing spam detection algorithms. These techniques include performing a series of malicious actions with variable time delays, repeating the same series of malicious actions multiple times, and interleaving legitimate (decoy) and malicious actions. Existing methods that are based on string pattern matching are not able to detect spambots that use these techniques. In response, we define a new problem to detect spambots utilizing the aforementioned techniques and propose an efficient algorithm to solve it. Given a dictionary of temporally annotated sequences $\overline{S}$ modeling spambot actions, each associated with a time window, a long, temporally annotated sequence $T$ modeling a user action log, and parameters $f$ and $k$, our problem seeks to detect each degenerate sequence $\tilde{S}$ with $c$ indeterminate action(s) in $\overline{S}$ that occurs in $T$ at least $f$ times within its associated time window, and with at most $k$ mismatches. Our algorithm solves the problem exactly, requires linear time and space, and employs advanced data structures, bit masking, and the Kangaroo method to deal with the problem efficiently.
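
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the paper and uses none of its data structures, bit masking, or the Kangaroo method. It rests on several assumptions that the abstract does not confirm: an occurrence is a contiguous alignment in the log, a mismatch is a position whose action is not in the allowed set, and the time window bounds the span between the first and last aligned actions. The names `count_occurrences` and `detect` are hypothetical.

```python
from typing import List, Set, Tuple

LogEntry = Tuple[str, float]   # (action, timestamp): assumed encoding of T
Pattern = List[Set[str]]       # degenerate sequence: a set of allowed actions per position


def count_occurrences(log: List[LogEntry], pattern: Pattern,
                      window: float, k: int) -> int:
    """Count contiguous alignments of `pattern` in `log` that have at most
    `k` mismatches and whose time span is at most `window`.
    Naive O(n * m) baseline, for illustration only."""
    m = len(pattern)
    if m == 0:
        return 0
    count = 0
    for i in range(len(log) - m + 1):
        # Assumption: the window constrains last timestamp minus first.
        if log[i + m - 1][1] - log[i][1] > window:
            continue
        mismatches = 0
        for j in range(m):
            if log[i + j][0] not in pattern[j]:
                mismatches += 1
                if mismatches > k:
                    break
        if mismatches <= k:
            count += 1
    return count


def detect(log: List[LogEntry], dictionary: List[Tuple[Pattern, float]],
           f: int, k: int) -> List[int]:
    """Return the indices of dictionary entries (pattern, window) that
    occur at least `f` times in `log` under the assumptions above."""
    return [idx for idx, (pattern, window) in enumerate(dictionary)
            if count_occurrences(log, pattern, window, k) >= f]
```

A pattern such as `[{"a"}, {"b", "d"}, {"c"}]` models one indeterminate action in the second position. A baseline of this kind is mainly useful as a correctness reference on small logs when testing a faster implementation.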
Original language: English
Title of host publication: PATTERNS 2020, The Twelfth International Conference on Pervasive Patterns and Applications
Subtitle of host publication: Nice, France, October 25 - 29, 2020
Publisher: IARIA
Pages: 50 - 57
ISBN (Electronic): 978-1-61208-783-2
Publication status: Published - 26 Apr 2020

Keywords

  • Web spambot
  • Indeterminate
  • Disguised
  • Actions log
