
Recognizing image/MP3 trollers

  • Every few months, another genius comes up with the same brilliant business plan:
    • Automatically patrol Web sites for copyright infringement!
    • Copyright holders will pay us millions! We'll be rich!

  • Of course, it doesn't work, because:
    • Yield is low
    • Incidence of false positives is high
    • Copyright holders seldom realize any new revenue from the practice, so their willingness to pay is limited

  • Trolling can play hob with Web servers, however, because:
    • Image and audio files are often big
    • Trollers usually open many connections at once
    • Trollers intentionally ignore robots.txt etiquette

  • Solution: Detect mass access via
    • Access patterns (a troller will usually grab HTML files first, then come back for images and audio en masse, faster than humanly possible)
    • HTTP_USER_AGENT
    • Domain names (e.g. "imagelock.com")

  • Response: blackhole the offending addresses and complain to the troller's ISP
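
The three detection signals above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production filter: the record layout (timestamp, IP, reverse-DNS hostname, user agent, path), the thresholds, the list of media extensions, and the "imagelock" user-agent substring are all hypothetical, and the function name find_trollers is made up for this example. Only the domain "imagelock.com" comes from the slide.

```python
from collections import defaultdict, deque

# All of these values are assumptions chosen for illustration.
MEDIA_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png", ".mp3")
SUSPECT_DOMAINS = ("imagelock.com",)      # example domain from the slide
SUSPECT_AGENT_WORDS = ("imagelock",)      # hypothetical user-agent substring
MAX_MEDIA_HITS = 20                       # more than this many media files...
WINDOW_SECONDS = 10.0                     # ...within this many seconds is suspicious


def is_media(path):
    """True if the request path (ignoring any query string) names a media file."""
    return path.lower().split("?", 1)[0].endswith(MEDIA_EXTENSIONS)


def find_trollers(requests):
    """Return {ip: reason} for clients that look like trollers.

    `requests` is an iterable of (timestamp, ip, hostname, user_agent, path)
    tuples; each client's requests must appear in time order.
    """
    recent = defaultdict(deque)   # ip -> timestamps of recent media requests
    flagged = {}
    for timestamp, ip, hostname, user_agent, path in requests:
        if ip in flagged:
            continue
        host = hostname.lower()
        # Signal 1: known domain name (exact match or any subdomain).
        if any(host == d or host.endswith("." + d) for d in SUSPECT_DOMAINS):
            flagged[ip] = "domain"
            continue
        # Signal 2: telltale HTTP_USER_AGENT string.
        if any(word in user_agent.lower() for word in SUSPECT_AGENT_WORDS):
            flagged[ip] = "user agent"
            continue
        # Signal 3: media files fetched faster than humanly possible.
        if is_media(path):
            hits = recent[ip]
            hits.append(timestamp)
            while timestamp - hits[0] > WINDOW_SECONDS:
                hits.popleft()    # drop requests that fell out of the window
            if len(hits) > MAX_MEDIA_HITS:
                flagged[ip] = "access pattern"
    return flagged
```

The returned addresses would then be handed to whatever blackholing mechanism the site uses (a firewall rule or a null route, for instance); that step is deliberately left out because it depends on the local setup. The sliding window keeps a human who browses a picture-heavy page slowly from being flagged, though the threshold would need tuning against real traffic.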