JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM, 2019, Vol. 21, Issue (5): 85-92. doi: 10.19722/j.cnki.1008-7729.2019.0064

Nutch-based Multi-source Social Media Intelligence Collection System

  

  1. School of Economics, Wuhan University of Technology, Wuhan 430070, China
  • Online: 2019-10-31

Abstract: This paper takes Internet social media platforms, including news websites, BBS, post bars, and microblogs, as its research objects. Based on domain modeling, intelligence collection process design, and content analysis for each platform, it designs an intelligence collection system, built on the open-source web crawler Nutch, that is suitable for all of these platforms. According to the characteristics of each platform, classification ranking, block analysis, and simulated login are applied to the collection of news, BBS, post bars, and microblogs, which improves the versatility and cost-effectiveness of the system and achieves efficient collection of multi-source social media intelligence.

Key words: Nutch, social media intelligence, multi-source intelligence collection, content analysis, simulated login

CLC Number: