A crawler for wallstreetcn and finance.sina, built with Scrapy
This is the crawler component of my graduation project, a financial information search engine.
It uses Scrapy to crawl the following three websites:
http://wallstreetcn.com/news,
http://finance.sina.com.cn/,
http://news.10jqka.com.cn/
The crawled data is then sent to a Solr server to build the index, and is also stored in MySQL as a backup.
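
The index-and-backup step could look roughly like the Scrapy-style item pipeline below. This is only a minimal sketch under stated assumptions, not the project's actual code: the Solr core name (finance_news), the table name (news), the item fields (url, title, content), and the class and function names are all hypothetical. It posts each item to Solr's JSON update endpoint and inserts the same fields through a DB-API connection, such as one opened with a MySQL driver.

```python
import json
import urllib.request

# Hypothetical Solr core URL; the real host and core name are not given in this README.
SOLR_UPDATE_URL = "http://localhost:8983/solr/finance_news/update?commit=true"


def build_solr_request(docs, update_url=SOLR_UPDATE_URL):
    """Build a POST request that adds the given documents to a Solr index."""
    body = json.dumps(list(docs)).encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


class SolrAndBackupPipeline:
    """Index each crawled item in Solr, then store a backup copy in a database.

    db_conn is any DB-API connection (for example, one from a MySQL driver).
    send is the callable that performs the HTTP request.
    """

    # Table and column names are assumptions made for illustration.
    INSERT_SQL = "INSERT INTO news (url, title, content) VALUES ({p}, {p}, {p})"

    def __init__(self, db_conn, send=urllib.request.urlopen, placeholder="%s"):
        self.db_conn = db_conn
        self.send = send
        # MySQL drivers use %s as the parameter placeholder; other drivers may differ.
        self.insert_sql = self.INSERT_SQL.format(p=placeholder)

    def process_item(self, item, spider):
        doc = dict(item)
        # Send the document to Solr so it becomes searchable.
        self.send(build_solr_request([doc]))
        # Keep a copy in the relational database as a backup.
        cursor = self.db_conn.cursor()
        cursor.execute(self.insert_sql, (doc["url"], doc["title"], doc["content"]))
        self.db_conn.commit()
        return item
```

In a Scrapy project, a class like this would be enabled through the ITEM_PIPELINES setting. Passing the connection and the send function in from outside keeps the pipeline easy to try out without a running Solr or MySQL server.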