This repo provides a starting point for streaming tweets with Kafka.
- Install Kafka on your system
- Configure Kafka, then start both ZooKeeper and the Kafka broker
bin/ config/ bin/ config/
- Create a Kafka topic to store the streamed tweets, for example: msdhoni (topic names are case-sensitive)
bin/ --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic msdhoni
- Install the Python libraries needed to run the streaming process
pip install kafka-python python-twitter tweepy
- Set the environment variables with the credentials from your Twitter account so the script is authorized to access Twitter (no spaces around `=`)
export TWEET_ACCESS_TOKEN="<access token>"
export TWEET_ACCESS_SECRET="<access secret token>"
export TWEET_CONSUMER_KEY="<consumer key / API key>"
export TWEET_CONSUMER_SECRET="<consumer secret / API secret>"
- Run the starter file to collect the streaming data
- View the streamed messages with the Kafka console consumer
bin/ --bootstrap-server localhost:9092 --topic msdhoni --from-beginning
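
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the repo's starter file: the helper names (`load_credentials`, `encode_tweet`, `publish_tweets`, `make_producer`) are hypothetical, the JSON encoding of tweets is an assumption, and the Twitter client itself is left out because its API differs between library versions. Only the topic name, the broker address, and the environment variable names come from the steps above.

```python
import json
import os

TOPIC = "msdhoni"                     # topic created in the steps above
BOOTSTRAP_SERVERS = "localhost:9092"  # broker address used by the console consumer

# Environment variable names from the export step above.
CREDENTIAL_VARS = (
    "TWEET_ACCESS_TOKEN",
    "TWEET_ACCESS_SECRET",
    "TWEET_CONSUMER_KEY",
    "TWEET_CONSUMER_SECRET",
)


def load_credentials(environ=os.environ):
    """Return the four Twitter credentials, failing early if any is unset."""
    missing = [name for name in CREDENTIAL_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in CREDENTIAL_VARS}


def encode_tweet(tweet):
    """Serialize a tweet dict to UTF-8 JSON bytes (assumed message format)."""
    return json.dumps(tweet, ensure_ascii=False).encode("utf-8")


def publish_tweets(producer, tweets, topic=TOPIC):
    """Send each tweet to the topic and return the number of messages sent."""
    count = 0
    for tweet in tweets:
        producer.send(topic, encode_tweet(tweet))
        count += 1
    producer.flush()  # block until buffered messages are delivered
    return count


def make_producer():
    """Create a kafka-python producer.

    Imported lazily so the helpers above can be used without kafka-python
    installed or a broker running.
    """
    from kafka import KafkaProducer
    return KafkaProducer(bootstrap_servers=BOOTSTRAP_SERVERS)
```

With the broker running, calling `publish_tweets(make_producer(), [{"text": "hello"}])` should make the message show up in the console consumer started in the last step. Passing the producer in as an argument keeps the publishing logic independent of the Twitter client, so it can be tried with hand-written sample tweets first.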